date:20250224

Re: The COBOL front end, version 3, now in 14 easy pieces

2025-02-24 Thread Richard Biener

On Tue, Feb 25, 2025 at 2:48 AM James K. Lowden
 wrote:
>
> On Mon, 24 Feb 2025 14:51:27 +0100
> Richard Biener  wrote:
>
> > On Wed, Feb 19, 2025 at 12:38?AM James K. Lowden
> >  wrote:
> > >
> > > The following 14 patches constitute 105,720 lines of code in 83
> > > files to build and document the COBOL front end.
> [...]
> > > I tested these patches using "git apply" to an unpublished branch
> > > "cobol-patched".
> >
> > I have now built the compiler from the (now published) cobol-patched
> > branch.
>
> I rebuilt cobol-patched today. It now includes changes to gcc/doc/, and
> a new gcc/Makefile.in (previously omitted because I thought it was
> generated).  That deals with some but not all the issues below.
>
> > On x86_64-linux and noticed the following issues:
> >
> > The toplevel configure script on the branch wasn't re-generated (duh),
> > so it isn't
> > ready for use.
>
> Since we're not mailing patches, cobol-patched is now built by applying
> patches that include the configure scripts.

good.

> > gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1,
> > but have 3.0.4
> >  %require "3.5.1"  //3.8.2 also works, but not 3.8.0
> >   ^^^
> >
> > this requirement isn't documented, neither is a version requirement
> > for flex.  The
> > place for this is in gcc/doc/install.texi
>
> I think you will find the updated .texi files comprehesively (!)
> reflect the presence of gcobol and its requirements.
>
> There is a hitch.  The patches are against master branch as of 5
> February.  Many changes occurred in the time since.  Now that I know
> that, I *expect* a conflict extravaganza as we pull that together.
>
> Best IMO is for us to update our repository, i.e. to merge master into
> our master+cobol, then to cobol-stage, then to cobol-patched.
> Questions:
>
> 1. How often should we do that, would you say?

As it suits you, best when you think you fixed an important set of issues.
My idea is that cobol-patched should reflect the tree as if it were
post merge to GCC master (so we could basically merge from that
branch instead of re-applying patches).

> 2. Is cobol-patched more useful to you right now as-is, or should I
> regenerate it tomorrow?

It was useful as it is for discovering the issues I reported.  I will re-try
once you declare them fixed.

> > During the build we write to
> >
> > gcc/cobol/charmaps-dupe.cc
> > gcc/cobol/valconv-dupe.cc
>
> Not yet fixed.  Top of the list, now.
>
> > Installing the result via make install DESTDIR=/foo I see both a
> > 'gcobol' and a 'gcobc' program
> > being installed - is that intentional?
>
> Yes, as explained prior, gcobc is an emulation script.
>
> > I also see the gcobol.3
> > manpage reside directly in /foo/gcobol.3 rather
> > than in the expected /foo/usr/local/man/..
>
> gcc/Makefile.in didn't appear in the patches.  I now have installed:
>
> $ find /usr/local/cobol-patched/ -name '*cobol*'
> /usr/local/cobol-patched/
> /usr/local/cobol-patched/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/cobol1
> /usr/local/cobol-patched/bin/gcobol
> /usr/local/cobol-patched/share/man/man1/gcobol.1
> /usr/local/cobol-patched/share/man/man3/gcobol.3
> /usr/local/cobol-patched/share/gcobol
>
> > Installing of libgcobol fails for me:
>
> Hmm, a little worse for me.  As you see above, the installed cobol files
> don't include the library.  This was from "make install".  Checking.

That's what happened to me before I re-generated the toplevel configure,
it didn't build libgcobol at all.

> > so the suppression seems incomplete?
>
> Or too complete.  :-/
>
> > Doing
> >
> > > ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32
> >
> > only fails during linking as we do build a 32bit binary but do not
> > find (the not built) 32bit runtime.  I suspect when writing a program
> > that would need 128bit integer or float we'd eventually ICE.
>
> I think the cobol-patched you were using was missing our patch of
> Friday.  Our intention is to prevent any 32-bit binary from being
> generated, in addition to not attempting to build the library.
>
> But we're not there yet.  On my system, configured as
>
>   $ ../configure --prefix=/usr/local/gcc-cobol --disable-bootstrap
> --disable-generated-files-in-srcdir --disable-multilib
> --enable-checking --enable-languages=c,cobol
>
> I see failure to link [eliding the "skipping" messages]:
>
> $ gcobol -oo -ffixed-form -m32 gcc/cobol/nist/NC/NC101A.cbl
> /usr/bin/ld: cannot find -lgcobol: No such file or directory
> /usr/bin/ld: cannot find -lstdc++: No such file or directory
> /usr/bin/ld: cannot find -lgcc: No such file or directory
> /usr/bin/ld: cannot find -lgcc_s: No such file or directory
>
> If compiled with -c,
>
> $ gcobol -oo -c -m32 -ffixed-form ../parser/gcc/cobol/nist/NC/NC101A.cbl
> $ file o o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not 
> stripped
>
> Not our intended result.  :-(  But, fixable.
>
> > The cobol frontend should attempt to intercept the selection of

Re: [PATCH v2 5/5][STAGE1] doc: document btf_type_tag and btf_decl_tag attributes

2025-02-24 Thread Indu Bhagat


On 2/6/25 11:54 AM, David Faust wrote:

gcc/
* doc/extend.texi (Common Variable Attributes): Document
btf_decl_tag attribute.
(Common Type Attributes): Document btf_type_tag attribute.
---
  gcc/doc/extend.texi | 68 +
  1 file changed, 68 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 2764597a479..4536cfd7c68 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8037,6 +8037,41 @@ align them on any target.
  The @code{aligned} attribute can also be used for functions
  (@pxref{Common Function Attributes}.)
  
+@cindex @code{btf_decl_tag} variable attribute

+@item btf_decl_tag (@var{argument})
+The @code{btf_decl_tag} attribute may be used to associate variable
+declarations, struct or union member declarations, function
+declarations, and function parameter declarations with arbitrary strings.


Currently we add this in the @node Common Variable Attributes.  The 
btf_decl_tag attribute is unique in that the effect is the same, be it 
associated with a variable decl or function.  But I wonder if some 
placeholder text also be included in the documentation for Common 
Function Attributes for clarity ?



+These strings are not interpreted by the compiler in any way, and have
+no effect on code generation.  Instead, these user-provided strings
+are recorded in DWARF (via @code{DW_AT_GNU_annotation} and
+@code{DW_TAG_GNU_annotation} extensions) and BTF information (via
+@code{BTF_KIND_DECL_TAG} records), and associated to the attributed
+declaration.  If neither DWARF nor BTF information is generated, the
+attribute has no effect.
+
+The argument is treated as an ordinary string in the source language
+with no additional special rules.
+
+The attribute may be supplied multiple times for a single declaration,
+in which case each distinct argument string will be recorded in a
+separate DIE or BTF record, each associated to the declaration.  For
+a single declaration with multiple @code{btf_decl_tag} attributes,
+the order of the @code{DW_TAG_GNU_annotation} DIEs produced is not
+guaranteed to maintain the order of attributes in the source code.
+
+For example:
+
+@smallexample
+int *foo __attribute__ ((btf_decl_tag ("__percpu")));
+@end smallexample
+
+@noindent
+when compiled with @code{-gbtf} results in an additional
+@code{BTF_KIND_DECL_TAG} BTF record to be emitted in the BTF info,
+associating the string ``__percpu'' with the normal @code{BTF_KIND_VAR}
+record for the variable ``foo''.
+


Remove "normal" or do you intend to say something specific ?


  @cindex @code{counted_by} variable attribute
  @item counted_by (@var{count})
  The @code{counted_by} attribute may be attached to the C99 flexible array
@@ -9226,6 +9261,39 @@ is given by the product of arguments 1 and 2, and that
  @code{malloc_type}, like the standard C function @code{malloc},
  returns an object whose size is given by argument 1 to the function.
  
+@cindex @code{btf_type_tag} type attribute

+@item btf_type_tag (@var{argument})
+The @code{btf_type_tag} attribute may be used to associate (to ``tag'')
+particular types with arbitrary string annotations.  These annotations
+are recorded in debugging info by supported debug formats, currently
+DWARF (via @code{DW_AT_GNU_annotation} and @code{DW_TAG_GNU_annotation}
+extensions) and BTF (via @code{BTF_KIND_TYPE_TAG} records).  These
+annotation strings are not interpreted by the compiler in any way, and
+have no effect on code generation.  If neither DWARF nor BTF
+information is generated, the attribute has no effect.
+
+The argument is treated as an ordinary string in the source language
+with no additional special rules.
+
+The attribute may be supplied multiple times for a single declaration,
+in which case each distinct argument string will be recorded in a
+separate DIE or BTF record, each associated to the type.  For a single
+type with multiple @code{btf_type_tag} attributes, the order of the
+@code{DW_TAG_GNU_annotation} DIEs produced is not guaranteed to
+maintain the order of attributes in the source code.
+> +For example the following code:
+
+@smallexample
+int * __attribute__ ((btf_type_tag ("__user"))) foo;
+@end smallexample
+
+@noindent
+when compiled with @code{-gbtf} results in an additional
+@code{BTF_KIND_TYPE_TAG} BTF record to be emitted in the BTF info,
+associating the string ``__user'' with the normal @code{BTF_KIND_PTR}
+record for the pointer-to-integer type used in the declaration.
+
  @cindex @code{copy} type attribute
  @item copy
  @itemx copy (@var{expression})

Re: [PATCH][_Hashtable] Fix hash code cache usage

2025-02-24 Thread Jonathan Wakely

On Mon, 24 Feb 2025 at 20:52, François Dumont  wrote:
>
> All good remarks as usual, here is a new version.
>
> I took the time to leverage on the new method to replace a usage of
> _M_src_hash_code.
>
>  libstdc++: [_Hashtable] Fix hash code cache usage when stateful
> hash functor
>
>  It is wrong to reuse a cached hash code from another container when
> this code depends
>  on the state of the container's Hash functor.
>
>  Add checks that Hash functor is stateless before reusing the cached
> hash code.
>
>  libstdc++-v3/ChangeLog:
>
>  * include/bits/hashtable_policy.h
>  (_Hash_code_base::_M_copy_code,
> _Hash_code_base::_M_stored_code): Remove.

"_M_store_code" not "_M_stored_code".

>  * include/bits/hashtable.h (_M_hash_code_ext): New.
>  (_M_copy_code): New.
>  (_M_assign): Use latter.
>  (_M_bucket_index_ex): New.
>  (_M_equals): Use latter.
>  (_M_store_code): New.

The changes to _M_src_hash_code and _M_merge_multi should be in the
ChangeLog too.

OK for trunk with those changes, thanks.


>  * testsuite/23_containers/unordered_map/modifiers/merge.cc
> (test10): New
>  test case.
>
> Tested under Linux x64.
>
> Ok to commit ?
>
> François
>
>
> On 19/02/2025 14:33, Jonathan Wakely wrote:
> > On 20/01/25 22:12 +0100, François Dumont wrote:
> >> Hi
> >>
> >> In my work on fancy pointer support I've decided to always cache the
> >> hash code.
> >>
> >> Doing so I spotted a bug in the management of this cache when hash
> >> functor is stateful.
> >>
> >> libstdc++: [_Hashtable] Fix hash code cache usage when hash
> >> functor is stateful
> >>
> >> It is wrong to reuse a cached hash code when this code depends
> >> then on the state
> >> of the Hash functor.
> >>
> >> Add checks that Hash functor is stateless before reusing the
> >> cached hash code.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> * include/bits/hashtable_policy.h
> >> (_Hash_code_base::_M_copy_code): Remove.
> >> * include/bits/hashtable.h (_M_copy_code): New.
> >
> > I like replacing the _M_copy_code overloads with one function, thanks.
> > We could do the same for _M_store_code too.
> >
> >> (_M_assign): Use latter.
> >> (_M_bucket_index_ex): New.
> >
> > I assume "ex" means "external"? That doesn't seem very clear to me
> > though. Maybe "ext" would be better.
> >
> >> (_M_equals): Use latter.
> >> *
> >> testsuite/23_containers/unordered_map/modifiers/merge.cc (test10): New
> >> test case.
> >>
> >> Tested under Linux x64, ok to commit ?
> >>
> >> François
> >
> >> diff --git a/libstdc++-v3/include/bits/hashtable.h
> >> b/libstdc++-v3/include/bits/hashtable.h
> >> index d6d76a743bb..b3c1d7aac24 100644
> >> --- a/libstdc++-v3/include/bits/hashtable.h
> >> +++ b/libstdc++-v3/include/bits/hashtable.h
> >> @@ -808,6 +808,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>   _M_bucket_index(__hash_code __c) const
> >>   { return __hash_code_base::_M_bucket_index(__c,
> >> _M_bucket_count); }
> >>
> >> +#pragma GCC diagnostic push
> >> +#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr
> >> +  // Like _M_bucket_index but when the node is coming from another
> >> +  // container instance.
> >> +  size_type
> >> +  _M_bucket_index_ex(const __node_value_type& __n) const
> >> +  {
> >> +if constexpr (__hash_cached::value)
> >> +  if constexpr (std::is_empty<_Hash>::value)
> >> +return _RangeHash{}(__n._M_hash_code, _M_bucket_count);
> >> +
> >> +return _RangeHash{}
> >> +  (this->_M_hash_code(_ExtractKey{}(__n._M_v())), _M_bucket_count);
> >> +  }
> >> +
> >> +  void
> >> +  _M_copy_code(__node_value_type& __to,
> >> +   const __node_value_type& __from) const
> >> +  {
> >> +if constexpr (__hash_cached::value)
> >> +  {
> >> +if constexpr (std::is_empty<_Hash>::value)
> >> +  __to._M_hash_code = __from._M_hash_code;
> >> +else
> >> +  __to._M_hash_code =
> >> +this->_M_hash_code(_ExtractKey{}(__from._M_v()));
> >> +  }
> >> +  }
> >
> >
> > These two functions are doing similar work, would it make sense to
> > factor out the common part to a new function:
> >
> >   // Get hash code for a node that comes from another _Hashtable.
> >   // Reuse a cached hash code if the hash function is stateless,
> >   // otherwise recalculate it using our own hash function.
> >   size_t
> >   _M_hash_code_ext(const __node_value_type& __from) const
> >   {
> > if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value)
> >   return __from._M_hash_code;
> > else
> >   this->_M_hash_code(_ExtractKey{}(__from._M_v()));
> >   }
> >
> >   // Like _M_bucket_index but when the node is coming from another
> >   // container instance.
> >   size_type
> >   _M_bucket_index_

Re: The COBOL front end, version 3, now in 14 easy pieces (+NIST)

2025-02-24 Thread Andi Kleen

"James K. Lowden"  writes:

>> Having a minimal harness in GCCs testsuite is critical - I'd expect a
>> gcc/testsuite/gcobol.dg/dg.exp supporting execution tests.  I assume
>> Cobol has a way to exit OK or fatally and this should be
>> distinguished as testsuite PASS or FAIL.  
>
> Yes, a COBOL program exits with a return status.  And we rigged up NIST
> to do that.  What that means requires a long explanation, sorry.  

Regardless of the NIST usage you will need a frame work
of GCC specific COBOL tests. You are expected that for every bug that
you fix that wasn't detected by NIST you add a test case.
Surely you have some of those already?

-Andi

Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]

2025-02-24 Thread Harald Anlauf


Hi Andre,

Am 24.02.25 um 16:44 schrieb Andre Vehreschild:

Hi Harald,

I have added some comment(s). Can you take another look?


assuming that you refer to the attachment of the other submission:
that one is perfect!

This one is also OK.

Thanks for the patch(es)!

Harald


Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Regards,
Andre

On Sat, 22 Feb 2025 17:36:55 +0100
Andre Vehreschild  wrote:


Hi Harald,

thanks for the review. Silently I'd hoped that there is some macro to get
the i-th argument, that I just haven't found and someone could point me to.
I will add a comment, when ko one comes up with the macro by Monday.

Thanks,
Andre
Andre Vehreschild * ve...@gmx.de
Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf :


Hi Andre,

Am 21.02.25 um 14:35 schrieb Andre Vehreschild:

Hi all,

during testing and compiling some larger coarray codes, I found a few
deficiencies. One was with handling class types when splitting the coarray
expression and the other was that the backend_decl of a formal argument in
a function's symbol was not the same as the one the function was compiled
to. So looking at the function-decl's tree n-th formal argument is the way
to go there.

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?


I am amazed that you do not get lost handling 9-fold nested
macros!  This is OK, as this touches your CAF code.

Otherwise, I'd recommend to add an explaining comment in the
code or code such that mere mortals have a better chance to
follow...

Thanks,
Harald


Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de





--
Andre Vehreschild * Email: vehre ad gmx dot de

Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]

2025-02-24 Thread Andre Vehreschild


Hi Harald,

Oops, I've mixed up the two attachments. Sorry for that and thank you for
detecting it and ok'ing. I will merge tomorrow morning.

Thanks again,
Andre
Andre Vehreschild * ve...@gmx.de
Am 24. Februar 2025 20:22:25 schrieb Harald Anlauf :


Hi Andre,

Am 24.02.25 um 16:44 schrieb Andre Vehreschild:

Hi Harald,

I have added some comment(s). Can you take another look?


assuming that you refer to the attachment of the other submission:
that one is perfect!

This one is also OK.

Thanks for the patch(es)!

Harald


Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Regards,
Andre

On Sat, 22 Feb 2025 17:36:55 +0100
Andre Vehreschild  wrote:


Hi Harald,

thanks for the review. Silently I'd hoped that there is some macro to get
the i-th argument, that I just haven't found and someone could point me to.
I will add a comment, when ko one comes up with the macro by Monday.

Thanks,
Andre
Andre Vehreschild * ve...@gmx.de
Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf :


Hi Andre,

Am 21.02.25 um 14:35 schrieb Andre Vehreschild:

Hi all,

during testing and compiling some larger coarray codes, I found a few
deficiencies. One was with handling class types when splitting the coarray
expression and the other was that the backend_decl of a formal argument in
a function's symbol was not the same as the one the function was compiled
to. So looking at the function-decl's tree n-th formal argument is the way
to go there.

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?


I am amazed that you do not get lost handling 9-fold nested
macros!  This is OK, as this touches your CAF code.

Otherwise, I'd recommend to add an explaining comment in the
code or code such that mere mortals have a better chance to
follow...

Thanks,
Harald


Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de





--
Andre Vehreschild * Email: vehre ad gmx dot de

New German PO file for 'gcc' (version 15-b20250216)

2025-02-24 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-15-b20250216.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH][_Hashtable] Fix hash code cache usage

2025-02-24 Thread François Dumont


All good remarks as usual, here is a new version.

I took the time to leverage on the new method to replace a usage of 
_M_src_hash_code.


    libstdc++: [_Hashtable] Fix hash code cache usage when stateful 
hash functor


    It is wrong to reuse a cached hash code from another container when 
this code depends

    on the state of the container's Hash functor.

    Add checks that Hash functor is stateless before reusing the cached 
hash code.


    libstdc++-v3/ChangeLog:

    * include/bits/hashtable_policy.h
    (_Hash_code_base::_M_copy_code, 
_Hash_code_base::_M_stored_code): Remove.

    * include/bits/hashtable.h (_M_hash_code_ext): New.
    (_M_copy_code): New.
    (_M_assign): Use latter.
    (_M_bucket_index_ex): New.
    (_M_equals): Use latter.
    (_M_store_code): New.
    * testsuite/23_containers/unordered_map/modifiers/merge.cc 
(test10): New

    test case.

Tested under Linux x64.

Ok to commit ?

François


On 19/02/2025 14:33, Jonathan Wakely wrote:

On 20/01/25 22:12 +0100, François Dumont wrote:

Hi

In my work on fancy pointer support I've decided to always cache the 
hash code.


Doing so I spotted a bug in the management of this cache when hash 
functor is stateful.


    libstdc++: [_Hashtable] Fix hash code cache usage when hash 
functor is stateful


    It is wrong to reuse a cached hash code when this code depends 
then on the state

    of the Hash functor.

    Add checks that Hash functor is stateless before reusing the 
cached hash code.


    libstdc++-v3/ChangeLog:

    * include/bits/hashtable_policy.h 
(_Hash_code_base::_M_copy_code): Remove.

    * include/bits/hashtable.h (_M_copy_code): New.


I like replacing the _M_copy_code overloads with one function, thanks.
We could do the same for _M_store_code too.


    (_M_assign): Use latter.
    (_M_bucket_index_ex): New.


I assume "ex" means "external"? That doesn't seem very clear to me
though. Maybe "ext" would be better.


    (_M_equals): Use latter.
    * 
testsuite/23_containers/unordered_map/modifiers/merge.cc (test10): New

    test case.

Tested under Linux x64, ok to commit ?

François


diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h

index d6d76a743bb..b3c1d7aac24 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -808,6 +808,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_bucket_index(__hash_code __c) const
  { return __hash_code_base::_M_bucket_index(__c, 
_M_bucket_count); }


+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr
+  // Like _M_bucket_index but when the node is coming from another
+  // container instance.
+  size_type
+  _M_bucket_index_ex(const __node_value_type& __n) const
+  {
+    if constexpr (__hash_cached::value)
+  if constexpr (std::is_empty<_Hash>::value)
+    return _RangeHash{}(__n._M_hash_code, _M_bucket_count);
+
+    return _RangeHash{}
+  (this->_M_hash_code(_ExtractKey{}(__n._M_v())), _M_bucket_count);
+  }
+
+  void
+  _M_copy_code(__node_value_type& __to,
+   const __node_value_type& __from) const
+  {
+    if constexpr (__hash_cached::value)
+  {
+    if constexpr (std::is_empty<_Hash>::value)
+  __to._M_hash_code = __from._M_hash_code;
+    else
+  __to._M_hash_code =
+    this->_M_hash_code(_ExtractKey{}(__from._M_v()));
+  }
+  }



These two functions are doing similar work, would it make sense to
factor out the common part to a new function:

  // Get hash code for a node that comes from another _Hashtable.
  // Reuse a cached hash code if the hash function is stateless,
  // otherwise recalculate it using our own hash function.
  size_t
  _M_hash_code_ext(const __node_value_type& __from) const
  {
    if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value)
  return __from._M_hash_code;
    else
  this->_M_hash_code(_ExtractKey{}(__from._M_v()));
  }

  // Like _M_bucket_index but when the node is coming from another
  // container instance.
  size_type
  _M_bucket_index_ext(const __node_value_type& __n) const
  { return _M_bucket_index(_M_hash_code_ext(__n)); }

  void
  _M_copy_code(__node_value_type& __to,
   const __node_value_type& __from) const
  {
    if constexpr (__hash_cached::value)
  __to._M_hash_code = _M_hash_code_ext(__n);
  }



+#pragma GCC diagnostic pop
+
  // Find and insert helper functions and types

  // Find the node before the one matching the criteria.
@@ -1587,7 +1617,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_ptr __ht_n = __ht._M_begin();
__node_ptr __this_n
  = __node_gen(static_cast<_FromVal>(__ht_n->_M_v()));
-    this->_M_copy_code(*__this_n, *__ht_n);
+    _M_copy_code(*__this_n, *__ht_n);
_M_update_bbegin(__

Re: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range [PR118083]

2025-02-24 Thread Patrick Palka

On Tue, 4 Feb 2025, Patrick Palka wrote:

> On Tue, 4 Feb 2025, Patrick Palka wrote:
> 
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
> > 14 (after a long while)?
> > 
> > -- >8 --
> > 
> > LWG 4027 effectively makes the const range access CPOs ranges::cfoo behave
> > more consistently across C++23 and C++20 (pre-P2278R4) and more
> > consistently with the std::cfoo range accessors, as the below testcase
> > adjustments demonstrate (which mostly consist of reverting workarounds
> > added by r14-3771-gf12e26f3496275 and r13-7186-g0d94c6df183375).
> > 
> > In passing fix PR118083 which reports that the input_range constraint on
> > possibly-const-range is missing in our implementation.  A consequence of
> > this is that the const range access CPOs now consistently require the
> > type to be a range, and so in some our of tests we need to introduce
> > otherwise unused begin/end members.
> 
> ... now with the LWG 4027 testcases added (to cbegin.cc, so that it runs
> in both C++20 and C++23 mode):

Ping.

> 
> -- >8 --
> 
> Subject: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range
>  [PR118083]
> 
>   PR libstdc++/118083
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/ranges_base.h
>   (ranges::__access::__possibly_const_range): Adjust logic as per
>   LWG 4027.  Add missing input_range constraint.
>   * testsuite/std/ranges/access/cbegin.cc (test05): Verify LWG
>   4027 testcases.
>   * testsuite/std/ranges/access/cdata.cc: Adjust, simplify and
>   consolidate some tests after the above.
>   * testsuite/std/ranges/access/cend.cc: Likewise.
>   * testsuite/std/ranges/access/crbegin.cc: Likewise.
>   * testsuite/std/ranges/access/crend.cc: Likewise.
>   * testsuite/std/ranges/adaptors/join.cc: Likewise.
>   * testsuite/std/ranges/adaptors/take_while.cc: Likewise.
>   * testsuite/std/ranges/adaptors/transform.cc: Likewise.
> ---
>  libstdc++-v3/include/bits/ranges_base.h   |  4 +-
>  .../testsuite/std/ranges/access/cbegin.cc | 17 
>  .../testsuite/std/ranges/access/cdata.cc  | 21 -
>  .../testsuite/std/ranges/access/cend.cc   | 30 ++---
>  .../testsuite/std/ranges/access/crbegin.cc| 43 +--
>  .../testsuite/std/ranges/access/crend.cc  | 20 +
>  .../testsuite/std/ranges/adaptors/join.cc |  8 ++--
>  .../std/ranges/adaptors/take_while.cc |  2 -
>  .../std/ranges/adaptors/transform.cc  |  4 --
>  9 files changed, 59 insertions(+), 90 deletions(-)
> 
> diff --git a/libstdc++-v3/include/bits/ranges_base.h 
> b/libstdc++-v3/include/bits/ranges_base.h
> index 4dcfbf66d51..28fe64a9e9d 100644
> --- a/libstdc++-v3/include/bits/ranges_base.h
> +++ b/libstdc++-v3/include/bits/ranges_base.h
> @@ -642,11 +642,11 @@ namespace ranges
>namespace __access
>{
>  #if __glibcxx_ranges_as_const // >= C++23
> -template
> +template
>constexpr auto&
>__possibly_const_range(_Range& __r) noexcept
>{
> - if constexpr (constant_range && !constant_range<_Range>)
> + if constexpr (input_range)
> return const_cast(__r);
>   else
> return __r;
> diff --git a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc 
> b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> index 5423e782428..66b28f41877 100644
> --- a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> @@ -116,10 +116,27 @@ test04()
>VERIFY(std::ranges::cbegin(std::move(c)) == std::ranges::begin(c));
>  }
>  
> +void
> +test05()
> +{
> +  // LWG 4027 - possibly-const-range should prefer returning const R&
> +  auto r = std::views::single(0)
> +| std::views::transform([](int) { return 0; });
> +  using C1 = decltype(std::ranges::cbegin(r));
> +  using C1 = decltype(std::cbegin(r));
> +
> +  [] (auto x) {
> +auto r = std::views::single(x) | std::views::lazy_split(0);
> +static_assert( ! requires { (*std::ranges::cbegin(r)).front() = 42; });
> +static_assert( ! requires { (*std::cbegin(r)).front() = 42; });
> +  }(0);
> +}
> +
>  int
>  main()
>  {
>test01();
>test03();
>test04();
> +  test05();
>  }
> diff --git a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc 
> b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> index 62c347be43d..f474ab7ec99 100644
> --- a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> @@ -34,20 +34,21 @@ test01()
>{
>  int i = 0;
>  int j = 0;
> +
> +#if __cpp_lib_ranges_as_const
> +// These overloads mean that range and range are satisfied.
> +const int* begin() const { throw; }
> +const int* end() const { throw; }
> +#endif
> +
>  int* data() { return &j; }
>  const R* data() const noexcept { return nullptr; }
>};
>static_assert( has_cdata );
>static_assert( has_cdata );
>R r;
> -#if ! __cpp_lib_ran

Re: [RFA][PR tree-optimization/98028] Use relationship between operands to simplify SUB_OVERFLOW

2025-02-24 Thread Jakub Jelinek

On Sat, Feb 15, 2025 at 04:52:58PM -0700, Jeff Law wrote:
> On 2/12/25 2:22 PM, Jakub Jelinek wrote:
> 
> > 
> > Or just go with that even for GCC 15 (completely untested and dunno if
> > something needs to be done about s = NULL passed to query or not) for
> > now, with the advantage that it can do something even for the cases where
> > type is not compatible with types of arguments, and perhaps add additional
> > cases later?
> I added the check for s being non-null and made a trivial fix to your patch
> (IIRC you used "code" when it should have been "subcode").
> 
> Do you think there's still value in opening a new non-regression bug for
> additional cases?

I actually can't think of any further cases right now, so I think no need to
open a PR, just commit the patch ;)

Jakub

Re: [PATCH v2 4/5][STAGE1] btf: generate and output DECL_TAG and TYPE_TAG records

2025-02-24 Thread David Faust




On 2/24/25 11:03, Indu Bhagat wrote:
> On 2/6/25 11:54 AM, David Faust wrote:
>> Support the btf_decl_tag and btf_type_tag attributes in BTF by creating
>> and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records,
>> respectively, for them.
>>
>> Some care is required when -gprune-btf is in effect to avoid emitting
>> decl or type tags for declarations or types which have been pruned and
>> will not be emitted in BTF.
>>
> 
> The patch itself looks OK to me.  Two comments below.
> 
>> gcc/
>>  * btfout.cc (get_btf_kind): Handle DECL_TAG and TYPE_TAG kinds.
>>  (btf_calc_num_vbytes): Likewise.
>>  (btf_asm_type): Likewise.
>>  (output_asm_btf_vlen_bytes): Likewise.
>>  (output_btf_tags): New.
>>  (btf_output): Call it here.
>>  (btf_add_used_type): Replace with simple wrapper around...
>>  (btf_add_used_type_1): ...the implementation.  Handle
>>  BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>  (btf_add_vars): Update btf_add_used_type call.
>>  (btf_assign_tag_ids): New.
>>  (btf_mark_type_used): Update btf_add_used_type call.
>>  (btf_collect_pruned_types): Likewise.  Handle type and decl tags.
>>  (btf_finish): Call btf_assign_tag_ids.
>>
>> gcc/testsuite/
>>  * gcc.dg/debug/btf/btf-decl-tag-1.c: New test.
>>  * gcc.dg/debug/btf/btf-decl-tag-2.c: New test.
>>  * gcc.dg/debug/btf/btf-decl-tag-3.c: New test.
>>  * gcc.dg/debug/btf/btf-decl-tag-4.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-1.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-2.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-3.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-4.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-5.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-6.c: New test.
>>  * gcc.dg/debug/btf/btf-type-tag-c2x-1.c: New test.
>>
> 
> I think its worth adding a testcase in the BPF backend tests for testing 
> (no) interaction between CO-RE relocs with tags. E.g, when 
> __attribute__((preserve_access_index)) is used on a struct with type 
> tags, the type id in the CO-RE relocation records is the type and not 
> the type tag. Or other tests as you see fit.
> 
> WDYT ?

That is a very good idea.
I will add a few tests for this.

> 
>> include/
>>  * btf.h (BTF_KIND_DECL_TAG, BTF_KIND_TYPE_TAG) New defines.
>>  (struct btf_decl_tag): New.
>> ---
>>   gcc/btfout.cc | 171 +++---
>>   .../gcc.dg/debug/btf/btf-decl-tag-1.c |  14 ++
>>   .../gcc.dg/debug/btf/btf-decl-tag-2.c |  22 +++
>>   .../gcc.dg/debug/btf/btf-decl-tag-3.c |  22 +++
>>   .../gcc.dg/debug/btf/btf-decl-tag-4.c |  34 
>>   .../gcc.dg/debug/btf/btf-type-tag-1.c |  27 +++
>>   .../gcc.dg/debug/btf/btf-type-tag-2.c |  15 ++
>>   .../gcc.dg/debug/btf/btf-type-tag-3.c |  21 +++
>>   .../gcc.dg/debug/btf/btf-type-tag-4.c |  25 +++
>>   .../gcc.dg/debug/btf/btf-type-tag-5.c |  35 
>>   .../gcc.dg/debug/btf/btf-type-tag-6.c |  15 ++
>>   .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c |  23 +++
>>   include/btf.h |  14 ++
>>   13 files changed, 414 insertions(+), 24 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c
>>
>> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
>> index ff7ea42a961..c00e0c98015 100644
>> --- a/gcc/btfout.cc
>> +++ b/gcc/btfout.cc
>> @@ -141,6 +141,8 @@ get_btf_kind (uint32_t ctf_kind)
>>   case CTF_K_VOLATILE: return BTF_KIND_VOLATILE;
>>   case CTF_K_CONST:return BTF_KIND_CONST;
>>   case CTF_K_RESTRICT: return BTF_KIND_RESTRICT;
>> +case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG;
>> +case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG;
>>   default:;
>>   }
>> return BTF_KIND_UNKN;
>> @@ -217,6 +219,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
>>   case BTF_KIND_CONST:
>>   case BTF_KIND_RESTRICT:
>>   case BTF_KIND_FUNC:
>> +case BTF_KIND_TYPE_TAG:
>>   /* These kinds have no vlen data.  */
>> break;
>>   
>> @@ -256,6 +259,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
>> vlen_bytes += vlen * sizeof (struct btf_var_secinfo);
>> break;
>>   
>> +case BTF_KIND_D

Re: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range [PR118083]

2025-02-24 Thread Jonathan Wakely

On Tue, 4 Feb 2025 at 22:27, Patrick Palka  wrote:
>
> On Tue, 4 Feb 2025, Patrick Palka wrote:
>
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
> > 14 (after a long while)?

Yes, and yes, thanks.


> >
> > -- >8 --
> >
> > LWG 4027 effectively makes the const range access CPOs ranges::cfoo behave
> > more consistently across C++23 and C++20 (pre-P2278R4) and more
> > consistently with the std::cfoo range accessors, as the below testcase
> > adjustments demonstrate (which mostly consist of reverting workarounds
> > added by r14-3771-gf12e26f3496275 and r13-7186-g0d94c6df183375).
> >
> > In passing fix PR118083 which reports that the input_range constraint on
> > possibly-const-range is missing in our implementation.  A consequence of
> > this is that the const range access CPOs now consistently require the
> > type to be a range, and so in some our of tests we need to introduce
> > otherwise unused begin/end members.
>
> ... now with the LWG 4027 testcases added (to cbegin.cc, so that it runs
> in both C++20 and C++23 mode):
>
> -- >8 --
>
> Subject: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range
>  [PR118083]
>
> PR libstdc++/118083
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/ranges_base.h
> (ranges::__access::__possibly_const_range): Adjust logic as per
> LWG 4027.  Add missing input_range constraint.
> * testsuite/std/ranges/access/cbegin.cc (test05): Verify LWG
> 4027 testcases.
> * testsuite/std/ranges/access/cdata.cc: Adjust, simplify and
> consolidate some tests after the above.
> * testsuite/std/ranges/access/cend.cc: Likewise.
> * testsuite/std/ranges/access/crbegin.cc: Likewise.
> * testsuite/std/ranges/access/crend.cc: Likewise.
> * testsuite/std/ranges/adaptors/join.cc: Likewise.
> * testsuite/std/ranges/adaptors/take_while.cc: Likewise.
> * testsuite/std/ranges/adaptors/transform.cc: Likewise.
> ---
>  libstdc++-v3/include/bits/ranges_base.h   |  4 +-
>  .../testsuite/std/ranges/access/cbegin.cc | 17 
>  .../testsuite/std/ranges/access/cdata.cc  | 21 -
>  .../testsuite/std/ranges/access/cend.cc   | 30 ++---
>  .../testsuite/std/ranges/access/crbegin.cc| 43 +--
>  .../testsuite/std/ranges/access/crend.cc  | 20 +
>  .../testsuite/std/ranges/adaptors/join.cc |  8 ++--
>  .../std/ranges/adaptors/take_while.cc |  2 -
>  .../std/ranges/adaptors/transform.cc  |  4 --
>  9 files changed, 59 insertions(+), 90 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/ranges_base.h 
> b/libstdc++-v3/include/bits/ranges_base.h
> index 4dcfbf66d51..28fe64a9e9d 100644
> --- a/libstdc++-v3/include/bits/ranges_base.h
> +++ b/libstdc++-v3/include/bits/ranges_base.h
> @@ -642,11 +642,11 @@ namespace ranges
>namespace __access
>{
>  #if __glibcxx_ranges_as_const // >= C++23
> -template
> +template
>constexpr auto&
>__possibly_const_range(_Range& __r) noexcept
>{
> -   if constexpr (constant_range && !constant_range<_Range>)
> +   if constexpr (input_range)
>   return const_cast(__r);
> else
>   return __r;
> diff --git a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc 
> b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> index 5423e782428..66b28f41877 100644
> --- a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc
> @@ -116,10 +116,27 @@ test04()
>VERIFY(std::ranges::cbegin(std::move(c)) == std::ranges::begin(c));
>  }
>
> +void
> +test05()
> +{
> +  // LWG 4027 - possibly-const-range should prefer returning const R&
> +  auto r = std::views::single(0)
> +| std::views::transform([](int) { return 0; });
> +  using C1 = decltype(std::ranges::cbegin(r));
> +  using C1 = decltype(std::cbegin(r));
> +
> +  [] (auto x) {
> +auto r = std::views::single(x) | std::views::lazy_split(0);
> +static_assert( ! requires { (*std::ranges::cbegin(r)).front() = 42; });
> +static_assert( ! requires { (*std::cbegin(r)).front() = 42; });
> +  }(0);
> +}
> +
>  int
>  main()
>  {
>test01();
>test03();
>test04();
> +  test05();
>  }
> diff --git a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc 
> b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> index 62c347be43d..f474ab7ec99 100644
> --- a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc
> @@ -34,20 +34,21 @@ test01()
>{
>  int i = 0;
>  int j = 0;
> +
> +#if __cpp_lib_ranges_as_const
> +// These overloads mean that range and range are satisfied.
> +const int* begin() const { throw; }
> +const int* end() const { throw; }
> +#endif
> +
>  int* data() { return &j; }
>  const R* data() const noexcept { return nullptr; }
>};
>static_assert( has_cdata );
>static_

Re: [PATCH v2 3/5][STAGE1] ctf: translate annotation DIEs to internal ctf

2025-02-24 Thread David Faust




On 2/24/25 11:04, Indu Bhagat wrote:
> On 2/6/25 11:54 AM, David Faust wrote:
>> Translate DW_TAG_GNU_annotation DIEs created for C attributes
>> btf_decl_tag and btf_type_tag into an in-memory representation in the
>> CTF/BTF container.  They will be output in BTF as BTF_KIND_DECL_TAG and
>> BTF_KIND_TYPE_TAG records.
>>
>> The new CTF kinds used to represent these annotations, CTF_K_DECL_TAG
>> and CTF_K_TYPE_TAG, are expected to be formalized in the next version of
>> the CTF specification.  For now they only exist in memory as a
>> translation step to BTF, and are not emitted when generating CTF
>> information.
>>
> 
> The patch overall looks OK to me.  Some comments inlined below.
> 
>> gcc/
>>  * ctfc.cc (ctf_dtu_d_union_selector): Handle CTF_K_DECL_TAG and
>>  CTF_K_TYPE_TAG.
>>  (ctf_add_type_tag, ctf_add_decl_tag): New.
>>  (ctf_add_variable): Return the new ctf_dvdef_ref rather than zero.
>>  (new_ctf_container): Initialize new members.
>>  (ctfc_delete_container): Deallocate new members.
>>  * ctfc.h (ctf_dvdef, ctf_dvdef_t, ctf_dvdef_ref): Move forward
>>  declarations earlier in file.
>>  (ctf_decl_tag_t): New typedef.
>>  (ctf_dtdef): Add ctf_decl_tag_t member to dtd_u union.
>>  (ctf_dtu_d_union_enum): Add new CTF_DTU_D_TAG enumerator.
>>  (ctf_container): Add ctfc_tags vector and ctfc_type_tags_map hash_map
>>  members.
>>  (ctf_add_type_tag, ctf_add_decl_tag): New function protos.
>>  (ctf_add_variable): Change prototype return type to ctf_dvdef_ref.
>>  * dwarf2ctf.cc (gen_ctf_type_tags, gen_ctf_decl_tags)
>>  (gen_ctf_decl_tags_for_var): New static functions.
>>  (gen_ctf_sou_type): Handle decl tags.
>>  (gen_ctf_function_type): Likewise.
>>  (gen_ctf_variable): Likewise.
>>  (gen_ctf_function): Likewise.
>>  (gen_ctf_type): Handle type tags.
>>
>> gcc/testsuite
>>  * gcc.dg/debug/ctf/ctf-decl-tag-1.c: New test.
>>  * gcc.dg/debug/ctf/ctf-type-tag-1.c: New test.
>>
>> include/
>>  * ctf.h (CTF_K_DECL_TAG, CTF_K_TYPE_TAG): New defines.
>> ---
>>   gcc/ctfc.cc   |  70 +++-
>>   gcc/ctfc.h|  41 -
>>   gcc/dwarf2ctf.cc  | 152 +-
>>   .../gcc.dg/debug/ctf/ctf-decl-tag-1.c |  31 
>>   .../gcc.dg/debug/ctf/ctf-type-tag-1.c |  19 +++
>>   include/ctf.h |   4 +
>>   6 files changed, 303 insertions(+), 14 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c
>>   create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c
>>
>> diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
>> index 51511d69baa..c478fa0a136 100644
>> --- a/gcc/ctfc.cc
>> +++ b/gcc/ctfc.cc
>> @@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype)
>> return CTF_DTU_D_ARGUMENTS;
>>   case CTF_K_SLICE:
>> return CTF_DTU_D_SLICE;
>> +case CTF_K_DECL_TAG:
>> +case CTF_K_TYPE_TAG:
>> +  return CTF_DTU_D_TAG;
>>   default:
>> /* The largest member as default.  */
>> return CTF_DTU_D_ARRAY;
>> @@ -445,6 +448,58 @@ ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, 
>> ctf_dtdef_ref ref,
>> return dtd;
>>   }
>>   
>> +ctf_dtdef_ref
>> +ctf_add_type_tag (ctf_container_ref ctfc, uint32_t flag, const char *value,
>> +  ctf_dtdef_ref ref_dtd)
>> +{
>> +  ctf_dtdef_ref dtd;
>> +   /* Create a DTD for the tag, but do not place it in the regular types 
>> list;
>> +  CTF format does not (yet) encode tags.  */
>> +  dtd = ggc_cleared_alloc ();
>> +
>> +  dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name),
>> +  CTF_AUX_STRTAB);
>> +  /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs,
>> + e.g. when multiple distinct types specify the same type tag.  We will
>> + synthesize multiple CTF DTD records in that case, so we cannot tie them
>> + all to the same key (the DW_TAG_GNU_annotation DIE) in ctfc_types.  */
>> +  dtd->dtd_key = NULL;
>> +  dtd->ref_type = ref_dtd;
>> +  dtd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_TYPE_TAG, flag, 0);
>> +  dtd->dtd_u.dtu_tag.ref_var = NULL; /* Not used for type tags.  */
>> +  dtd->dtd_u.dtu_tag.component_idx = 0; /* Not used for type tags.  */
>> +
>> +  /* Insert tag directly into the tag list.  Type ID will be assigned 
>> later.  */
>> +  vec_safe_push (ctfc->ctfc_tags, dtd);
>> +  return dtd;
>> +}
>> +
>> +ctf_dtdef_ref
>> +ctf_add_decl_tag (ctf_container_ref ctfc, uint32_t flag, const char *value,
>> +  ctf_dtdef_ref ref_dtd, uint32_t comp_idx)
>> +{
>> +   ctf_dtdef_ref dtd;
>> +   /* Create a DTD for the tag, but do not place it in the regular types 
>> list;
>> +  ctf format does not (yet) encode tags.  */
>> +  dtd = ggc_cleared_alloc ();
>> +
>> +  dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_

Re: The COBOL front end, version 3, now in 14 easy pieces

2025-02-24 Thread James K. Lowden

On Mon, 24 Feb 2025 14:51:27 +0100
Richard Biener  wrote:

> Installing the result via make install DESTDIR=/foo I see both a
> 'gcobol' and a 'gcobc' program
> being installed - is that intentional?  

Yes, that is intentional.  gcobol is the compiler driver, as you know.
gcobc is a shell script that emulates GnuCOBOL's cobc
executable.  It is being used to facilitate compatibility with
GnuCOBOL.  

I see it's not documented.  There is quite extensive documentation for
it in a directory we chose not to supply among our patches, for
simplicity.  I will look into adapting that page, 

gcc/cobol/scripts/rocky8/migration-guide.1

to something more specific to gcobc.  

> I also see the gcobol.3 manpage reside directly in /foo/gcobol.3
> rather than in the expected /foo/usr/local/man/..

There may be something missing somewhere higher up.  I had to add the man3 
directory to the install machinery.  I will investigate.  

> sources might reside on a R/O mounted volume

working on it.  Perhaps tomorrow. 

--jkl

Re: [PATCH v2 3/5][STAGE1] ctf: translate annotation DIEs to internal ctf

2025-02-24 Thread Indu Bhagat


On 2/6/25 11:54 AM, David Faust wrote:

Translate DW_TAG_GNU_annotation DIEs created for C attributes
btf_decl_tag and btf_type_tag into an in-memory representation in the
CTF/BTF container.  They will be output in BTF as BTF_KIND_DECL_TAG and
BTF_KIND_TYPE_TAG records.

The new CTF kinds used to represent these annotations, CTF_K_DECL_TAG
and CTF_K_TYPE_TAG, are expected to be formalized in the next version of
the CTF specification.  For now they only exist in memory as a
translation step to BTF, and are not emitted when generating CTF
information.



The patch overall looks OK to me.  Some comments inlined below.


gcc/
* ctfc.cc (ctf_dtu_d_union_selector): Handle CTF_K_DECL_TAG and
CTF_K_TYPE_TAG.
(ctf_add_type_tag, ctf_add_decl_tag): New.
(ctf_add_variable): Return the new ctf_dvdef_ref rather than zero.
(new_ctf_container): Initialize new members.
(ctfc_delete_container): Deallocate new members.
* ctfc.h (ctf_dvdef, ctf_dvdef_t, ctf_dvdef_ref): Move forward
declarations earlier in file.
(ctf_decl_tag_t): New typedef.
(ctf_dtdef): Add ctf_decl_tag_t member to dtd_u union.
(ctf_dtu_d_union_enum): Add new CTF_DTU_D_TAG enumerator.
(ctf_container): Add ctfc_tags vector and ctfc_type_tags_map hash_map
members.
(ctf_add_type_tag, ctf_add_decl_tag): New function protos.
(ctf_add_variable): Change prototype return type to ctf_dvdef_ref.
* dwarf2ctf.cc (gen_ctf_type_tags, gen_ctf_decl_tags)
(gen_ctf_decl_tags_for_var): New static functions.
(gen_ctf_sou_type): Handle decl tags.
(gen_ctf_function_type): Likewise.
(gen_ctf_variable): Likewise.
(gen_ctf_function): Likewise.
(gen_ctf_type): Handle type tags.

gcc/testsuite
* gcc.dg/debug/ctf/ctf-decl-tag-1.c: New test.
* gcc.dg/debug/ctf/ctf-type-tag-1.c: New test.

include/
* ctf.h (CTF_K_DECL_TAG, CTF_K_TYPE_TAG): New defines.
---
  gcc/ctfc.cc   |  70 +++-
  gcc/ctfc.h|  41 -
  gcc/dwarf2ctf.cc  | 152 +-
  .../gcc.dg/debug/ctf/ctf-decl-tag-1.c |  31 
  .../gcc.dg/debug/ctf/ctf-type-tag-1.c |  19 +++
  include/ctf.h |   4 +
  6 files changed, 303 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c

diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
index 51511d69baa..c478fa0a136 100644
--- a/gcc/ctfc.cc
+++ b/gcc/ctfc.cc
@@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype)
return CTF_DTU_D_ARGUMENTS;
  case CTF_K_SLICE:
return CTF_DTU_D_SLICE;
+case CTF_K_DECL_TAG:
+case CTF_K_TYPE_TAG:
+  return CTF_DTU_D_TAG;
  default:
/* The largest member as default.  */
return CTF_DTU_D_ARRAY;
@@ -445,6 +448,58 @@ ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, 
ctf_dtdef_ref ref,
return dtd;
  }
  
+ctf_dtdef_ref

+ctf_add_type_tag (ctf_container_ref ctfc, uint32_t flag, const char *value,
+ ctf_dtdef_ref ref_dtd)
+{
+  ctf_dtdef_ref dtd;
+   /* Create a DTD for the tag, but do not place it in the regular types list;
+  CTF format does not (yet) encode tags.  */
+  dtd = ggc_cleared_alloc ();
+
+  dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name),
+ CTF_AUX_STRTAB);
+  /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs,
+ e.g. when multiple distinct types specify the same type tag.  We will
+ synthesize multiple CTF DTD records in that case, so we cannot tie them
+ all to the same key (the DW_TAG_GNU_annotation DIE) in ctfc_types.  */
+  dtd->dtd_key = NULL;
+  dtd->ref_type = ref_dtd;
+  dtd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_TYPE_TAG, flag, 0);
+  dtd->dtd_u.dtu_tag.ref_var = NULL; /* Not used for type tags.  */
+  dtd->dtd_u.dtu_tag.component_idx = 0; /* Not used for type tags.  */
+
+  /* Insert tag directly into the tag list.  Type ID will be assigned later.  
*/
+  vec_safe_push (ctfc->ctfc_tags, dtd);
+  return dtd;
+}
+
+ctf_dtdef_ref
+ctf_add_decl_tag (ctf_container_ref ctfc, uint32_t flag, const char *value,
+ ctf_dtdef_ref ref_dtd, uint32_t comp_idx)
+{
+   ctf_dtdef_ref dtd;
+   /* Create a DTD for the tag, but do not place it in the regular types list;
+  ctf format does not (yet) encode tags.  */
+  dtd = ggc_cleared_alloc ();
+
+  dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name),
+ CTF_AUX_STRTAB);
+  /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs,
+ e.g. when multiple distinct declarations specify the same decl tag.
+ We will synthesize multiple CTF DTD records in that case, so we cannot

Re: [PATCH v2 4/5][STAGE1] btf: generate and output DECL_TAG and TYPE_TAG records

2025-02-24 Thread Indu Bhagat


On 2/6/25 11:54 AM, David Faust wrote:

Support the btf_decl_tag and btf_type_tag attributes in BTF by creating
and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records,
respectively, for them.

Some care is required when -gprune-btf is in effect to avoid emitting
decl or type tags for declarations or types which have been pruned and
will not be emitted in BTF.



The patch itself looks OK to me.  Two comments below.


gcc/
* btfout.cc (get_btf_kind): Handle DECL_TAG and TYPE_TAG kinds.
(btf_calc_num_vbytes): Likewise.
(btf_asm_type): Likewise.
(output_asm_btf_vlen_bytes): Likewise.
(output_btf_tags): New.
(btf_output): Call it here.
(btf_add_used_type): Replace with simple wrapper around...
(btf_add_used_type_1): ...the implementation.  Handle
BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
(btf_add_vars): Update btf_add_used_type call.
(btf_assign_tag_ids): New.
(btf_mark_type_used): Update btf_add_used_type call.
(btf_collect_pruned_types): Likewise.  Handle type and decl tags.
(btf_finish): Call btf_assign_tag_ids.

gcc/testsuite/
* gcc.dg/debug/btf/btf-decl-tag-1.c: New test.
* gcc.dg/debug/btf/btf-decl-tag-2.c: New test.
* gcc.dg/debug/btf/btf-decl-tag-3.c: New test.
* gcc.dg/debug/btf/btf-decl-tag-4.c: New test.
* gcc.dg/debug/btf/btf-type-tag-1.c: New test.
* gcc.dg/debug/btf/btf-type-tag-2.c: New test.
* gcc.dg/debug/btf/btf-type-tag-3.c: New test.
* gcc.dg/debug/btf/btf-type-tag-4.c: New test.
* gcc.dg/debug/btf/btf-type-tag-5.c: New test.
* gcc.dg/debug/btf/btf-type-tag-6.c: New test.
* gcc.dg/debug/btf/btf-type-tag-c2x-1.c: New test.



I think its worth adding a testcase in the BPF backend tests for testing 
(no) interaction between CO-RE relocs with tags. E.g, when 
__attribute__((preserve_access_index)) is used on a struct with type 
tags, the type id in the CO-RE relocation records is the type and not 
the type tag. Or other tests as you see fit.


WDYT ?


include/
* btf.h (BTF_KIND_DECL_TAG, BTF_KIND_TYPE_TAG) New defines.
(struct btf_decl_tag): New.
---
  gcc/btfout.cc | 171 +++---
  .../gcc.dg/debug/btf/btf-decl-tag-1.c |  14 ++
  .../gcc.dg/debug/btf/btf-decl-tag-2.c |  22 +++
  .../gcc.dg/debug/btf/btf-decl-tag-3.c |  22 +++
  .../gcc.dg/debug/btf/btf-decl-tag-4.c |  34 
  .../gcc.dg/debug/btf/btf-type-tag-1.c |  27 +++
  .../gcc.dg/debug/btf/btf-type-tag-2.c |  15 ++
  .../gcc.dg/debug/btf/btf-type-tag-3.c |  21 +++
  .../gcc.dg/debug/btf/btf-type-tag-4.c |  25 +++
  .../gcc.dg/debug/btf/btf-type-tag-5.c |  35 
  .../gcc.dg/debug/btf/btf-type-tag-6.c |  15 ++
  .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c |  23 +++
  include/btf.h |  14 ++
  13 files changed, 414 insertions(+), 24 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index ff7ea42a961..c00e0c98015 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -141,6 +141,8 @@ get_btf_kind (uint32_t ctf_kind)
  case CTF_K_VOLATILE: return BTF_KIND_VOLATILE;
  case CTF_K_CONST:return BTF_KIND_CONST;
  case CTF_K_RESTRICT: return BTF_KIND_RESTRICT;
+case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG;
+case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG;
  default:;
  }
return BTF_KIND_UNKN;
@@ -217,6 +219,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
  case BTF_KIND_CONST:
  case BTF_KIND_RESTRICT:
  case BTF_KIND_FUNC:
+case BTF_KIND_TYPE_TAG:
  /* These kinds have no vlen data.  */
break;
  
@@ -256,6 +259,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)

vlen_bytes += vlen * sizeof (struct btf_var_secinfo);
break;
  
+case BTF_KIND_DECL_TAG:

+  vlen_bytes += sizeof (struct btf_decl_tag);
+  break;
+
  default:
break;
  }
@@ -452,6 +459,20 @@ btf_asm_type (ctf_dtdef_ref dtd)
 and should write 0.  */
dw2_asm_output_data (4, 0, "(unused)");
return;
+case BTF_KIND_DECL_TAG:
+  {
+

Re: [PATCH] [testsuite] adjust expectations of x86 vect-simd-clone tests

2025-02-24 Thread Mike Stump

On Feb 13, 2025, at 1:51 AM, Alexandre Oliva  wrote:
> 
> Some vect-simd-clone tests fail when targeting ancient x86 variants,
> because the expected transformations only take place with -msse4 or
> higher.
> 
> So arrange for these tests to take an -msse4 option on x86, so that
> the expected vectorization takes place, but decay to a compile test if
> vect.exp would enable execution but the target doesn't have an sse4
> runtime.  This requires the new dg-do-if to override the action on a
> target while retaining the default action on others, instead of
> disabling the test.
> 
> We can count on avx512f compile-time support for these tests, because
> vect_simd_clones requires that on x86, and that implies sse4 support,
> so we need not complicate the scan conditionals with tests for sse4,
> except on the last test.
> 
> Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
> x86_64-elf, targeting a cpu without sse4 support.  Ok to install?

Ok.

If you or others notice needed follow on work, ok for that as it should be easy 
enough to figure out once the bulk is in.

I thought I saw one more needing review. I accidentally trashed the message 
while catching up and went to find it via name and no other comments. I didn't 
find it. Feel free to reping it.

Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes

2025-02-24 Thread Jeff Law





On 2/24/25 9:43 AM, Michael Matz wrote:



It depends, but even if that were no problem in some specific cases
(perhaps by ensuring that such sequence isn't intermingled with insns that
are in different EH regions) that doesn't seem to be what the proposal is
doing?  From the description it moves a EH note from a load (where it
presumably was for a reason) that occurs after a store to something that
so happens to be the (new) last instruction of a bb.  Because otherwise
the load couldn't be moved earlier into the middle of a BB.  In essence
the load would lose the EH note and a new EH note would be generated for a
different insn out of thin air.  I don't see how that can ever be a
correct transformation in general.  What _if_ the very load traps then?
What if the new insn cannot trap?

(Note that loads and stores may trap differently even on the same address,
e.g. write to read-only mem, or an misaligned load but an aligned (or
smaller-sized) write; though I'll say that GCC assumes that loads cannot
trap if shadowed by an equivalent write, in some other passes)
Good points.  The alignment case was just starting to rattle around in 
my head.It may simply be the case we can't do much here.


jeff

[PATCH] RISC-V: Avoid updating vl right before branching on avl

2025-02-24 Thread Edwin Lu

See [1] thread for original patch which spawned this one.

We are currently seeing the following code where we perform a vsetvl
before a branching instruction against the avl.

vsetvli a5,a1,e32,m1,tu,ma
vle32.v v2,0(a0)
sub a1,a1,a5 <-- a1 potentially set to 0
sh2add  a0,a5,a0
vfmacc.vv   v1,v2,v2
vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
beq a1,zero,.L12 <-- check if avl is 0

Since we are branching off of the avl, we don't need to update vl until
after the branch is taken. Search the ready queue for vsetvls scheduled
before branching instructions that branch off of the same regno and
promote the branches to execute first. This can improve performancy by
potentially avoiding setting VL=0 which may be expensive on some uarches.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html

PR/117974

gcc/ChangeLog:

* config/riscv/riscv.cc (vsetvl_avl_regno): New helper function.
(insn_increases_zeroness_p): Ditto.
(riscv_promote_ready): Ditto.
(riscv_sched_reorder): Implement hook.
(TARGET_SCHED_REORDER): Define Hook.
* config/riscv/riscv.opt: New flag.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.

Signed-off-by: Edwin Lu 
Co-authored-by: Palmer Dabbelt 
---
 gcc/config/riscv/riscv.cc | 103 ++
 gcc/config/riscv/riscv.opt|   4 +
 .../gcc.target/riscv/rvv/vsetvl/pr117974.c|  15 +++
 3 files changed, 122 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr117974.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 89aa25d5da9..cf0866fa3fb 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -14035,6 +14035,106 @@ bool need_shadow_stack_push_pop_p ()
   return is_zicfiss_p () && riscv_save_return_addr_reg_p ();
 }

+static int
+vsetvl_avl_regno(rtx_insn *insn)
+{
+  if (recog_memoized (insn) < 0)
+return -1;
+
+  if (get_attr_type (insn) != TYPE_VSETVL
+  && get_attr_type (insn) != TYPE_VSETVL_PRE)
+return -1;
+
+  extract_insn (insn);
+  /* From vector.md, vsetvl operands are as follows:
+  ;; operands[0]: VL.
+  ;; operands[1]: AVL.
+  ;; operands[2]: SEW
+  ;; operands[3]: LMUL
+  ;; operands[4]: Tail policy 0 or 1 (undisturbed/agnostic)
+  ;; operands[5]: Mask policy 0 or 1 (undisturbed/agnostic)
+ Return regno of avl operand.  */
+  return REGNO (recog_data.operand[1]);
+}
+
+static bool
+insn_increases_zeroness_p(rtx_insn *insn, int regno)
+{
+  /* Check for branching against zero.  */
+  if (JUMP_P (insn))
+{
+  extract_insn (insn);
+  bool match_reg = false;
+  bool comp_zero = false;
+  for (int i = 0; i < recog_data.n_operands; i++)
+   {
+ if (REG_P (recog_data.operand[i])
+ && REGNO (recog_data.operand[i]) == regno)
+   match_reg = true;
+ if (CONST_INT_P (recog_data.operand[i])
+ && XINT (recog_data.operand[i], 0) == 0
+ && XWINT (recog_data.operand[i], 0) == 0)
+   comp_zero = true;
+   }
+  return match_reg && comp_zero;
+}
+  return false;
+}
+
+/* Copied from MIPS.  Removes the instruction at index LOWER from ready
+   queue READY and reinserts it in from of the instruction at index
+   HIGHER.  LOWER must be <= HIGHER.  */
+static void
+riscv_promote_ready (rtx_insn **ready, int lower, int higher)
+{
+  rtx_insn *new_head;
+  int i;
+
+  new_head = ready[lower];
+  for (i = lower; i < higher; i++)
+ready[i] = ready[i + 1];
+  ready[i] = new_head;
+}
+
+/* Attempt to avoid issuing VSETVL-type instructions before a branch that
+   ensures they are non-zero, as setting VL=0 dynamically can be slow.  */
+static int
+riscv_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+rtx_insn **ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+{
+if (! TARGET_AVOID_VL_EQ_0)
+  return riscv_issue_rate ();
+
+for (int i = *nreadyp - 1; i >= 0; i--)
+  {
+   /* Find the vsetvl.  */
+   int avl_regno = vsetvl_avl_regno (ready[i]);
+   if (avl_regno == -1 || i == 0)
+ continue;
+   for (int j = i - 1; j >= 0; j--)
+ {
+   /* Exit if another vsetvl is found before finding a branch insn
+  in the ready queue.  */
+   if (recog_memoized (ready[j]) >= 0
+   && get_attr_type (ready[j]) == TYPE_VSETVL
+   && get_attr_type (ready[j]) == TYPE_VSETVL_PRE)
+ break;
+   /* Find branch.  */
+   if (recog_memoized (ready[j]) >= 0
+   && insn_increases_zeroness_p (ready[j], avl_regno))
+ {
+   /* Right now the only zeroness-increasing pattern we recognize
+  is a branch-not-zero, so there's no sense in looking for any
+

Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl

2025-02-24 Thread Edwin Lu




On 2/24/2025 4:34 PM, Jeff Law wrote:



On 2/24/25 5:07 PM, Edwin Lu wrote:

See [1] thread for original patch which spawned this one.

We are currently seeing the following code where we perform a vsetvl
before a branching instruction against the avl.

 vsetvli a5,a1,e32,m1,tu,ma
 vle32.v v2,0(a0)
 sub a1,a1,a5 <-- a1 potentially set to 0
 sh2add  a0,a5,a0
 vfmacc.vv   v1,v2,v2
 vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl 
to 0

 beq a1,zero,.L12 <-- check if avl is 0

Since we are branching off of the avl, we don't need to update vl until
after the branch is taken. Search the ready queue for vsetvls scheduled
before branching instructions that branch off of the same regno and
promote the branches to execute first. This can improve performancy by
potentially avoiding setting VL=0 which may be expensive on some 
uarches.


[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html

PR/117974

gcc/ChangeLog:

* config/riscv/riscv.cc (vsetvl_avl_regno): New helper function.
(insn_increases_zeroness_p): Ditto.
(riscv_promote_ready): Ditto.
(riscv_sched_reorder): Implement hook.
(TARGET_SCHED_REORDER): Define Hook.
* config/riscv/riscv.opt: New flag.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.
So I preferred the earlier approach of disabling speculation of the 
vsetvls, though I'm guessing you're looking at this approach because 
that was insufficient?
I don't think that it was because it was insufficient, but that it might 
be too constraining. I'm not exactly certain if there is the possibility 
where speculatively issuing some vsetvls in some situations could 
improve perf. I think with this patch, it targets the specific issue 
we're facing directly which is updating vl=0 when it's unnecessary. I 
don't know which patch would be better so I'm putting this out there as 
a potential alternative.
Regardless, like the disabling of speculation I tend to think this 
should be controlled by an entry in the tuning structure rather than 
flags. Flags are fine for internal testing, but I don't see a 
compelling need for a flag to control this in general.


Jeff


That makes sense to me. I completely forgot that we had talked about 
putting it into the tuning structure when writing this. I'll update the 
tunes respectively in a later version of the patches depending on which 
one moves forward.


Edwin

Contents of PO file 'cpplib-15-b20250216.ro.po'

2025-02-24 Thread Translation Project Robot



cpplib-15-b20250216.ro.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Romanian PO file for 'cpplib' (version 15-b20250216)

2025-02-24 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Romanian team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/ro.po

(This file, 'cpplib-15-b20250216.ro.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl

2025-02-24 Thread Jeff Law





On 2/24/25 6:10 PM, Edwin Lu wrote:

So I preferred the earlier approach of disabling speculation of the 
vsetvls, though I'm guessing you're looking at this approach because 
that was insufficient?
I don't think that it was because it was insufficient, but that it might 
be too constraining. I'm not exactly certain if there is the possibility 
where speculatively issuing some vsetvls in some situations could 
improve perf.
On an out of order core I would expect the hardware to do the 
speculation and doing it in the compiler is of marginal value, if any.




 I think with this patch, it targets the specific issue
we're facing directly which is updating vl=0 when it's unnecessary. I 
don't know which patch would be better so I'm putting this out there as 
a potential alternative.
I suspect (but haven't verified) that the only time you'll see the 
conditional branch and the vsetvl in the ready queue together is in the 
speculative case.  Now it may be the case that you can be more targeted 
in inhibiting speculation, but again, for an out of order core I don't 
think it's actually going to matter.


Essentially when we aren't speculating the conditional branch will be 
dependent on all the instructions in the block.  So the conditional 
branch won't go into the ready queue until after everything else has issued.


Given we're not planning to move forward until stage1 reopens, you've 
got time to play with it internally. I've also got a mental todo to 
check with our designers to see if we have any sensitivity to this issue.


jeff

Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl

2025-02-24 Thread Jeff Law





On 2/24/25 5:07 PM, Edwin Lu wrote:

See [1] thread for original patch which spawned this one.

We are currently seeing the following code where we perform a vsetvl
before a branching instruction against the avl.

 vsetvli a5,a1,e32,m1,tu,ma
 vle32.v v2,0(a0)
 sub a1,a1,a5 <-- a1 potentially set to 0
 sh2add  a0,a5,a0
 vfmacc.vv   v1,v2,v2
 vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
 beq a1,zero,.L12 <-- check if avl is 0

Since we are branching off of the avl, we don't need to update vl until
after the branch is taken. Search the ready queue for vsetvls scheduled
before branching instructions that branch off of the same regno and
promote the branches to execute first. This can improve performancy by
potentially avoiding setting VL=0 which may be expensive on some uarches.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html

PR/117974

gcc/ChangeLog:

* config/riscv/riscv.cc (vsetvl_avl_regno): New helper function.
(insn_increases_zeroness_p): Ditto.
(riscv_promote_ready): Ditto.
(riscv_sched_reorder): Implement hook.
(TARGET_SCHED_REORDER): Define Hook.
* config/riscv/riscv.opt: New flag.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.
So I preferred the earlier approach of disabling speculation of the 
vsetvls, though I'm guessing you're looking at this approach because 
that was insufficient?


Regardless, like the disabling of speculation I tend to think this 
should be controlled by an entry in the tuning structure rather than 
flags.  Flags are fine for internal testing, but I don't see a 
compelling need for a flag to control this in general.


Jeff

Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl

2025-02-24 Thread Vineet Gupta

On 2/24/25 16:07, Edwin Lu wrote:
> See [1] thread for original patch which spawned this one.
>
> We are currently seeing the following code where we perform a vsetvl
> before a branching instruction against the avl.
>
> vsetvli a5,a1,e32,m1,tu,ma
> vle32.v v2,0(a0)
> sub a1,a1,a5 <-- a1 potentially set to 0
> sh2add  a0,a5,a0
> vfmacc.vv   v1,v2,v2
> vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
> beq a1,zero,.L12 <-- check if avl is 0
>
> Since we are branching off of the avl, we don't need to update vl until
> after the branch is taken. Search the ready queue for vsetvls scheduled
> before branching instructions that branch off of the same regno and
> promote the branches to execute first. This can improve performancy by
> potentially avoiding setting VL=0 which may be expensive on some uarches.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html
>
>   PR/117974
>
> gcc/ChangeLog:
>
>   * config/riscv/riscv.cc (vsetvl_avl_regno): New helper function.
>   (insn_increases_zeroness_p): Ditto.
>   (riscv_promote_ready): Ditto.
>   (riscv_sched_reorder): Implement hook.
>   (TARGET_SCHED_REORDER): Define Hook.
>   * config/riscv/riscv.opt: New flag.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.
>
> Signed-off-by: Edwin Lu 
> Co-authored-by: Palmer Dabbelt 
> ---
>  gcc/config/riscv/riscv.cc | 103 ++
>  gcc/config/riscv/riscv.opt|   4 +
>  .../gcc.target/riscv/rvv/vsetvl/pr117974.c|  15 +++
>  3 files changed, 122 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr117974.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 89aa25d5da9..cf0866fa3fb 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -14035,6 +14035,106 @@ bool need_shadow_stack_push_pop_p ()
>return is_zicfiss_p () && riscv_save_return_addr_reg_p ();
>  }
>
> +static int
> +vsetvl_avl_regno(rtx_insn *insn)
> +{
> +  if (recog_memoized (insn) < 0)
> +return -1;
> +
> +  if (get_attr_type (insn) != TYPE_VSETVL
> +  && get_attr_type (insn) != TYPE_VSETVL_PRE)
> +return -1;
> +
> +  extract_insn (insn);
> +  /* From vector.md, vsetvl operands are as follows:
> +  ;; operands[0]: VL.
> +  ;; operands[1]: AVL.
> +  ;; operands[2]: SEW
> +  ;; operands[3]: LMUL
> +  ;; operands[4]: Tail policy 0 or 1 (undisturbed/agnostic)
> +  ;; operands[5]: Mask policy 0 or 1 (undisturbed/agnostic)
> + Return regno of avl operand.  */
> +  return REGNO (recog_data.operand[1]);
> +}
> +
> +static bool
> +insn_increases_zeroness_p(rtx_insn *insn, int regno)

Mnir point, but this is just one case of introducing VL=0 zeroness specific to
branch, not the general case.

> +{
> +  /* Check for branching against zero.  */
> +  if (JUMP_P (insn))
> +{
> +  extract_insn (insn);
> +  bool match_reg = false;
> +  bool comp_zero = false;
> +  for (int i = 0; i < recog_data.n_operands; i++)
> + {
> +   if (REG_P (recog_data.operand[i])
> +   && REGNO (recog_data.operand[i]) == regno)
> + match_reg = true;
> +   if (CONST_INT_P (recog_data.operand[i])
> +   && XINT (recog_data.operand[i], 0) == 0
> +   && XWINT (recog_data.operand[i], 0) == 0)
> + comp_zero = true;
> + }
> +  return match_reg && comp_zero;
> +}
> +  return false;
> +}
> +
> +/* Copied from MIPS.  Removes the instruction at index LOWER from ready
> +   queue READY and reinserts it in from of the instruction at index
> +   HIGHER.  LOWER must be <= HIGHER.  */
> +static void
> +riscv_promote_ready (rtx_insn **ready, int lower, int higher)
> +{
> +  rtx_insn *new_head;
> +  int i;
> +
> +  new_head = ready[lower];
> +  for (i = lower; i < higher; i++)
> +ready[i] = ready[i + 1];
> +  ready[i] = new_head;
> +}
> +
> +/* Attempt to avoid issuing VSETVL-type instructions before a branch that
> +   ensures they are non-zero, as setting VL=0 dynamically can be slow.  */
> +static int
> +riscv_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose 
> ATTRIBUTE_UNUSED,
> +  rtx_insn **ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
> +{
> +if (! TARGET_AVOID_VL_EQ_0)
> +  return riscv_issue_rate ();
> +
> +for (int i = *nreadyp - 1; i >= 0; i--)
> +  {
> + /* Find the vsetvl.  */
> + int avl_regno = vsetvl_avl_regno (ready[i]);
> + if (avl_regno == -1 || i == 0)
> +   continue;
> + for (int j = i - 1; j >= 0; j--)
> +   {
> + /* Exit if another vsetvl is found before finding a branch insn
> +in the ready queue.  */
> + if (recog_memoized (ready[j]) >= 0
> + && get_attr_type (ready[j]) == TYPE_VSETVL
> + && get_attr_type (ready[j]) == TYPE_VSETVL_PRE)
> +

Reply

2025-02-24 Thread Farhan Faruqui

Hello,

Is this email address still the most reliable way to get in touch with you?


*Farhan Faruqui*

Re: [PATCH] c++, v2: Fix range for with PMFs [PR118923]

2025-02-24 Thread Jason Merrill


On 2/24/25 10:01 AM, Jakub Jelinek wrote:

On Mon, Feb 24, 2025 at 08:29:31AM -0500, Jason Merrill wrote:

On 2/24/25 5:52 AM, Jakub Jelinek wrote:

The following testcases segfault because the new range for -frange-for-ext-temps
temporary extension extends even the internal TARGET_EXPRs created by
get_member_function_from_ptrfunc.

The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P.
I'm not calling get_internal_target_expr because I really want to force
the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR
internal,


Hmm, good point, but that seems like a reason to use force_target_expr in
gen_internal_target_expr.


and get_internal_target_expr doesn't take complain argument
either and always uses tf_warning_or_error.


That shouldn't be an issue for any internal temporaries; complain is only
relevant for building the cleanup that internal temps must not have.

It looks like there are a few more internal uses of force_target_expr in
cp_finish_decl and build_comparison_op.


So like this if it passes full bootstrap/regtest?
So far passed make check-g++ in gcc (98,11,14,17,20,23,26) and make check
in libstdc++-v3.


OK.


2025-02-24  Jakub Jelinek  

PR c++/118923
* tree.cc (get_internal_target_expr): Use force_target_expr
instead of build_target_expr_with_type.
* typeck.cc (get_member_function_from_ptrfunc): Use
get_internal_target_expr instead of force_target_expr.
* decl.cc (cp_finish_decl): Likewise.
* method.cc (build_comparison_op): Likewise.

* g++.dg/cpp0x/pr118923.C: New test.
* g++.dg/cpp1y/pr118923.C: New test.

--- gcc/cp/tree.cc.jj   2025-02-24 00:06:25.776732504 +0100
+++ gcc/cp/tree.cc  2025-02-24 15:04:28.991381102 +0100
@@ -982,8 +982,7 @@ tree
  get_internal_target_expr (tree init)
  {
init = convert_bitfield_to_declared_type (init);
-  tree t = build_target_expr_with_type (init, TREE_TYPE (init),
-   tf_warning_or_error);
+  tree t = force_target_expr (TREE_TYPE (init), init, tf_warning_or_error);
TARGET_EXPR_INTERNAL_P (t) = true;
return t;
  }
--- gcc/cp/typeck.cc.jj 2025-01-23 11:17:39.651880614 +0100
+++ gcc/cp/typeck.cc2025-02-24 15:05:27.817562229 +0100
@@ -4219,16 +4219,14 @@ get_member_function_from_ptrfunc (tree *
  && !DECL_P (instance_ptr)
  && !TREE_CONSTANT (instance_ptr)))
instance_ptr = instance_save_expr
- = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr,
-  complain);
+ = get_internal_target_expr (instance_ptr);
  
/* See above comment.  */

if (TREE_SIDE_EFFECTS (function)
  || (!nonvirtual
  && !DECL_P (function)
  && !TREE_CONSTANT (function)))
-   function
- = force_target_expr (TREE_TYPE (function), function, complain);
+   function = get_internal_target_expr (function);
  
/* Start by extracting all the information from the PMF itself.  */

e3 = pfn_from_ptrmemfunc (function);
--- gcc/cp/decl.cc.jj   2025-02-24 00:06:25.665734038 +0100
+++ gcc/cp/decl.cc  2025-02-24 15:08:52.027729921 +0100
@@ -9377,8 +9377,7 @@ cp_finish_decl (tree decl, tree init, bo
  tree guard = NULL_TREE;
  if (cleanups || cleanup)
{
- guard = force_target_expr (boolean_type_node,
-boolean_false_node, tf_none);
+ guard = get_internal_target_expr (boolean_false_node);
  add_stmt (guard);
  guard = TARGET_EXPR_SLOT (guard);
}
@@ -9407,8 +9406,7 @@ cp_finish_decl (tree decl, tree init, bo
 popped that all, so push those extra cleanups around
 the whole sequence with a guard variable.  */
  gcc_assert (TREE_CODE (sl) == STATEMENT_LIST);
- guard = force_target_expr (integer_type_node,
-integer_zero_node, tf_none);
+ guard = get_internal_target_expr (integer_zero_node);
  add_stmt (guard);
  guard = TARGET_EXPR_SLOT (guard);
  for (unsigned i = 0; i < n_extra_cleanups; ++i)
--- gcc/cp/method.cc.jj 2025-01-15 08:59:46.383217835 +0100
+++ gcc/cp/method.cc2025-02-24 15:10:18.674537705 +0100
@@ -1597,7 +1597,7 @@ build_comparison_op (tree fndecl, bool d
  /* Some other array, will need runtime loop.  */
  else
{
- idx = force_target_expr (sizetype, maxval, complain);
+ idx = get_internal_target_expr (maxval);
  loop_indexes = tree_cons (idx, NULL_TREE, loop_indexes);
}
  expr_type = TREE_TYPE (expr_type);
--- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 15:02:04.966385945 
+0100
+++ gcc/testsuite/g++.dg/c

Re: The COBOL front end, version 3, now in 14 easy pieces

2025-02-24 Thread James K. Lowden

On Mon, 24 Feb 2025 14:51:27 +0100
Richard Biener  wrote:

> On Wed, Feb 19, 2025 at 12:38?AM James K. Lowden
>  wrote:
> >
> > The following 14 patches constitute 105,720 lines of code in 83
> > files to build and document the COBOL front end.  
[...]
> > I tested these patches using "git apply" to an unpublished branch
> > "cobol-patched".
> 
> I have now built the compiler from the (now published) cobol-patched
> branch.

I rebuilt cobol-patched today. It now includes changes to gcc/doc/, and
a new gcc/Makefile.in (previously omitted because I thought it was
generated).  That deals with some but not all the issues below. 

> On x86_64-linux and noticed the following issues:
> 
> The toplevel configure script on the branch wasn't re-generated (duh),
> so it isn't
> ready for use.

Since we're not mailing patches, cobol-patched is now built by applying 
patches that include the configure scripts. 

> gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1,
> but have 3.0.4
>  %require "3.5.1"  //3.8.2 also works, but not 3.8.0
>   ^^^
> 
> this requirement isn't documented, neither is a version requirement
> for flex.  The
> place for this is in gcc/doc/install.texi

I think you will find the updated .texi files comprehesively (!)
reflect the presence of gcobol and its requirements.  

There is a hitch.  The patches are against master branch as of 5
February.  Many changes occurred in the time since.  Now that I know
that, I *expect* a conflict extravaganza as we pull that together.  

Best IMO is for us to update our repository, i.e. to merge master into
our master+cobol, then to cobol-stage, then to cobol-patched.
Questions: 

1. How often should we do that, would you say?  
2. Is cobol-patched more useful to you right now as-is, or should I
regenerate it tomorrow?  

> During the build we write to
> 
> gcc/cobol/charmaps-dupe.cc
> gcc/cobol/valconv-dupe.cc

Not yet fixed.  Top of the list, now. 

> Installing the result via make install DESTDIR=/foo I see both a
> 'gcobol' and a 'gcobc' program
> being installed - is that intentional?  

Yes, as explained prior, gcobc is an emulation script. 

> I also see the gcobol.3
> manpage reside directly in /foo/gcobol.3 rather
> than in the expected /foo/usr/local/man/..

gcc/Makefile.in didn't appear in the patches.  I now have installed: 

$ find /usr/local/cobol-patched/ -name '*cobol*'
/usr/local/cobol-patched/
/usr/local/cobol-patched/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/cobol1
/usr/local/cobol-patched/bin/gcobol
/usr/local/cobol-patched/share/man/man1/gcobol.1
/usr/local/cobol-patched/share/man/man3/gcobol.3
/usr/local/cobol-patched/share/gcobol

> Installing of libgcobol fails for me:

Hmm, a little worse for me.  As you see above, the installed cobol files
don't include the library.  This was from "make install".  Checking. 

> so the suppression seems incomplete?

Or too complete.  :-/

> Doing
> 
> > ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32
> 
> only fails during linking as we do build a 32bit binary but do not
> find (the not built) 32bit runtime.  I suspect when writing a program
> that would need 128bit integer or float we'd eventually ICE.

I think the cobol-patched you were using was missing our patch of
Friday.  Our intention is to prevent any 32-bit binary from being
generated, in addition to not attempting to build the library.  

But we're not there yet.  On my system, configured as

  $ ../configure --prefix=/usr/local/gcc-cobol --disable-bootstrap
--disable-generated-files-in-srcdir --disable-multilib
--enable-checking --enable-languages=c,cobol

I see failure to link [eliding the "skipping" messages]:

$ gcobol -oo -ffixed-form -m32 gcc/cobol/nist/NC/NC101A.cbl 
/usr/bin/ld: cannot find -lgcobol: No such file or directory
/usr/bin/ld: cannot find -lstdc++: No such file or directory
/usr/bin/ld: cannot find -lgcc: No such file or directory
/usr/bin/ld: cannot find -lgcc_s: No such file or directory

If compiled with -c, 

$ gcobol -oo -c -m32 -ffixed-form ../parser/gcc/cobol/nist/NC/NC101A.cbl
$ file o o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not 
stripped

Not our intended result.  :-(  But, fixable.  

> The cobol frontend should attempt to intercept the selection of a
> not supported multilib (but don't ask me how) and gracefully exit,
> possibly calling
> 
>  sorry ("selected multilib not available for Cobol");
> 
> or so.  There's the LANG_HOOKS_POST_OPTIONS langhook which
> might be a place to do this.  The way to check is likely whether
> a mode for __int128 exists and is supported.  The C fronted does
> 
>   if (targetm.scalar_mode_supported_p (TImode))
> 
> it also checks float128_type_node, but I didn't find who initializes
> that.

Thank you for the pointer.  It's late here today.  I will try that in
the morning if Bob hasn't already solved it by then.  

Regards, 

--jkl

Re: [PATCH] RISC-V: Fix .cfi_offset directive when push pop in zc

2025-02-24 Thread Kito Cheng

Thanks, will push after verified on my hand :)

On Mon, Feb 24, 2025 at 3:20 PM Hsing Yu Peng  wrote:
>
> Address Fei's comment.
>
> Thanks
> Lino
>
> Fei Gao  於 2025年2月24日 週一 下午2:20寫道：
> >
> > The case you hit is s2 set in frame mask but s1 not. So you're trying to 
> > set s1 bit manually.
> > If so, riscv_adjust_multi_push_cfi_prologue seems the better place for your 
> > change.
> >
> > static rtx
> > riscv_adjust_multi_push_cfi_prologue (int saved_size)
> > {
> >   rtx dwarf = NULL_RTX;
> >   rtx adjust_sp_rtx, reg, mem, insn;
> >   unsigned int mask = cfun->machine->frame.mask;
> >   int offset;
> >   int saved_cnt = 0;
> >
> > -  if (mask & S10_MASK)
> > -mask |= S11_MASK; /* this corner case is covered in your for loop 
> > below.  */
> >
> > + unsigned num_multi_push = riscv_multi_push_regs_count 
> > (frame->mask);
> > + for (int i = 0; i < num_multi_push; i++)
> > +   {
> > + gcc_assert (zcmp_save_reg_order[i] != INVALID_REGNUM);
> > + mask |= 1 << (zcmp_save_reg_order[i] - GP_REG_FIRST); /* here 
> > mask is used
> > 
> >  instead of frame->mask*/
> > +   }
> > ...
> >
> > BR
> > Fei

Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-24 Thread Robin Dapp

> This patch would like to perform the overflow to smode check before IOR
> the base2 series, and perform the clean highest bit if the const_vector
> overflow to smode occurs.  If no overflow, there will do nothing here.

I agree with the general idea but I'm a bit wary fiddling with the coefficients
directly.  I think for a fixed-size, non VLA vector it should be sufficient to
check whether the last step overflows.

For VLA we actually know the largest possible runtime vector length and could
use it the check?  But maybe it's probably easier to bail with VLA anyway.

In addition: if we're already checking the step anyway we could get rid of the
negative step restriction as well.

-- 
Regards
 Robin

Re: [PATCH] analyzer: Handle nonnull_if_nonzero attribute [PR117023]

2025-02-24 Thread Jakub Jelinek

On Fri, Feb 14, 2025 at 09:14:48AM -0500, David Malcolm wrote:
> I haven't used ranger yet within the analyzer, and I wonder if there is
> a philosophical divide here between the goals of optimization versus
> bug finding: the optimizer makes use of undefined behavior in order to
> add assumptions about the state of the code ("X can't happen"), whereas
> the analyzer ought to report on places where undefined behavior could
> happen ("if X happened here, it would be bad, let's warn the user").

You're right, that is why also using the ranger in the __builtin_*object_size
handling and sanitizers is problematic.
Maybe we'll need a ranger mode which only looks at clearly guaranteed things
rather than exploiting undefined behavior (including whatever is saved in
SSA_NAME_RANGE_INFO).

> One way of expressing the conditional nature of this property in the
> analyzer would be to "bifurcate" the analysis so that there are two
> out-edges in the exploded graph, covering the ptr == NULL vs ptr !=
> NULL conditions explicitly and separately.  But that's fiddly, alas,
> and not something to be doing for GCC 15 at this stage (sorry).

Indeed.

> I see that the code from 
>state_t state = sm_ctxt.get_state (stmt, arg);
> onwards seems to be a copy-and-paste of the existing code from within
> the
>/* Handle "__attribute__((nonnull))".   */
> suite.  malloc_state_machine::on_stmt is already longer than I'm
> comfortable with, so please can you introduce a subroutine to avoid the
> copy-and-paste (probably a private member function of
> malloc_state_machine).
> 
> OK for trunk with that change (assuming usual testing, of course)

Here is what I've committed after another bootstrap/regtest on
x86_64-linux and i686-linux.

2025-02-24  Jakub Jelinek  

PR c/117023
gcc/analyzer/
* sm-malloc.cc (malloc_state_machine::handle_nonnull): New private
method.
(malloc_state_machine::on_stmt): Use it for nonnull attribute arguments.
Handle also nonnull_if_nonzero attributes.
gcc/testsuite/
* c-c++-common/analyzer/call-summaries-malloc.c
(test_use_without_check): Pass 4 rather than sz to memset.
* c-c++-common/analyzer/strncpy-1.c (test_null_dst,
test_null_src): Pass 42 rather than count to strncpy.

--- gcc/analyzer/sm-malloc.cc.jj2024-10-24 22:56:14.58433 +0200
+++ gcc/analyzer/sm-malloc.cc   2025-02-24 00:28:51.55033 +0100
@@ -501,6 +501,12 @@ private:
   void on_zero_assignment (sm_context &sm_ctxt,
   const gimple *stmt,
   tree lhs) const;
+  void handle_nonnull (sm_context &sm_ctx,
+  const supernode *node,
+  const gimple *stmt,
+  tree fndecl,
+  tree arg,
+  unsigned i) const;
 
   /* A map for consolidating deallocators so that they are
  unique per deallocator FUNCTION_DECL.  */
@@ -2004,6 +2010,43 @@ malloc_state_machine::maybe_assume_non_n
 }
 }
 
+/* Helper method for malloc_state_machine::on_stmt.  Handle a single
+   argument (Ith argument ARG) if it is nonnull or nonnull_if_nonzero
+   and size is nonzero.  */
+
+void
+malloc_state_machine::handle_nonnull (sm_context &sm_ctxt,
+ const supernode *node,
+ const gimple *stmt,
+ tree fndecl,
+ tree arg,
+ unsigned i) const
+{
+  state_t state = sm_ctxt.get_state (stmt, arg);
+  /* Can't use a switch as the states are non-const.  */
+  /* Do use the fndecl that caused the warning so that the
+ misused attributes are printed and the user not confused.  */
+  if (unchecked_p (state))
+{
+  tree diag_arg = sm_ctxt.get_diagnostic_tree (arg);
+  sm_ctxt.warn (node, stmt, arg,
+   make_unique (*this, diag_arg, fndecl,
+   i));
+  const allocation_state *astate
+   = as_a_allocation_state (state);
+  sm_ctxt.set_next_state (stmt, arg, astate->get_nonnull ());
+}
+  else if (state == m_null)
+{
+  tree diag_arg = sm_ctxt.get_diagnostic_tree (arg);
+  sm_ctxt.warn (node, stmt, arg,
+   make_unique (*this, diag_arg, fndecl, i));
+  sm_ctxt.set_next_state (stmt, arg, m_stop);
+}
+  else if (state == m_start)
+maybe_assume_non_null (sm_ctxt, arg, stmt);
+}
+
 /* Implementation of state_machine::on_stmt vfunc for malloc_state_machine.  */
 
 bool
@@ -2118,37 +2161,37 @@ malloc_state_machine::on_stmt (sm_contex
   just the specified pointers.  */
if (bitmap_empty_p (nonnull_args)
|| bitmap_bit_p (nonnull_args, i))
- {
-   state_t state = sm_ctxt.get_state (stmt, arg);
-   /* Can't use a switch as the states a

[committed] openmp: Fix diagnostics typo [PR118993]

2025-02-24 Thread Jakub Jelinek

Hi!

There is a typo in one of the OpenMP gimplification diagnostics messages.
The following patch fixes that and adjusts tests which just copied that
message including typo to dg-warning regexps in 2 tests.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2025-02-24  Jakub Jelinek  

PR middle-end/118993
* gimplify.cc (gimplify_scan_omp_clauses): Fix diagnostics typo,
undfined -> undefined.

* c-c++-common/gomp/allocate-18.c: Adjust dg-warning regex for
diagnostics typo fix.
* gfortran.dg/gomp/allocate-clause.f90: Likewise.

--- gcc/gimplify.cc.jj  2025-01-30 22:09:17.523382838 +0100
+++ gcc/gimplify.cc 2025-02-23 23:59:56.021122542 +0100
@@ -14007,7 +14007,7 @@ gimplify_scan_omp_clauses (tree *list_p,
  && (code == OMP_TARGET || code == OMP_TASK || code == 
OMP_TASKLOOP))
warning_at (OMP_CLAUSE_LOCATION (c), OPT_Wopenmp,
"allocator with access trait set to % "
-   "results in undfined behavior for %qs directive",
+   "results in undefined behavior for %qs directive",
code == OMP_TARGET ? "target"
   : (code == OMP_TASK
  ? "task" : "taskloop"));
--- gcc/testsuite/c-c++-common/gomp/allocate-18.c.jj2024-12-04 
00:45:08.606203630 +0100
+++ gcc/testsuite/c-c++-common/gomp/allocate-18.c   2025-02-24 
00:00:08.043956210 +0100
@@ -36,16 +36,16 @@ test1 ()
x[0] = 1;
 
   #pragma omp target allocate(omp_thread_mem_alloc: x) firstprivate(x) /* 
uses_allocators(omp_thread_mem_alloc) */
-   /* { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 } */
+   /* { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 } */
x[0] = 1;
 
   #pragma omp taskloop allocate(omp_thread_mem_alloc: x) firstprivate(x)
-   /* { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 } */
+   /* { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 } */
for (int i = 0; i < 5; i++)
  x[i] = i;
 
   #pragma omp parallel master taskloop simd allocate(omp_thread_mem_alloc: x) 
firstprivate(x)
-   /* { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 } */
+   /* { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 } */
for (int i = 0; i < 5; i++)
  x[i] = i;
 
@@ -53,7 +53,7 @@ test1 ()
   #pragma omp masked
   {
 #pragma omp task allocate(omp_thread_mem_alloc: x) firstprivate(x)
-  /* { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'task' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 } */
+  /* { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'task' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 } */
   x[0] = 1;
   }
 }
--- gcc/testsuite/gfortran.dg/gomp/allocate-clause.f90.jj   2024-10-07 
10:49:58.098362880 +0200
+++ gcc/testsuite/gfortran.dg/gomp/allocate-clause.f90  2025-02-24 
00:00:12.407895839 +0100
@@ -34,18 +34,18 @@ subroutine test1 ()
   !$omp end parallel
 
   !$omp target allocate(omp_thread_mem_alloc: x) firstprivate(x) ! 
uses_allocators(omp_thread_mem_alloc)
-! { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 }
+! { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } 
.-1 }
 x(1) = 1
   !$omp end target
 
   !$omp taskloop allocate(omp_thread_mem_alloc: x) firstprivate(x)
-   ! { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 }
+   ! { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 }
do i = 1, 5
  x(i) = i
end do
 
   !$omp parallel master taskloop simd allocate(omp_thread_mem_alloc: x) 
firstprivate(x)
-   ! { dg-warning "allocator with access trait set to 'thread' results in 
undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* 
} .-1 }
+   ! { dg-warning "allocator with access trait set to 'thread' results in 
undefined behavior for 'taskloop'

[PATCH] reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]

2025-02-24 Thread Jakub Jelinek

Hi!

The following testcase is miscompiled due to a bug in
optimize_range_tests_to_bit_test.  It is trying to optimize
check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges.
Another reassoc optimization folds the the test for the first
two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask
actually has code to virtually undo it and treat that again as test
for a being -34 or -26.  The problem is that optimize_range_tests_to_bit_test
remembers in the type variable TREE_TYPE (ranges[i].exp); from the first
range.  If extract_bit_test_mask doesn't do that virtual undoing of the
BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp.
But if the first range is BIT_AND_EXPR, the type could be different, the
BIT_AND_EXPR form has the optional cast to corresponding unsigned type
in order to avoid introducing UB.  Now, type was used to fill in the
max value if ranges[j].high was missing in subsequently tested range,
and so in this particular testcase the [-4,inf] range which was
signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead.
And we were subtracting values of 2 different types and trying to make
sense out of that.

The following patch fixes this by using the type of the low bound
(which is always non-NULL) for the max value of the high bound instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-02-24  Jakub Jelinek  

PR tree-optimization/118915
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For
highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather
than TYPE_MAX_VALUE (type).

* gcc.c-torture/execute/pr118915.c: New test.

--- gcc/tree-ssa-reassoc.cc.jj  2025-02-13 19:59:56.157572421 +0100
+++ gcc/tree-ssa-reassoc.cc 2025-02-20 22:47:42.646500373 +0100
@@ -3362,7 +3362,7 @@ optimize_range_tests_to_bit_test (enum t
continue;
  highj = ranges[j].high;
  if (highj == NULL_TREE)
-   highj = TYPE_MAX_VALUE (type);
+   highj = TYPE_MAX_VALUE (TREE_TYPE (lowj));
  wide_int mask2;
  exp2 = extract_bit_test_mask (ranges[j].exp, prec, lowi, lowj,
highj, &mask2, NULL);
--- gcc/testsuite/gcc.c-torture/execute/pr118915.c.jj   2025-02-20 
22:46:21.960212578 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr118915.c  2025-02-20 
22:46:01.019137888 +0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/118915 */
+
+int a;
+
+int
+foo (int c, int d, int e, int f)
+{
+  if (!d || !e)
+return -22;
+  if (c > 16)
+return -22;
+  if (!f)
+return -22;
+  return 2;
+}
+
+int
+main ()
+{
+  if (foo (a + 21, a + 6, a + 34, a + 26) != -22)
+__builtin_abort ();
+}
Jakub

[PATCH PING] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

2025-02-24 Thread Uros Bizjak

I would like to ping for the following patch that fixes P1 regression:

gcc/ChangeLog:

* combine.cc (distribute_notes) : Remove
REG_UNUSED note from i2 when the register is also mentioned in i3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118739.c: New test.

https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675585.html

Thanks,
Uros.

Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes

2025-02-24 Thread Richard Biener

On Fri, Feb 21, 2025 at 4:55 PM Konstantinos Eleftheriou
 wrote:
>
> Hi Richard, thanks for the feedback.
>
> On Tue, Feb 18, 2025 at 9:17 PM Richard Biener
>  wrote:
> >
> >
> >
> > > Am 18.02.2025 um 17:04 schrieb Konstantinos Eleftheriou 
> > > :
> > >
> > > From: kelefth 
> > >
> > > The pass rejects the transformation when there are instructions in the
> > > sequence that might throw an exception. This was added due to having
> > > cases that the load instruction contains a REG_EH_REGION note and
> > > moving it before the store instructions caused an error, as it was
> > > no longer the last instruction in the basic block.
> > >
> > > This patch handles those cases by moving a possible REG_EH_REGION
> > > note from the load instruction of the store-load sequence to the
> > > last instruction of the basic block.
> >
> > But that’s not a correct transform and will lead to bogus exception 
> > handling?  You’d need to move the note and split the block, possibly 
> > updating the EH info on the side.
>
> I had originally thought about splitting the block, but tried to get
> away with this solution. In case that we are splitting the block after
> the load, we wouldn't need to also move the note, right?

I don't think it makes much sense in trying to handle loads/stores with EH,
you generally can't re-order stuff around EH since that changes what is
observable from the EH handler and the EH might also guard exceptions
on later stmts.

Richard.

>
> Thanks,
> Konstantinos
>
> > Richard
> >
> > > gcc/ChangeLog:
> > >
> > >* avoid-store-forwarding.cc (process_store_forwarding):
> > >(store_forwarding_analyzer::avoid_store_forwarding):
> > >Move a possible REG_EH_REGION note from the load instruction
> > >to the last instruction of the basic block.
> > > ---
> > > gcc/avoid-store-forwarding.cc | 13 -
> > > 1 file changed, 12 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
> > > index 34a7bba4043..05c91bb1a82 100644
> > > --- a/gcc/avoid-store-forwarding.cc
> > > +++ b/gcc/avoid-store-forwarding.cc
> > > @@ -400,6 +400,17 @@ process_store_forwarding (vec 
> > > &stores, rtx_insn *load_insn,
> > >   if (load_elim)
> > > delete_insn (load_insn);
> > >
> > > +  /* Find possible REG_EH_REGION note in the load instruction and move it
> > > + into the last instruction of the basic block.  */
> > > +  rtx reg_eh_region_note = find_reg_note (load_insn, REG_EH_REGION, 
> > > NULL_RTX);
> > > +  if (reg_eh_region_note != NULL_RTX)
> > > +{
> > > +  remove_note (load_insn, reg_eh_region_note);
> > > +  basic_block load_bb = BLOCK_FOR_INSN (load_insn);
> > > +  add_reg_note (BB_END (load_bb), REG_EH_REGION,
> > > +XEXP (reg_eh_region_note, 0));
> > > +}
> > > +
> > >   return true;
> > > }
> > >
> > > @@ -425,7 +436,7 @@ store_forwarding_analyzer::avoid_store_forwarding 
> > > (basic_block bb)
> > >
> > >   rtx set = single_set (insn);
> > >
> > > -  if (!set || insn_could_throw_p (insn))
> > > +  if (!set)
> > >{
> > >  store_exprs.truncate (0);
> > >  continue;
> > > --
> > > 2.47.0
> > >

Re: [PATCH] Further use of mod_scope in modified_type_die

2025-02-24 Thread Richard Biener

On Thu, Oct 3, 2024 at 6:39 PM Tom Tromey  wrote:
>
> I am working on some changes to GNAT to emit hierarchical DWARF --
> i.e., where entities will have simple names nested in a DW_TAG_module.
>
> While working on this I found a couple of paths in modified_type_die
> where "mod_scope" should be used, but is not.  I suspect these cases
> are only reachable by Ada code, as in both spots (subrange types and
> base types), I believe that other languages don't generally have named
> types in a non-top-level scope, and in these other situations,
> mod_scope will still be correct.
>
> gcc
>
> * dwarf2out.cc (modified_type_die): Use mod_scope for
> ranged types and base types.
>
> Issue: eng/toolchain/gcc#241
> ---
>  gcc/dwarf2out.cc | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> index 38aedb64470..67d2827c279 100644
> --- a/gcc/dwarf2out.cc
> +++ b/gcc/dwarf2out.cc
> @@ -13927,7 +13927,7 @@ modified_type_die (tree type, int cv_quals, bool 
> reverse,
>tree bias = NULL_TREE;
>if (lang_hooks.types.get_type_bias)
> bias = lang_hooks.types.get_type_bias (type);
> -  mod_type_die = subrange_type_die (type, low, high, bias, context_die);
> +  mod_type_die = subrange_type_die (type, low, high, bias, mod_scope);

that looks good.  But why not for the ARRAY_TYPE case dircetly above?

>item_type = TREE_TYPE (type);
>  }
>else if (is_base_type (type))
> @@ -13964,10 +13964,10 @@ modified_type_die (tree type, int cv_quals, bool 
> reverse,
> {
>   dw_die_ref after_die
> = modified_type_die (type, cv_quals, false, context_die);
> - add_child_die_after (comp_unit_die (), mod_type_die, after_die);
> + add_child_die_after (mod_scope, mod_type_die, after_die);

For the next DW_AT_endianity there's an assert for the correct
placement but not here  So I'm not
positive this change is according to the comment.  In fact we're
realing with base-type DIEs here,
and those are usually directly at comp_unit_die (), no?

> }
>else
> -   add_child_die (comp_unit_die (), mod_type_die);
> +   add_child_die (mod_scope, mod_type_die);
>
>add_pubtype (type, mod_type_die);
>  }
> --
> 2.46.2
>

[PATCH] c++: Adjust #embed support for P1967R14

2025-02-24 Thread Jakub Jelinek

Hi!

Now that the #embed paper has been voted in, the following patch
removes the pedwarn for C++26 on it (and adjusts pedwarn warning for
older C++ versions) and predefines __cpp_pp_embed FTM.

I believe we otherwise implement everything in the paper already,
except I'm really confused by the
[Example:

#embed  limit(__has_include("a.h"))

#if __has_embed( limit(__has_include("a.h")))
// ill-formed: __has_include [cpp.cond] cannot appear here
#endif 

— end example]
part.  My reading of both C23 and C++ with the P1967R14 paper in
is that the first case (#embed with __has_include or __has_embed in its
clauses) is what is clearly invalid and so the ill-formed note should be
for #embed.  And the __has_include/__has_embed in __has_embed is actually
questionable.
Both C and C++ have something like
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
or
"The identifiers __has_include and __has_cpp_attribute shall not appear
in any context not mentioned in this subclause."
(into which P1967R14 adds __has_embed) in the conditional inclusion
subclause.
#embed is defined in a different one, so using those in there is invalid
(unless "using the rules specified for conditional inclusion" wording
e.g. in limit clause overrides that).
The reason why I think it is fuzzy for __has_embed is that __has_embed
is actually defined in the Conditional inclusion subclause (so that
would mean one can use __has_include, __has_embed and __has_*attribute
in there) but its clauses are described in a different one.

GCC currently accepts
#embed __FILE__ limit (__has_include ())
#if __has_embed (__FILE__ limit (__has_include ()))
#endif
#embed __FILE__ limit (__has_embed ("a.c"))
#if __has_embed (__FILE__ limit (__has_embed ("a.c")))
#endif
with the exception of __has_embed in #embed which results in a strange
message.
Note, it isn't just about limit clause, but also about
prefix/suffix/if_empty, except that in those cases the "using the rules
specified for conditional inclusion" doesn't apply.

In any case, I'd hope that can be dealt with incrementally (and should
be handled the same for both C and C++).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-02-24  Jakub Jelinek  

libcpp/
* init.cc (lang_defaults): Set embed for GNUCXX26 and CXX26.
* directives.cc (do_embed): Adjust pedwarn wording for embed in C++.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_pp_embed=202502
for C++26.
gcc/testsuite/
* g++.dg/cpp/embed-1.C: Adjust for pedwarn wording change and don't
expect any error for C++26.
* g++.dg/cpp/embed-2.C: Adjust for pedwarn wording change and don't
expect any warning for C++26.
* g++.dg/cpp26/feat-cxx26.C: Test __cpp_pp_embed value.

--- libcpp/init.cc.jj   2025-02-13 19:59:56.204572159 +0100
+++ libcpp/init.cc  2025-02-20 21:28:07.340156536 +0100
@@ -149,8 +149,8 @@ static const struct lang_flags lang_defa
   /* CXX20*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,0,0,0,0,0,0,1,0,0,1 },
   /* GNUCXX23 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 },
   /* CXX23*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 },
-  /* GNUCXX26 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 },
-  /* CXX26*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 },
+  /* GNUCXX26 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,1,0,1 },
+  /* CXX26*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,1,0,1 },
   /* ASM  */ { 0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 }
 };
 
--- libcpp/directives.cc.jj 2025-02-13 19:59:56.202572170 +0100
+++ libcpp/directives.cc2025-02-20 21:40:56.379899457 +0100
@@ -1367,7 +1367,7 @@ do_embed (cpp_reader *pfile)
 {
   if (CPP_OPTION (pfile, cplusplus))
cpp_error (pfile, CPP_DL_PEDWARN,
-  "%<#%s%> is a GCC extension", "embed");
+  "%<#%s%> before C++26 is a GCC extension", "embed");
   else
cpp_error (pfile, CPP_DL_PEDWARN,
   "%<#%s%> before C23 is a GCC extension", "embed");
--- gcc/c-family/c-cppbuiltin.cc.jj 2025-02-13 19:59:55.144578075 +0100
+++ gcc/c-family/c-cppbuiltin.cc2025-02-20 21:34:18.626480792 +0100
@@ -1093,6 +1093,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_deleted_function=202403L");
  cpp_define (pfile, "__cpp_variadic_friend=202403L");
  cpp_define (pfile, "__cpp_pack_indexing=202311L");
+ cpp_define (pfile, "__cpp_pp_embed=202502L");
}
   if (flag_concepts && cxx_dialect > cxx14)
cpp_define (pfile, "__cpp_concepts=202002L");
--- gcc/testsuite/g++.dg/cpp/embed-1.C.jj   2024-10-13 18:47:45.508432900 
+0200
+++ gcc/testsuite/g++.dg/cpp/embed-1.C  2025-02-20 21:38:23.306353505 +0100
@@ -6,9 +6,9 @@
 #endif
 
 int a =
-#embed __FILE__ limit (

Re: [PATCH] RISC-V: Fix .cfi_offset directive when push pop in zc

2025-02-24 Thread Kito Cheng

Pushed with minor coding style fix:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4dcd3c7749734133f7f59509b1a118f3a13de4ee

On Mon, Feb 24, 2025 at 4:00 PM Kito Cheng  wrote:
>
> Thanks, will push after verified on my hand :)
>
> On Mon, Feb 24, 2025 at 3:20 PM Hsing Yu Peng  wrote:
> >
> > Address Fei's comment.
> >
> > Thanks
> > Lino
> >
> > Fei Gao  於 2025年2月24日 週一 下午2:20寫道：
> > >
> > > The case you hit is s2 set in frame mask but s1 not. So you're trying to 
> > > set s1 bit manually.
> > > If so, riscv_adjust_multi_push_cfi_prologue seems the better place for 
> > > your change.
> > >
> > > static rtx
> > > riscv_adjust_multi_push_cfi_prologue (int saved_size)
> > > {
> > >   rtx dwarf = NULL_RTX;
> > >   rtx adjust_sp_rtx, reg, mem, insn;
> > >   unsigned int mask = cfun->machine->frame.mask;
> > >   int offset;
> > >   int saved_cnt = 0;
> > >
> > > -  if (mask & S10_MASK)
> > > -mask |= S11_MASK; /* this corner case is covered in your for loop 
> > > below.  */
> > >
> > > + unsigned num_multi_push = riscv_multi_push_regs_count 
> > > (frame->mask);
> > > + for (int i = 0; i < num_multi_push; i++)
> > > +   {
> > > + gcc_assert (zcmp_save_reg_order[i] != INVALID_REGNUM);
> > > + mask |= 1 << (zcmp_save_reg_order[i] - GP_REG_FIRST); /* 
> > > here mask is used
> > >   
> > >instead of frame->mask*/
> > > +   }
> > > ...
> > >
> > > BR
> > > Fei

Re: [PATCH] reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]

2025-02-24 Thread Richard Biener

On Mon, 24 Feb 2025, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled due to a bug in
> optimize_range_tests_to_bit_test.  It is trying to optimize
> check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges.
> Another reassoc optimization folds the the test for the first
> two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask
> actually has code to virtually undo it and treat that again as test
> for a being -34 or -26.  The problem is that optimize_range_tests_to_bit_test
> remembers in the type variable TREE_TYPE (ranges[i].exp); from the first
> range.  If extract_bit_test_mask doesn't do that virtual undoing of the
> BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp.
> But if the first range is BIT_AND_EXPR, the type could be different, the
> BIT_AND_EXPR form has the optional cast to corresponding unsigned type
> in order to avoid introducing UB.  Now, type was used to fill in the
> max value if ranges[j].high was missing in subsequently tested range,
> and so in this particular testcase the [-4,inf] range which was
> signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead.
> And we were subtracting values of 2 different types and trying to make
> sense out of that.
> 
> The following patch fixes this by using the type of the low bound
> (which is always non-NULL) for the max value of the high bound instead.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2025-02-24  Jakub Jelinek  
> 
>   PR tree-optimization/118915
>   * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For
>   highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather
>   than TYPE_MAX_VALUE (type).
> 
>   * gcc.c-torture/execute/pr118915.c: New test.
> 
> --- gcc/tree-ssa-reassoc.cc.jj2025-02-13 19:59:56.157572421 +0100
> +++ gcc/tree-ssa-reassoc.cc   2025-02-20 22:47:42.646500373 +0100
> @@ -3362,7 +3362,7 @@ optimize_range_tests_to_bit_test (enum t
>   continue;
> highj = ranges[j].high;
> if (highj == NULL_TREE)
> - highj = TYPE_MAX_VALUE (type);
> + highj = TYPE_MAX_VALUE (TREE_TYPE (lowj));
> wide_int mask2;
> exp2 = extract_bit_test_mask (ranges[j].exp, prec, lowi, lowj,
>   highj, &mask2, NULL);
> --- gcc/testsuite/gcc.c-torture/execute/pr118915.c.jj 2025-02-20 
> 22:46:21.960212578 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr118915.c2025-02-20 
> 22:46:01.019137888 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/118915 */
> +
> +int a;
> +
> +int
> +foo (int c, int d, int e, int f)
> +{
> +  if (!d || !e)
> +return -22;
> +  if (c > 16)
> +return -22;
> +  if (!f)
> +return -22;
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo (a + 21, a + 6, a + 34, a + 26) != -22)
> +__builtin_abort ();
> +}
>   Jakub
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

2025-02-24 Thread Richard Biener

On Wed, Feb 12, 2025 at 1:16 PM Uros Bizjak  wrote:
>
> The combine pass is trying to combine:
>
> Trying 16, 22, 21 -> 23:
>16: r104:QI=flags:CCNO>0
>22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>   REG_UNUSED flags:CC
>21: r119:QI=flags:CCNO<=0
>   REG_DEAD flags:CCNO
>23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>   REG_DEAD r120:QI
>   REG_DEAD r119:QI
>   REG_UNUSED flags:CC
>
> and creates the following two insn sequence:
>
> modifying insn i222: r104:QI=flags:CCNO>0
>   REG_DEAD flags:CC
> deferring rescan insn with uid = 22.
> modifying insn i323: r110:QI=flags:CCNO<=0
>   REG_DEAD flags:CC
> deferring rescan insn with uid = 23.
>
> where the REG_DEAD note in i2 is not correct, because the flags
> register is still referenced in i3.  In try_combine() megafunction, we
> have this part:
>
> --cut here--
> /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
> if (i3notes)
>   distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> if (i2notes)
>   distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> if (i1notes)
>   distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
> elim_i2, local_elim_i1, local_elim_i0);
> if (i0notes)
>   distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, local_elim_i0);
> if (midnotes)
>   distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> --cut here--
>
> where the compiler distributes REG_UNUSED note from i2:
>
>22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>   REG_UNUSED flags:CC
>
> via distribute_notes() using the following:
>
> --cut here--
>   /* Otherwise, if this register is now referenced in i2
>  then the register used to be modified in one of the
>  original insns.  If it was i3 (say, in an unused
>  parallel), it's now completely gone, so the note can
>  be discarded.  But if it was modified in i2, i1 or i0
>  and we still reference it in i2, then we're
>  referencing the previous value, and since the
>  register was modified and REG_UNUSED, we know that
>  the previous value is now dead.  So, if we only
>  reference the register in i2, we change the note to
>  REG_DEAD, to reflect the previous value.  However, if
>  we're also setting or clobbering the register as
>  scratch, we know (because the register was not
>  referenced in i3) that it's unused, just as it was
>  unused before, and we place the note in i2.  */
>   if (from_insn != i3 && i2 && INSN_P (i2)
>   && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
> {
>   if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
> PUT_REG_NOTE_KIND (note, REG_DEAD);
>   if (! (REG_P (XEXP (note, 0))
>  ? find_regno_note (i2, REG_NOTE_KIND (note),
> REGNO (XEXP (note, 0)))
>  : find_reg_note (i2, REG_NOTE_KIND (note),
>   XEXP (note, 0
> place = i2;
> }
> --cut here--
>
> However, the flags register is not UNUSED (or DEAD), because it is
> used in i3.  The proposed solution is to remove the REG_UNUSED note
> from i2 when the register is also mentioned in i3.

But there is

  /* Otherwise, if this register is used by I3, then this register
 now dies here, so we must put a REG_DEAD note here unless there
 is one already.  */
  else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
   && ! (REG_P (XEXP (note, 0))
 ? find_regno_note (i3, REG_DEAD,
REGNO (XEXP (note, 0)))
 : find_reg_note (i3, REG_DEAD, XEXP (note, 0
{
  PUT_REG_NOTE_KIND (note, REG_DEAD);
  place = i3;
}

which of course isn't taken as there is a NOTE in i3 already.  Still the
note is supposed to be moved there (just not emitted, it's already there.
So a better fix is to refactor the above to

  /* Otherwise, if this register is used by I3, then this register
 now dies here, so we must put a REG_DEAD note here unless there
 is one already.  */
  else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
 {
 if (! (REG_P (XEXP (note, 0))
 ? find_regno_note (i3, REG_DEAD,
REGNO (XEXP (note, 0)))
 : find_reg_note (i3, REG_DEAD, XEXP (note, 0
   {
 PUT_REG_NOTE_KIND (note, REG_DEAD);
 place = i3;
   }
 }

?  At least the else { case seems to assume the reg isn't refe

Re: Fix 'libstdc++-v3/src/c++20/tzdb.cc' build for 'GTHREADS && !GTHREADS_CXX0X' configurations

2025-02-24 Thread Jonathan Wakely

On Mon, 24 Feb 2025 at 06:23, François Dumont  wrote:
>
>
> On 20/02/2025 18:28, Thomas Schwinge wrote:
> > Hi!
> >
> > On 2025-02-20T16:36:56+, Jonathan Wakely  wrote:
> >> On 20/02/25 15:50 +0100, Thomas Schwinge wrote:
> >> >From 820e015494e25187c9a5ffbd69911ec6ce612789 Mon Sep 17 00:00:00 2001
> >>> From: Jonathan Wakely 
> >>> Date: Thu, 20 Feb 2025 14:08:11 +
> >>> Subject: [PATCH] Fix 'libstdc++-v3/src/c++20/tzdb.cc' build for 
> >>> '__GTHREADS &&
> >>> !__GTHREADS_CXX0X' configurations
> >>>
> >>> libstdc++-v3/
> >>> * src/c++20/tzdb.cc [__GTHREADS && !__GTHREADS_CXX0X]: Use
> >>> '__gnu_cxx::__mutex'.
> >>>
> >>> Co-authored-by: Thomas Schwinge 
> >>> ---
> >>> libstdc++-v3/src/c++20/tzdb.cc | 8 +++-
> >>> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/libstdc++-v3/src/c++20/tzdb.cc 
> >>> b/libstdc++-v3/src/c++20/tzdb.cc
> >>> index 7e8cce7ce8c..70ba7b0ef53 100644
> >>> --- a/libstdc++-v3/src/c++20/tzdb.cc
> >>> +++ b/libstdc++-v3/src/c++20/tzdb.cc
> >>> @@ -35,6 +35,9 @@
> >>> #include  // atomic, atomic
> >>> #include  // atomic>
> >>> #include   // mutex
> >>> +#if defined __GTHREADS && ! defined _GLIBCXX_HAS_GHTREADS
> >>> +# include  // __gnu_cxx::__mutex
> >>> +#endif
> >>> #include  // filesystem::read_symlink
> >>>
> >>> #ifndef _AIX
> >>> @@ -97,11 +100,14 @@ namespace std::chrono
> >>> {
> >>>namespace
> >>>{
> >>> -#if ! USE_ATOMIC_SHARED_PTR
> >>> #ifndef __GTHREADS
> >>>  // Dummy no-op mutex type for single-threaded targets.
> >>>  struct mutex { void lock() { } void unlock() { } };
> >>> +#elif ! defined _GLIBCXX_HAS_GHTREADS
> >> This still has my GHTREADS typo.
> > Eh.  I had fixed that, but apparently lost it.  Re-fixed.
> >
> >> A comment here would be good too:
> >>
> >>   // Use __gnu_cxx::__mutex is std::mutex isn't available.
> > Added.
>
> Minor but comment was added with alas the small typo to say 'if' rather
> than 'is':
>
>   // Use __gnu_cxx::__mutex if std::mutex isn't available.


Ah yes, thanks.

It looks like Thomas hasn't pushed it yet, so I hope he can fix that
before pushing.

[PATCH] tree-optimization/118973 - stray abnormal edge after DCE

2025-02-24 Thread Richard Biener

DCE preserves stmts performing abnormal control flow transfer but
currently has an exception for replaceable allocations and cxa_atexit
calls.  That results in a broken CFG since DCE isn't set up to prune
abnormal edges possibly hanging off those.

While we could try to add this handling, the following is the safe
fix at this point and more suitable for backporting.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/118973
* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Calls
that alter control flow in unpredictable ways need to be
preserved.

* g++.dg/torture/pr118973.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr118973.C | 10 ++
 gcc/tree-ssa-dce.cc |  8 
 2 files changed, 14 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr118973.C

diff --git a/gcc/testsuite/g++.dg/torture/pr118973.C 
b/gcc/testsuite/g++.dg/torture/pr118973.C
new file mode 100644
index 000..8cb9e4b023a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr118973.C
@@ -0,0 +1,10 @@
+// { dg-do compile }
+
+int foo() __attribute__((returns_twice));
+
+void a()
+{
+  int a;
+  if(foo()) new int;
+  &a;
+}
diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index 461283ba858..ba9cd6536ae 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -391,15 +391,15 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool 
aggressive)
   {
gcall *call = as_a  (stmt);
 
-   /* Never elide a noreturn call we pruned control-flow for.  */
-   if ((gimple_call_flags (call) & ECF_NORETURN)
-   && gimple_call_ctrl_altering_p (call))
+   /* Never elide a noreturn call we pruned control-flow for.
+  Same for statements that can alter control flow in unpredictable
+  ways.  */
+   if (gimple_call_ctrl_altering_p (call))
  {
mark_stmt_necessary (call, true);
return;
  }
 
-
if (is_removable_allocation_p (call, false))
  return;
 
-- 
2.43.0

[PATCH v2] RISC-V: Fix a typo in zce to zcf implication

2025-02-24 Thread Yuriy Kolerov

zce must imply zcf but this rule was corrupted after
refactoring in 9e12010b5e724277ea. This may be observed
ater generating an .s file from any source code file with
-mriscv-attribute -march=rv32if_zce -mabi=ilp32 -S
options. A full march will be presented in arch attribute:

rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0

As you see, zcf is not presented here though f_zce pair is
passed in -march. According to The RISC-V Instruction
Set Manual:

Specifying Zce on RV32 with F includes Zca, Zcb, Zcmp,
Zcmt and Zcf.

PR target/118906

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: fix zce to zcf
implication.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-zce-1.c: New test.
* gcc.target/riscv/attribute-zce-2.c: New test.
* gcc.target/riscv/attribute-zce-3.c: New test.
* gcc.target/riscv/attribute-zce-4.c: New test.

Signed-off-by: Yuriy Kolerov 
---
 gcc/common/config/riscv/riscv-common.cc  | 2 +-
 gcc/testsuite/gcc.target/riscv/attribute-zce-1.c | 6 ++
 gcc/testsuite/gcc.target/riscv/attribute-zce-2.c | 6 ++
 gcc/testsuite/gcc.target/riscv/attribute-zce-3.c | 6 ++
 gcc/testsuite/gcc.target/riscv/attribute-zce-4.c | 6 ++
 5 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-4.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 5038f0eb959..b34409adf39 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -213,7 +213,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zcmp", "zca"},
   {"zcmt", "zca"},
   {"zcmt", "zicsr"},
-  {"zcf", "f",
+  {"zce", "zcf",
[] (const riscv_subset_list *subset_list) -> bool
{
  return subset_list->xlen () == 32 && subset_list->lookup ("f");
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c 
b/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c
new file mode 100644
index 000..e477414d4d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mriscv-attribute -march=rv32i_zce -mabi=ilp32" } */
+
+void foo(){}
+
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c 
b/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c
new file mode 100644
index 000..7008eb5ea1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mriscv-attribute -march=rv32if_zce -mabi=ilp32" } */
+
+void foo(){}
+
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcf1p0_zcmp1p0_zcmt1p0\"" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c 
b/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c
new file mode 100644
index 000..89ebaaf4063
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mriscv-attribute -march=rv64i_zce -mabi=lp64" } */
+
+void foo(){}
+
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv64i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c 
b/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c
new file mode 100644
index 000..cacbcaac35f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mriscv-attribute -march=rv64if_zce -mabi=lp64" } */
+
+void foo(){}
+
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv64i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */
-- 
2.34.1

[PATCH v3 0/3] [AArch64, OpenMP] Support SVE types with various OpenMP clauses and constructs

2025-02-24 Thread Tejas Belagod

This series is based on a previous thread and review comments from RichardS and 
Jakub upstream:

https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674698.html
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674219.html
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674434.html

This series removes all the tests earlier patches covered as per review comments
and only includes the code changes.  I'm reworking through all the tests, so 
are 
not included in this series - I will send a separate thread once I incorporate
all the changes Jakub has requested for.

This patch also includes Richard's patch that the rest of the series is based on

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606741.html

Patches have been rebased, retested and bootstrapped on trunk. OK for trunk?

Richard Sandiford (1):
  gomp: Various fixes for SVE types [PR101018]

Tejas Belagod (2):
  Add function to strip pointer type and get down to the actual pointee
type.
  AArch64: Diagnose OpenMP offloading when SVE types involved.

 gcc/config/aarch64/aarch64-sve-builtins.cc | 34 +++-
 gcc/fold-const.cc  |  7 +++
 gcc/gimplify.cc| 62 ++
 gcc/omp-low.cc |  2 +-
 gcc/poly-int.h | 19 +++
 gcc/target.h   | 37 -
 gcc/tree.h |  9 
 7 files changed, 156 insertions(+), 14 deletions(-)

-- 
2.25.1

[PATCH v3 2/3] Add function to strip pointer type and get down to the actual pointee type.

2025-02-24 Thread Tejas Belagod

Add a function to traverse down the pointer layers to the pointee type.

gcc/ChangeLog:
* tree.h (strip_pointer_types): New.
---
 gcc/tree.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/tree.h b/gcc/tree.h
index 21f3cd5525c..580997b4e5d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5048,6 +5048,15 @@ strip_array_types (tree type)
   return type;
 }
 
+inline const_tree
+strip_pointer_types (const_tree type)
+{
+  while (POINTER_TYPE_P (type))
+type = TREE_TYPE (type);
+
+  return type;
+}
+
 /* Desription of the reason why the argument of valid_constant_size_p
is not a valid size.  */
 enum cst_size_error {
-- 
2.25.1

[PATCH v3 3/3] AArch64: Diagnose OpenMP offloading when SVE types involved.

2025-02-24 Thread Tejas Belagod

The target clause in OpenMP is used to offload loop kernels to accelarator
peripeherals.  target's 'map' clause is used to move data from and to the
accelarator.  When the data is SVE type, it may not be suitable because of
various reasons i.e. the two SVE targets may not agree on vector size or
some targets don't support variable vector size.  This makes SVE unsuitable
for use in OMP's 'map' clause.  This patch diagnoses all such cases and issues
an error where SVE types are not suitable.

Co-authored-by: Andrea Corallo 

gcc/ChangeLog:

* target.h (type_context_kind): Add new context kinds for target 
clauses.
(omp_type_context): Query if the context is of OMP kind.
* config/aarch64/aarch64-sve-builtins.cc (verify_type_context): Diagnose
SVE types for a given OpenMP context.
(omp_type_context): New.
* gimplify.cc (omp_notice_variable): Diagnose implicitly-mapped SVE
objects in OpenMP regions.
(gimplify_scan_omp_clauses): Diagnose SVE types for various target
clauses.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 34 -
 gcc/gimplify.cc| 43 +-
 gcc/target.h   | 37 ++-
 3 files changed, 111 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 5d2062726d6..e145689c355 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -5182,7 +5182,11 @@ bool
 verify_type_context (location_t loc, type_context_kind context,
 const_tree type, bool silent_p)
 {
-  if (!sizeless_type_p (type))
+  const_tree tmp = type;
+  if (omp_type_context (context) && POINTER_TYPE_P (type))
+tmp = strip_pointer_types (tmp);
+
+  if (!sizeless_type_p (tmp))
 return true;
 
   switch (context)
@@ -5242,6 +5246,34 @@ verify_type_context (location_t loc, type_context_kind 
context,
   if (!silent_p)
error_at (loc, "capture by copy of SVE type %qT", type);
   return false;
+
+case TCTX_OMP_MAP:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in map clause", type);
+  return false;
+
+case TCTX_OMP_MAP_IMP_REF:
+  if (!silent_p)
+   error ("cannot reference %qT object types in target region", type);
+  return false;
+
+case TCTX_OMP_PRIVATE:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in target private clause", 
type);
+  return false;
+
+case TCTX_OMP_FIRSTPRIVATE:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in target firstprivate 
clause", type);
+  return false;
+
+case TCTX_OMP_DEVICE_ADDR:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in target device clauses", 
type);
+  return false;
+
+default:
+  break;
 }
   gcc_unreachable ();
 }
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index a03ca8cf4ee..47635bb3119 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9081,11 +9081,20 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
  | GOVD_MAP_ALLOC_ONLY)) == flags)
{
  tree type = TREE_TYPE (decl);
+ location_t loc = DECL_SOURCE_LOCATION (decl);
 
  if (gimplify_omp_ctxp->target_firstprivatize_array_bases
  && omp_privatize_by_reference (decl))
type = TREE_TYPE (type);
- if (!omp_mappable_type (type))
+
+ if (!verify_type_context (loc, TCTX_OMP_MAP_IMP_REF, type))
+   {
+ /* Check if TYPE can appear in a target region.
+verify_type_context has already issued an error if it
+can't.  */
+ nflags |= GOVD_MAP | GOVD_EXPLICIT;
+   }
+ else if (!omp_mappable_type (type))
{
  error ("%qD referenced in target region does not have "
 "a mappable type", decl);
@@ -12815,6 +12824,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
   unsigned int flags;
   tree decl;
   auto_vec addr_tokens;
+  tree op = NULL_TREE;
+  location_t loc = OMP_CLAUSE_LOCATION (c);
 
   if (grp_end && c == OMP_CLAUSE_CHAIN (grp_end))
{
@@ -12822,6 +12833,36 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  grp_end = NULL_TREE;
}
 
+  if (code == OMP_TARGET
+ || code == OMP_TARGET_DATA
+ || code == OMP_TARGET_ENTER_DATA
+ || code == OMP_TARGET_EXIT_DATA)
+   /* Do some target-specific type checks for map operands.  */
+   switch (OMP_CLAUSE_CODE (c))
+ {
+ case OMP_CLAUSE_MAP:
+   op = OMP_CLAUSE_OPERAND (c, 0);
+   verify_type_context (loc, TCTX_OMP_MAP, TREE_TYPE (op));
+   break;
+

[PATCH] c++: Fix range for with PMFs [PR118923]

2025-02-24 Thread Jakub Jelinek

Hi!

The following testcases segfault because the new range for -frange-for-ext-temps
temporary extension extends even the internal TARGET_EXPRs created by
get_member_function_from_ptrfunc.

The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P.
I'm not calling get_internal_target_expr because I really want to force
the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR
internal, and get_internal_target_expr doesn't take complain argument
either and always uses tf_warning_or_error.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-02-24  Jakub Jelinek  

PR c++/118923
* typeck.cc (get_member_function_from_ptrfunc): Set
TARGET_EXPR_INTERNAL_P flag on force_target_expr returned
TARGET_EXPRs.

* g++.dg/cpp0x/pr118923.C: New test.
* g++.dg/cpp1y/pr118923.C: New test.

--- gcc/cp/typeck.cc.jj 2025-02-13 19:59:55.463576294 +0100
+++ gcc/cp/typeck.cc2025-02-20 21:13:51.420841171 +0100
@@ -4218,17 +4218,23 @@ get_member_function_from_ptrfunc (tree *
  || (!nonvirtual
  && !DECL_P (instance_ptr)
  && !TREE_CONSTANT (instance_ptr)))
-   instance_ptr = instance_save_expr
- = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr,
-  complain);
+   {
+ instance_ptr = instance_save_expr
+   = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr,
+complain);
+ TARGET_EXPR_INTERNAL_P (instance_ptr) = true;
+   }
 
   /* See above comment.  */
   if (TREE_SIDE_EFFECTS (function)
  || (!nonvirtual
  && !DECL_P (function)
  && !TREE_CONSTANT (function)))
-   function
- = force_target_expr (TREE_TYPE (function), function, complain);
+   {
+ function
+   = force_target_expr (TREE_TYPE (function), function, complain);
+ TARGET_EXPR_INTERNAL_P (function) = true;
+   }
 
   /* Start by extracting all the information from the PMF itself.  */
   e3 = pfn_from_ptrmemfunc (function);
--- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 11:37:54.638601995 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr118923.C   2025-02-24 11:39:33.940212633 
+0100
@@ -0,0 +1,66 @@
+// PR c++/118923
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-frange-for-ext-temps" { target c++23 } }
+// { dg-additional-options "-fno-range-for-ext-temps" { target c++20_down } }
+
+int g;
+
+struct A {
+  int a[3];
+  A (int x, int y, int z) : a{x, y, z} { if ((g++ & 7) != 4) __builtin_abort 
(); }
+  A (const A &x) = delete;
+  ~A () { if ((g++ & 7) != 7 - 2 * (__cpp_range_based_for >= 202211)) 
__builtin_abort (); }
+  int *begin () { return a; }
+  int *end () { return a + 3; }
+};
+
+struct B {
+  B () { if ((g++ & 7) != 3) __builtin_abort (); }
+  B (const B &) = delete;
+  ~B () { if ((g++ & 7) != 5 + (__cpp_range_based_for >= 202211)) 
__builtin_abort (); }
+};
+
+struct C {
+  A foo (const B &) { return { 1, 2, 3 }; }
+  A bar (const B &) { return { 4, 5, 6 }; }
+  bool baz () { return b; }
+  bool b = false;
+  static C c;
+};
+
+C C::c;
+
+struct D {
+  D () { if ((g++ & 5) != 0) __builtin_abort (); }
+  D (const D &) = delete;
+  ~D () { if ((g & 7) != 1 && (g & 7) != 6 + (__cpp_range_based_for >= 
202211)) __builtin_abort (); g++; }
+};
+
+inline C *
+qux (const D &)
+{
+  return &C::c;
+}
+
+void 
+foo ()
+{
+  int z = 1;
+  auto d = qux (D {})->baz () ? &C::bar : &C::foo;
+  for (const int &r : (qux (D {})->*d) (B {}))
+if (z++ != r)
+  __builtin_abort ();
+  C::c.b = true;
+  d = qux (D {})->baz () ? &C::bar : &C::foo;
+  for (const int &r : (qux (D {})->*d) (B {}))
+if (z++ != r)
+  __builtin_abort ();
+}
+
+int
+main ()
+{
+  foo ();
+  if (g != 16)
+__builtin_abort ();
+}
--- gcc/testsuite/g++.dg/cpp1y/pr118923.C.jj2025-02-20 21:17:37.852648770 
+0100
+++ gcc/testsuite/g++.dg/cpp1y/pr118923.C   2025-02-20 21:19:46.662108221 
+0100
@@ -0,0 +1,38 @@
+// PR c++/118923
+// { dg-do run { target c++14 } }
+
+struct A {
+  int a[3] = { 0, 0, 0 };
+  int *begin () { return a; }
+  int *end () { return a + 3; }
+};
+
+struct B {
+  A foo () { return { 1, 2, 3 }; }
+  A bar () { return { 1, 2, 3 }; }
+  bool baz () { return b; }
+  bool b = false;
+  static B c;
+};
+
+B B::c;
+
+inline B *
+qux ()
+{
+  return &B::c;
+}
+
+void 
+foo ()
+{
+  auto d = qux ()->baz () ? &B::foo : &B::bar;
+  for (const int &r : (qux ()->*d) ())
+;
+}
+
+int
+main ()
+{
+  foo ();
+}

Jakub

[PATCH v3 1/3] gomp: Various fixes for SVE types [PR101018]

2025-02-24 Thread Tejas Belagod

From: Richard Sandiford 

Various parts of the omp code checked whether the size of a decl
was an INTEGER_CST in order to determine whether the decl was
variable-sized or not.  If it was variable-sized, it was expected
to have a DECL_VALUE_EXPR replacement, as for VLAs.

This patch uses poly_int_tree_p instead, so that variable-length
SVE vectors are treated like constant-length vectors.  This means
that some structures become poly_int-sized, with some fields at
poly_int offsets, but we already have code to handle that.

An alternative would have been to handle the data via indirection
instead.  However, that's likely to be more complicated, and it
would contradict is_variable_sized, which already uses a check
for TREE_CONSTANT rather than INTEGER_CST.

gimple_add_tmp_var should probably not add a safelen of 1
for SVE vectors, but that's really a separate thing and might
be hard to test.

Co-authored-by: Tejas Belagod 

gcc/
PR middle-end/101018
* poly-int.h (can_and_p): New function.
* fold-const.cc (poly_int_binop): Use it to optimize BIT_AND_EXPRs
involving POLY_INT_CSTs.
* gimplify.cc (gimplify_bind_expr): Use poly_int_tree_p instead
of INTEGER_CST when checking for constant-sized omp data.
(omp_add_variable): Likewise.
(omp_notice_variable): Likewise.
(gimplify_adjust_omp_clauses_1): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
(lower_omp_target): Likewise.
---
 gcc/fold-const.cc |  7 +++
 gcc/gimplify.cc   | 19 +--
 gcc/omp-low.cc|  2 +-
 gcc/poly-int.h| 19 +++
 4 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index f9f7f4d2f91..072a0b62e13 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -1284,6 +1284,13 @@ poly_int_binop (poly_wide_int &res, enum tree_code code,
return false;
   break;
 
+case BIT_AND_EXPR:
+  if (TREE_CODE (arg2) != INTEGER_CST
+ || !can_and_p (wi::to_poly_wide (arg1), wi::to_wide (arg2),
+&res))
+   return false;
+  break;
+
 case BIT_IOR_EXPR:
   if (TREE_CODE (arg2) != INTEGER_CST
  || !can_ior_p (wi::to_poly_wide (arg1), wi::to_wide (arg2),
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index cc0172cf96e..a03ca8cf4ee 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9160,7 +9160,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
   && (flags & (GOVD_SEEN | GOVD_LOCAL)) == GOVD_SEEN
   && DECL_SIZE (decl))
 {
-  if (TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+  tree size;
+  if (!poly_int_tree_p (DECL_SIZE (decl)))
{
  splay_tree_node n2;
  tree t = DECL_VALUE_EXPR (decl);
@@ -9171,16 +9172,14 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
  n2->value |= GOVD_SEEN;
}
   else if (omp_privatize_by_reference (decl)
-  && TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)))
-  && (TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl
-  != INTEGER_CST))
+  && (size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl
+  && !poly_int_tree_p (size))
{
  splay_tree_node n2;
- tree t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)));
- gcc_assert (DECL_P (t));
- n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) t);
+ gcc_assert (DECL_P (size));
+ n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) size);
  if (n2)
-   omp_notice_variable (ctx, t, true);
+   omp_notice_variable (ctx, size, true);
}
 }
 
@@ -14437,7 +14436,7 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void 
*data)
   if ((gimplify_omp_ctxp->region_type & ORT_ACC) == 0)
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (clause) = 1;
   if (DECL_SIZE (decl)
- && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+ && !poly_int_tree_p (DECL_SIZE (decl)))
{
  tree decl2 = DECL_VALUE_EXPR (decl);
  gcc_assert (INDIRECT_REF_P (decl2));
@@ -15178,7 +15177,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, 
gimple_seq body, tree *list_p,
  if (!DECL_P (decl))
break;
  if (DECL_SIZE (decl)
- && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+ && !poly_int_tree_p (DECL_SIZE (decl)))
{
  tree decl2 = DECL_VALUE_EXPR (decl);
  gcc_assert (INDIRECT_REF_P (decl2));
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index c36ae38cf8e..00bcf27fa47 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -1461,7 +1461,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
  else
install_var_field (decl, false, 11, ctx);
  if (DECL_SIZE (decl)
- && TREE_CODE (

Re: [PATCH] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

2025-02-24 Thread Martin Jambor

Hello,

and ping please.

Thanks,

Martin


On Mon, Feb 10 2025, Martin Jambor wrote:
> Hello, 
>
> among other things, IPA-SRA checks whether splitting out a bit of an
> aggregate or something passed by reference would lead into a clash
> with an already known IPA-CP constant a way which would cause problems
> later on.  Unfortunately the test is done only in
> adjust_parameter_descriptions and is missing when accesses are
> propagated from callees to callers, which leads to miscompilation
> reported as PR 118243 (where the callee is a function created by
> ipa-split).
>
> The matter is then further complicated by the fact that we consider
> complex numbers as scalars even though they can be modified piecemeal
> (IPA-CP can detect and propagate the pieces separately too) which then
> confuses the parameter manipulation machinery furter.
>
> This patch simply adds the missing check to avoid the IPA-SRA
> transform in these cases too, which should be suitable for backporting
> to all affected release branches.  It is a bit of a shame as in the PR
> testcase we do propagate both components of the complex number in
> question and the transformation phase could recover.  I have some
> prototype patches in this direction but that is something for (a)
> stage 1.
>
> Bootstrapped and tested on x86_64-linux.  OK for master and (assuming it
> applies cleanly and passes the checks there too) to all active release
> branches?
>
> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2025-02-10  Martin Jambor  
>
>   PR ipa/118243
>   * ipa-sra.cc (pull_accesses_from_callee): New parameters
>   caller_ipcp_ts and param_idx.  Check that scalar pulled accesses would
>   not clash with a known IPA-CP aggregate constant.
>   (param_splitting_across_edge): Pass IPA-CP transformation summary and
>   caller parameter index to pull_accesses_from_callee.
>
> gcc/testsuite/ChangeLog:
>
> 2025-02-10  Martin Jambor  
>
>   PR ipa/118243
>   * g++.dg/ipa/pr118243.C: New test.
> ---
>  gcc/ipa-sra.cc  | 38 +++
>  gcc/testsuite/g++.dg/ipa/pr118243.C | 40 +
>  2 files changed, 68 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr118243.C
>
> diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
> index ad80d22f8ce..5d1703ed394 100644
> --- a/gcc/ipa-sra.cc
> +++ b/gcc/ipa-sra.cc
> @@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, 
> ACC_PROP_CERTAIN};
>  
>  /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
> (which belongs to CALLER) if they would not violate some constraint there.
> -   If successful, return NULL, otherwise return the string reason for failure
> -   (which can be written to the dump file).  DELTA_OFFSET is the known offset
> -   of the actual argument withing the formal parameter (so of ARG_DESCS 
> within
> -   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
> -   known. In case of success, set *CHANGE_P to true if propagation actually
> -   changed anything.  */
> +   CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the 
> parameter
> +   described by PARAM_DESC.  If successful, return NULL, otherwise return the
> +   string reason for failure (which can be written to the dump file).
> +   DELTA_OFFSET is the known offset of the actual argument withing the formal
> +   parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of 
> the
> +   actual argument or zero, if not known. In case of success, set *CHANGE_P 
> to
> +   true if propagation actually changed anything.  */
>  
>  static const char *
> -pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
> +pull_accesses_from_callee (cgraph_node *caller,
> +ipcp_transformation *caller_ipcp_ts,
> +int param_idx,
> +isra_param_desc *param_desc,
>  isra_param_desc *arg_desc,
>  unsigned delta_offset, unsigned arg_size,
>  bool *change_p)
> @@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, 
> isra_param_desc *param_desc,
>   continue;
>  
>unsigned offset = argacc->unit_offset + delta_offset;
> +
> +  if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type))
> + {
> +   ipa_argagg_value_list avl (caller_ipcp_ts);
> +   tree value = avl.get_value (param_idx, offset);
> +   if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value)))
> +  / BITS_PER_UNIT)
> + != argacc->unit_size))
> + return " propagated access would conflict with an IPA-CP constant";
> + }
> +
>/* Given that accesses are initially stored according to increasing
>offset and decreasing size in case of equal offsets, the following
>searches could be written more efficiently if we kept the ordering

[PATCH] vect: Use original LHS type for gather pattern [PR118950].

2025-02-24 Thread Robin Dapp

Hi,

in PR118950 we do not zero masked elements in a gather load.
While recognizing a gather/scatter pattern we do not use the original
type of the LHS.  This matters because the type can differ with bool
patterns (e.g. _Bool vs unsigned char) and we don't notice the need
for zeroing out the padding bytes.

This patch just uses the original LHS's type.

Bootstrapped and regtested on x86, aarch64, and power10.
Regtested on rv64gcv_zvl512b.


PR middle-end/118950

gcc/ChangeLog:

* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Use
original LHS's type.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr118950.c: New test.
---
 .../gcc.target/riscv/rvv/autovec/pr118950.c   | 29 +++
 gcc/tree-vect-patterns.cc |  3 +-
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c
new file mode 100644
index 000..604d4264eac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-std=gnu99 -Wno-pedantic" } */
+
+unsigned char a;
+long long r;
+_Bool h = 1;
+short j[23];
+_Bool k[3][23];
+
+void b(_Bool h, short j[], _Bool k[][23]) {
+  for (int m = 0; m < 23; m += 3)
+for (short n = 0; n < 22; n += 4)
+  a = ({
+unsigned char o = a;
+unsigned char p = j[n] ? h : k[m][n];
+o > p ? o : p;
+  });
+}
+
+int main() {
+  for (int m = 0; m < 23; ++m)
+j[m] = 10;
+  b(h, j, k);
+  r = a;
+  if (r != 1)
+__builtin_abort ();
+}
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 6fc97d1b6ef..4f0a7ea162b 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -6022,7 +6022,8 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo,
   else
pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base,
   offset, scale, zero);
-  tree load_lhs = vect_recog_temp_ssa_var (gs_info.element_type, NULL);
+  tree lhs = gimple_get_lhs (stmt_info->stmt);
+  tree load_lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
   gimple_call_set_lhs (pattern_stmt, load_lhs);
 }
   else
-- 
2.48.1


-- 
Regards
 Robin

[PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516].

2025-02-24 Thread Robin Dapp

Hi,

when scanning for program points, i.e. vector statements, we're missing
pattern statements.  In PR114516 this becomes obvious as we choose
LMUL=8 assuming there are only three statements but the divmod pattern
adds another three.  Those push us beyond four registers so we need to
switch to LMUL=4.

This patch adds pattern statements to the program points which helps
calculate a better register pressure estimate.

Regtested on rv64gcv_zvl512b.

PR target/114516

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (compute_estimated_lmul):
Add pattern statements to program points.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test.
---
 gcc/config/riscv/riscv-vector-costs.cc| 29 +++
 .../vect/costmodel/riscv/rvv/pr114516.c   | 29 +++
 2 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c

diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index 5149751ae76..8aad05c6180 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -215,6 +215,35 @@ compute_local_program_points (
 "program point %d: %G", info.point,
 gsi_stmt (si));
}
+
+ /* If the statement is part of a pattern, also add the other
+pattern statements.  */
+ gimple_seq pattern_def_seq;
+ if (STMT_VINFO_IN_PATTERN_P (stmt_info)
+ && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info)))
+   {
+ gimple_stmt_iterator si2;
+
+ for (si2 = gsi_start (pattern_def_seq);
+  !gsi_end_p (si2);
+  gsi_next (&si2))
+   {
+ stmt_vec_info pattern_def_stmt_info
+   = vinfo->lookup_stmt (gsi_stmt (si2));
+ if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
+ || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
+   {
+ stmt_point info = {point, gsi_stmt (si2),
+ pattern_def_stmt_info};
+ program_points.safe_push (info);
+ point++;
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"program point %d: %G",
+info.point, gsi_stmt (si2));
+   }
+   }
+   }
}
  program_points_per_bb.put (bb, program_points);
}
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c
new file mode 100644
index 000..5a514d1bbf6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zba_zbb -mabi=lp64d -mrvv-max-lmul=dynamic -O3 
-fdump-tree-vect-details -Wno-pedantic" } */
+
+typedef float real_t;
+__attribute__((aligned(64))) real_t a[32000];
+real_t s315()
+{
+for (int i = 0; i < 32000; i++)
+a[i] = (i * 7) % 32000;
+real_t x, chksum;
+int index;
+for (int nl = 0; nl < 256; nl++) {
+x = a[0];
+index = 0;
+for (int i = 0; i < 32000; ++i) {
+if (a[i] > x) {
+x = a[i];
+index = i;
+}
+}
+chksum = x + (real_t) index;
+}
+return index + x + 1;
+}
+
+/* { dg-final { scan-assembler {e32,m4} } } */
+/* { dg-final { scan-assembler-not {e32,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Preferring smaller LMUL loop because it 
has unexpected spills" 1 "vect" } } */
-- 
2.48.1


-- 
Regards
 Robin

RE: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-24 Thread Li, Pan2

Thanks Robin for comments.

> I agree with the general idea but I'm a bit wary fiddling with the 
> coefficients
> directly.  I think for a fixed-size, non VLA vector it should be sufficient to
> check whether the last step overflows.

Initial idea I would like to take care of VLS and VLA in the same chunk of 
code, given
the expand_const_vector is sort of complicated currently(I don't want to make 
it worse by add more if VLA, else if VLS ... etc).

I will try to leverage poly_int mult, shift, and compare if possible, instead 
of touch the coefficients directly in v2.

> For VLA we actually know the largest possible runtime vector length and could
> use it the check?  But maybe it's probably easier to bail with VLA anyway.
> In addition: if we're already checking the step anyway we could get rid of the
> negative step restriction as well.

I don't explore more cases here consider we are in stage 4. I think the 
expand_const_vector need some
refactor up to a point. 

Pan

-Original Message-
From: Robin Dapp  
Sent: Monday, February 24, 2025 4:29 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin 
Dapp 
Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave 
[PR118931]

> This patch would like to perform the overflow to smode check before IOR
> the base2 series, and perform the clean highest bit if the const_vector
> overflow to smode occurs.  If no overflow, there will do nothing here.

I agree with the general idea but I'm a bit wary fiddling with the coefficients
directly.  I think for a fixed-size, non VLA vector it should be sufficient to
check whether the last step overflows.

For VLA we actually know the largest possible runtime vector length and could
use it the check?  But maybe it's probably easier to bail with VLA anyway.

In addition: if we're already checking the step anyway we could get rid of the
negative step restriction as well.

-- 
Regards
 Robin

Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-24 Thread Robin Dapp

> I don't explore more cases here consider we are in stage 4. I think the
> expand_const_vector need some refactor up to a point.

I added the negative step check just some weeks ago and I'd see it as
simplification to remove the restriction again if you're touching the actual
step anyway.  So I wouldn't worry about stage 4 in that regard.

-- 
Regards
 Robin

RE: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-24 Thread Li, Pan2

> I added the negative step check just some weeks ago and I'd see it as
> simplification to remove the restriction again if you're touching the actual
> step anyway.  So I wouldn't worry about stage 4 in that regard.

Oh, I see. I'll have a try after this bug fix.

Pan 

-Original Message-
From: Robin Dapp  
Sent: Monday, February 24, 2025 7:44 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin 
Dapp 
Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave 
[PR118931]

> I don't explore more cases here consider we are in stage 4. I think the
> expand_const_vector need some refactor up to a point.

I added the negative step check just some weeks ago and I'd see it as
simplification to remove the restriction again if you're touching the actual
step anyway.  So I wouldn't worry about stage 4 in that regard.

-- 
Regards
 Robin

Re: [PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516].

2025-02-24 Thread 钟居哲

LGTM



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2025-02-24 19:14
To: gcc-patches
CC: pal...@dabbelt.com; kito.ch...@gmail.com; juzhe.zh...@rivai.ai; 
jeffreya...@gmail.com; pan2...@intel.com; rdapp@gmail.com
Subject: [PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation 
[PR114516].
Hi,
 
when scanning for program points, i.e. vector statements, we're missing
pattern statements.  In PR114516 this becomes obvious as we choose
LMUL=8 assuming there are only three statements but the divmod pattern
adds another three.  Those push us beyond four registers so we need to
switch to LMUL=4.
 
This patch adds pattern statements to the program points which helps
calculate a better register pressure estimate.
 
Regtested on rv64gcv_zvl512b.
 
PR target/114516
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-costs.cc (compute_estimated_lmul):
Add pattern statements to program points.
 
gcc/testsuite/ChangeLog:
 
* gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test.
---
gcc/config/riscv/riscv-vector-costs.cc| 29 +++
.../vect/costmodel/riscv/rvv/pr114516.c   | 29 +++
2 files changed, 58 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c
 
diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index 5149751ae76..8aad05c6180 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -215,6 +215,35 @@ compute_local_program_points (
 "program point %d: %G", info.point,
 gsi_stmt (si));
}
+
+   /* If the statement is part of a pattern, also add the other
+ pattern statements.  */
+   gimple_seq pattern_def_seq;
+   if (STMT_VINFO_IN_PATTERN_P (stmt_info)
+   && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info)))
+ {
+   gimple_stmt_iterator si2;
+
+   for (si2 = gsi_start (pattern_def_seq);
+!gsi_end_p (si2);
+gsi_next (&si2))
+ {
+   stmt_vec_info pattern_def_stmt_info
+ = vinfo->lookup_stmt (gsi_stmt (si2));
+   if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
+   || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
+ {
+   stmt_point info = {point, gsi_stmt (si2),
+   pattern_def_stmt_info};
+   program_points.safe_push (info);
+   point++;
+   if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+  "program point %d: %G",
+  info.point, gsi_stmt (si2));
+ }
+ }
+ }
}
  program_points_per_bb.put (bb, program_points);
}
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c
new file mode 100644
index 000..5a514d1bbf6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zba_zbb -mabi=lp64d -mrvv-max-lmul=dynamic -O3 
-fdump-tree-vect-details -Wno-pedantic" } */
+
+typedef float real_t;
+__attribute__((aligned(64))) real_t a[32000];
+real_t s315()
+{
+for (int i = 0; i < 32000; i++)
+a[i] = (i * 7) % 32000;
+real_t x, chksum;
+int index;
+for (int nl = 0; nl < 256; nl++) {
+x = a[0];
+index = 0;
+for (int i = 0; i < 32000; ++i) {
+if (a[i] > x) {
+x = a[i];
+index = i;
+}
+}
+chksum = x + (real_t) index;
+}
+return index + x + 1;
+}
+
+/* { dg-final { scan-assembler {e32,m4} } } */
+/* { dg-final { scan-assembler-not {e32,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Preferring smaller LMUL loop because it 
has unexpected spills" 1 "vect" } } */
-- 
2.48.1
 
 
-- 
Regards
Robin

Re: [PATCH] vect: Use original LHS type for gather pattern [PR118950].

2025-02-24 Thread Richard Biener

On Mon, 24 Feb 2025, Robin Dapp wrote:

> Hi,
> 
> in PR118950 we do not zero masked elements in a gather load.
> While recognizing a gather/scatter pattern we do not use the original
> type of the LHS.  This matters because the type can differ with bool
> patterns (e.g. _Bool vs unsigned char) and we don't notice the need
> for zeroing out the padding bytes.
> 
> This patch just uses the original LHS's type.
> 
> Bootstrapped and regtested on x86, aarch64, and power10.
> Regtested on rv64gcv_zvl512b.

OK.

Thanks,
Richard.

> 
>   PR middle-end/118950
> 
> gcc/ChangeLog:
> 
>   * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Use
>   original LHS's type.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/autovec/pr118950.c: New test.
> ---
>  .../gcc.target/riscv/rvv/autovec/pr118950.c   | 29 +++
>  gcc/tree-vect-patterns.cc |  3 +-
>  2 files changed, 31 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c
> new file mode 100644
> index 000..604d4264eac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c
> @@ -0,0 +1,29 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target riscv_v_ok } */
> +/* { dg-add-options riscv_v } */
> +/* { dg-additional-options "-std=gnu99 -Wno-pedantic" } */
> +
> +unsigned char a;
> +long long r;
> +_Bool h = 1;
> +short j[23];
> +_Bool k[3][23];
> +
> +void b(_Bool h, short j[], _Bool k[][23]) {
> +  for (int m = 0; m < 23; m += 3)
> +for (short n = 0; n < 22; n += 4)
> +  a = ({
> +unsigned char o = a;
> +unsigned char p = j[n] ? h : k[m][n];
> +o > p ? o : p;
> +  });
> +}
> +
> +int main() {
> +  for (int m = 0; m < 23; ++m)
> +j[m] = 10;
> +  b(h, j, k);
> +  r = a;
> +  if (r != 1)
> +__builtin_abort ();
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 6fc97d1b6ef..4f0a7ea162b 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -6022,7 +6022,8 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo,
>else
>   pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base,
>  offset, scale, zero);
> -  tree load_lhs = vect_recog_temp_ssa_var (gs_info.element_type, NULL);
> +  tree lhs = gimple_get_lhs (stmt_info->stmt);
> +  tree load_lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
>gimple_call_set_lhs (pattern_stmt, load_lhs);
>  }
>else
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PING] [PATCH v3] get source line for diagnostic from preprocessed file [PR preprocessor/79106]

2025-02-24 Thread Bader, Lucas

Gentle ping for 
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675450.html

Re: [PATCH] c++: Fix range for with PMFs [PR118923]

2025-02-24 Thread Jason Merrill


On 2/24/25 5:52 AM, Jakub Jelinek wrote:

Hi!

The following testcases segfault because the new range for -frange-for-ext-temps
temporary extension extends even the internal TARGET_EXPRs created by
get_member_function_from_ptrfunc.

The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P.
I'm not calling get_internal_target_expr because I really want to force
the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR
internal,


Hmm, good point, but that seems like a reason to use force_target_expr 
in gen_internal_target_expr.



and get_internal_target_expr doesn't take complain argument
either and always uses tf_warning_or_error.


That shouldn't be an issue for any internal temporaries; complain is 
only relevant for building the cleanup that internal temps must not have.


It looks like there are a few more internal uses of force_target_expr in 
cp_finish_decl and build_comparison_op.


Jason

Re: The COBOL front end, version 3, now in 14 easy pieces

2025-02-24 Thread Richard Biener

On Wed, Feb 19, 2025 at 12:38 AM James K. Lowden
 wrote:
>
> The following 14 patches constitute 105,720 lines of code in 83 files
> to build and document the COBOL front end.  The messages are
> in a more or less logical order. We have:
>
>  1/14   4K dir: create gcc/cobol and libgcobol directories
>  2/14   8K pre: introduce ChangeLog files
>  3/14  80K bld: config and build machinery
>  4/14 376K hdr: header files
>  5/14 152K lex: lexer
>  6/14 476K par: parser
>  7/14 344K cbl: parser support
>  8/14 516K api: GENERIC interface
>  9/14 244K gen: GENERIC interface support
> 10/14  72K doc: man pages and GnuCOBOL emulation
> 11/14  84K lhd: libgcobol header files
> 12/14 320K lib: libgcobol support
> 13/14 372K lcc: libgcobol, main file
> 14/14 148K fun: libgcobol, intrinsic functions
>
> To slide under the 400 KB limit, the intrinsic functions now have
> their own patch.  The configure files are removed, as is the Posix
> adapter framework.
>
> They are still against the master branch as of
>
> commit 3e08a4ecea27c54fda90e8f58641b1986ad957e1
> Date:   Wed Feb 5 14:22:33 2025 -0700
>
> Our repository is
>
> https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/
>
> using branch
>
> cobol-stage
>
> I tested these patches using "git apply" to an unpublished branch
> "cobol-patched".

I have now built the compiler from the (now published) cobol-patched branch.

On x86_64-linux and noticed the following issues:

The toplevel configure script on the branch wasn't re-generated (duh),
so it isn't
ready for use.

gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1,
but have 3.0.4
 %require "3.5.1"  //3.8.2 also works, but not 3.8.0
  ^^^

this requirement isn't documented, neither is a version requirement
for flex.  The
place for this is in gcc/doc/install.texi

During the build we write to

gcc/cobol/charmaps-dupe.cc
gcc/cobol/valconv-dupe.cc

but sources might reside on a R/O mounted volume and the above is undesirable
unless --enable-generated-files-in-srcdir.

Installing the result via make install DESTDIR=/foo I see both a
'gcobol' and a 'gcobc' program
being installed - is that intentional?  I also see the gcobol.3
manpage reside directly in /foo/gcobol.3 rather
than in the expected /foo/usr/local/man/..

Installing of libgcobol fails for me:

make[4]: Entering directory
'/home/rguenther/obj2/x86_64-pc-linux-gnu/32/libgcobol'
Suppressing the 32-bit build because of lack of support for __int128 variables
/bin/sh ../../../../gcc-cobol/libgcobol/../mkinstalldirs
/home/rguenther/install/gcc-cobol/usr/local/lib
/bin/sh ./libtool  --mode=install /usr/bin/install -c libgcobol.la
/home/rguenther/install/gcc-cobol/usr/local/lib;
libtool: install: `libgcobol.la' is not a valid libtool archive
libtool: install: Try `libtool --help --mode=install' for more information.
make[4]: *** [Makefile:246: install-libs] Error 1
make[4]: Leaving directory
'/home/rguenther/obj2/x86_64-pc-linux-gnu/32/libgcobol'
make[3]: *** [Makefile:301: multi-do] Error 1

so the suppression seems incomplete?

Compiling a Cobol Hello World results in

> ./install/gcc-cobol/usr/local/bin/gcobol t.cob
/usr/bin/ld: cannot find -lgcobol: No such file or directory
collect2: error: ld returned 1 exit status

possibly because the 64bit libcobol is installed in
/foo/usr/local/lib/ rather than .../lib64/ and
the former is not in the configured search directory.

Manually fixing this up results in a working binary!

Doing

> ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32

only fails during linking as we do build a 32bit binary but do not find
(the not built) 32bit runtime.  I suspect when writing a program
that would need 128bit integer or float we'd eventually ICE.

The cobol frontend should attempt to intercept the selection of a
not supported multilib (but don't ask me how) and gracefully exit,
possibly calling

 sorry ("selected multilib not available for Cobol");

or so.  There's the LANG_HOOKS_POST_OPTIONS langhook which
might be a place to do this.  The way to check is likely whether
a mode for __int128 exists and is supported.  The C fronted does

  if (targetm.scalar_mode_supported_p (TImode))

it also checks float128_type_node, but I didn't find who initializes that.

Richard.

> We have endeavored to address all must-fix issues raised in Round 2.
>
> 1.  Generated files use Autoconf 2.69
> 2.  Commit message matches mail Subject: line
> 3.  Various problems with Make-lang.in and cobol1.cc
> 4.  s/assert(false)/gcc_unreachable()/g
> 5.  Nixed range-based cases
> 6.  Removed Posix adapter files & generated configure scripts
> 7.  Explained memory-management engineering choice
> 8.  s/option_id/option_zero/g, for clarity
> 9.  GTY issues
> 10. Require only C++14 (not 17)
> 11. Moved #include 
> 12. Check regex buffer bounds outside gcc_assert
>
> Still to do (no particular order):
>
> 13. Try SARIF options
> 14. Do not compose messages (I18N).
> 15. Try valgrind for me

Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes

2025-02-24 Thread Jeff Law





On 2/24/25 2:23 AM, Richard Biener wrote:

On Fri, Feb 21, 2025 at 4:55 PM Konstantinos Eleftheriou
 wrote:


Hi Richard, thanks for the feedback.

On Tue, Feb 18, 2025 at 9:17 PM Richard Biener
 wrote:





Am 18.02.2025 um 17:04 schrieb Konstantinos Eleftheriou 
:

From: kelefth 

The pass rejects the transformation when there are instructions in the
sequence that might throw an exception. This was added due to having
cases that the load instruction contains a REG_EH_REGION note and
moving it before the store instructions caused an error, as it was
no longer the last instruction in the basic block.

This patch handles those cases by moving a possible REG_EH_REGION
note from the load instruction of the store-load sequence to the
last instruction of the basic block.


But that’s not a correct transform and will lead to bogus exception handling?  
You’d need to move the note and split the block, possibly updating the EH info 
on the side.


I had originally thought about splitting the block, but tried to get
away with this solution. In case that we are splitting the block after
the load, we wouldn't need to also move the note, right?


I don't think it makes much sense in trying to handle loads/stores with EH,
you generally can't re-order stuff around EH since that changes what is
observable from the EH handler and the EH might also guard exceptions
on later stmts.
But if all the old statements and all the new/adjusted statements are in 
the same EH region is this still a problem?


Jeff

[PATCH] c++, v2: Fix range for with PMFs [PR118923]

2025-02-24 Thread Jakub Jelinek

On Mon, Feb 24, 2025 at 08:29:31AM -0500, Jason Merrill wrote:
> On 2/24/25 5:52 AM, Jakub Jelinek wrote:
> > The following testcases segfault because the new range for 
> > -frange-for-ext-temps
> > temporary extension extends even the internal TARGET_EXPRs created by
> > get_member_function_from_ptrfunc.
> > 
> > The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P.
> > I'm not calling get_internal_target_expr because I really want to force
> > the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR
> > internal,
> 
> Hmm, good point, but that seems like a reason to use force_target_expr in
> gen_internal_target_expr.
> 
> > and get_internal_target_expr doesn't take complain argument
> > either and always uses tf_warning_or_error.
> 
> That shouldn't be an issue for any internal temporaries; complain is only
> relevant for building the cleanup that internal temps must not have.
> 
> It looks like there are a few more internal uses of force_target_expr in
> cp_finish_decl and build_comparison_op.

So like this if it passes full bootstrap/regtest?
So far passed make check-g++ in gcc (98,11,14,17,20,23,26) and make check
in libstdc++-v3.

2025-02-24  Jakub Jelinek  

PR c++/118923
* tree.cc (get_internal_target_expr): Use force_target_expr
instead of build_target_expr_with_type.
* typeck.cc (get_member_function_from_ptrfunc): Use
get_internal_target_expr instead of force_target_expr.
* decl.cc (cp_finish_decl): Likewise.
* method.cc (build_comparison_op): Likewise.

* g++.dg/cpp0x/pr118923.C: New test.
* g++.dg/cpp1y/pr118923.C: New test.

--- gcc/cp/tree.cc.jj   2025-02-24 00:06:25.776732504 +0100
+++ gcc/cp/tree.cc  2025-02-24 15:04:28.991381102 +0100
@@ -982,8 +982,7 @@ tree
 get_internal_target_expr (tree init)
 {
   init = convert_bitfield_to_declared_type (init);
-  tree t = build_target_expr_with_type (init, TREE_TYPE (init),
-   tf_warning_or_error);
+  tree t = force_target_expr (TREE_TYPE (init), init, tf_warning_or_error);
   TARGET_EXPR_INTERNAL_P (t) = true;
   return t;
 }
--- gcc/cp/typeck.cc.jj 2025-01-23 11:17:39.651880614 +0100
+++ gcc/cp/typeck.cc2025-02-24 15:05:27.817562229 +0100
@@ -4219,16 +4219,14 @@ get_member_function_from_ptrfunc (tree *
  && !DECL_P (instance_ptr)
  && !TREE_CONSTANT (instance_ptr)))
instance_ptr = instance_save_expr
- = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr,
-  complain);
+ = get_internal_target_expr (instance_ptr);
 
   /* See above comment.  */
   if (TREE_SIDE_EFFECTS (function)
  || (!nonvirtual
  && !DECL_P (function)
  && !TREE_CONSTANT (function)))
-   function
- = force_target_expr (TREE_TYPE (function), function, complain);
+   function = get_internal_target_expr (function);
 
   /* Start by extracting all the information from the PMF itself.  */
   e3 = pfn_from_ptrmemfunc (function);
--- gcc/cp/decl.cc.jj   2025-02-24 00:06:25.665734038 +0100
+++ gcc/cp/decl.cc  2025-02-24 15:08:52.027729921 +0100
@@ -9377,8 +9377,7 @@ cp_finish_decl (tree decl, tree init, bo
  tree guard = NULL_TREE;
  if (cleanups || cleanup)
{
- guard = force_target_expr (boolean_type_node,
-boolean_false_node, tf_none);
+ guard = get_internal_target_expr (boolean_false_node);
  add_stmt (guard);
  guard = TARGET_EXPR_SLOT (guard);
}
@@ -9407,8 +9406,7 @@ cp_finish_decl (tree decl, tree init, bo
 popped that all, so push those extra cleanups around
 the whole sequence with a guard variable.  */
  gcc_assert (TREE_CODE (sl) == STATEMENT_LIST);
- guard = force_target_expr (integer_type_node,
-integer_zero_node, tf_none);
+ guard = get_internal_target_expr (integer_zero_node);
  add_stmt (guard);
  guard = TARGET_EXPR_SLOT (guard);
  for (unsigned i = 0; i < n_extra_cleanups; ++i)
--- gcc/cp/method.cc.jj 2025-01-15 08:59:46.383217835 +0100
+++ gcc/cp/method.cc2025-02-24 15:10:18.674537705 +0100
@@ -1597,7 +1597,7 @@ build_comparison_op (tree fndecl, bool d
  /* Some other array, will need runtime loop.  */
  else
{
- idx = force_target_expr (sizetype, maxval, complain);
+ idx = get_internal_target_expr (maxval);
  loop_indexes = tree_cons (idx, NULL_TREE, loop_indexes);
}
  expr_type = TREE_TYPE (expr_type);
--- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 15:02:04.966385945 
+0100
+++ gcc/testsuite/g++.dg/cpp0x

[Fortran, Patch, PR107635, 5_v1] Use correct data size in caf_transfer_between_remotes

2025-02-24 Thread Andre Vehreschild

Hi all,

I nearly forgot to publish this patch:

When transfering data between two remote images, i.e. a third images asks image
one to read some data and then asks image two to put that data into its memory,
the size of the data to transfer between these two images was miscalculated. The
attached patch fixes this.

Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Btw, in theory this should be last patch for this PR.

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de
From 14432031c7f247e2b4d7e76614553b5379d543b2 Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Fri, 21 Feb 2025 14:06:28 +0100
Subject: [PATCH 2/2] Fortran: Fix detection of descriptor arrays in coarray
 [PR107635]

Look at the formal arguments generated type in the function declaration
to figure if an argument is a descriptor arrays.  Fix handling of class
types while splitting coarray expressions.

	PR fortran/107635

gcc/fortran/ChangeLog:

	* coarray.cc (fixup_comp_refs): For class types set correct
	component (class) type.
	(split_expr_at_caf_ref): Provide location.
	* trans-intrinsic.cc (conv_caf_send_to_remote): Look at
	generated formal argument and not declared one to detect
	descriptor arrays.
	(conv_caf_sendget): Same.
---
 gcc/fortran/coarray.cc | 15 ++-
 gcc/fortran/trans-intrinsic.cc | 30 --
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/gcc/fortran/coarray.cc b/gcc/fortran/coarray.cc
index e5648e0d027..f53de0b20e3 100644
--- a/gcc/fortran/coarray.cc
+++ b/gcc/fortran/coarray.cc
@@ -295,11 +295,12 @@ move_coarray_ref (gfc_ref **from, gfc_expr *expr)
 static void
 fixup_comp_refs (gfc_expr *expr)
 {
-  gfc_symbol *type = expr->symtree->n.sym->ts.type == BT_DERIVED
-		   ? expr->symtree->n.sym->ts.u.derived
-		   : (expr->symtree->n.sym->ts.type == BT_CLASS
-			? CLASS_DATA (expr->symtree->n.sym)->ts.u.derived
-			: nullptr);
+  bool class_ref = expr->symtree->n.sym->ts.type == BT_CLASS;
+  gfc_symbol *type
+= expr->symtree->n.sym->ts.type == BT_DERIVED
+	? expr->symtree->n.sym->ts.u.derived
+	: (class_ref ? CLASS_DATA (expr->symtree->n.sym)->ts.u.derived
+		 : nullptr);
   if (!type)
 return;
   gfc_ref **pref = &(expr->ref);
@@ -317,6 +318,9 @@ fixup_comp_refs (gfc_expr *expr)
 	  ref = nullptr;
 	  break;
 	}
+	  if (class_ref)
+	/* Link to the class type to allow for derived type resolution.  */
+	(*pref)->u.c.sym = ref->u.c.sym;
 	  (*pref)->next = ref->next;
 	  ref->next = NULL;
 	  gfc_free_ref_list (ref);
@@ -372,6 +376,7 @@ split_expr_at_caf_ref (gfc_expr *expr, gfc_namespace *ns,
   st->n.sym->attr.dummy = 1;
   st->n.sym->attr.intent = INTENT_IN;
   st->n.sym->ts = *caf_ts;
+  st->n.sym->declared_at = expr->where;

   *post_caf_ref_expr = gfc_get_variable_expr (st);
   (*post_caf_ref_expr)->where = expr->where;
diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index 80e98dc3c20..c97829fd8a8 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -1445,8 +1445,14 @@ conv_caf_send_to_remote (gfc_code *code)
 	  NULL_TREE, gfc_trans_force_lval (&block, lhs_se.string_length));
   else
 	opt_lhs_charlen = build_zero_cst (build_pointer_type (size_type_node));
-  if (!TYPE_LANG_SPECIFIC (TREE_TYPE (caf_decl))->rank
-	  || GFC_ARRAY_TYPE_P (TREE_TYPE (caf_decl)))
+  /* Get the third formal argument of the receiver function.  (This is the
+	 location where to put the data on the remote image.)  Need to look at
+	 the argument in the function decl, because in the gfc_symbol's formal
+	 argument an array may have no descriptor while in the generated
+	 function decl it has.  */
+  tmp = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (TYPE_ARG_TYPES (
+	TREE_TYPE (receiver_fn_expr->symtree->n.sym->backend_decl);
+  if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
 	opt_lhs_desc = null_pointer_node;
   else
 	opt_lhs_desc
@@ -1635,8 +1641,14 @@ conv_caf_sendget (gfc_code *code)
 	  NULL_TREE, gfc_trans_force_lval (&block, lhs_se.string_length));
   else
 	opt_lhs_charlen = build_zero_cst (build_pointer_type (size_type_node));
-  if (!TYPE_LANG_SPECIFIC (TREE_TYPE (lhs_caf_decl))->rank
-	  || GFC_ARRAY_TYPE_P (TREE_TYPE (lhs_caf_decl)))
+  /* Get the third formal argument of the receiver function.  (This is the
+	 location where to put the data on the remote image.)  Need to look at
+	 the argument in the function decl, because in the gfc_symbol's formal
+	 argument an array may have no descriptor while in the generated
+	 function decl it has.  */
+  tmp = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (TYPE_ARG_TYPES (
+	TREE_TYPE (receiver_fn_expr->symtree->n.sym->backend_decl);
+  if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
 	opt_lhs_desc = null_pointer_node;
   else
 	opt_lhs_desc
@@ -1677,8 +1689,14 @@ conv_caf_sendget (gfc_code *code)
 	  rhs_size = gfc_typenode_for_spec (ts)->type_common

Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]

2025-02-24 Thread Andre Vehreschild

Hi Harald,

I have added some comment(s). Can you take another look?

Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Regards,
Andre

On Sat, 22 Feb 2025 17:36:55 +0100
Andre Vehreschild  wrote:

> Hi Harald,
>
> thanks for the review. Silently I'd hoped that there is some macro to get
> the i-th argument, that I just haven't found and someone could point me to.
> I will add a comment, when ko one comes up with the macro by Monday.
>
> Thanks,
> Andre
> Andre Vehreschild * ve...@gmx.de
> Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf :
>
> > Hi Andre,
> >
> > Am 21.02.25 um 14:35 schrieb Andre Vehreschild:
> >> Hi all,
> >>
> >> during testing and compiling some larger coarray codes, I found a few
> >> deficiencies. One was with handling class types when splitting the coarray
> >> expression and the other was that the backend_decl of a formal argument in
> >> a function's symbol was not the same as the one the function was compiled
> >> to. So looking at the function-decl's tree n-th formal argument is the way
> >> to go there.
> >>
> >> Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?
> >
> > I am amazed that you do not get lost handling 9-fold nested
> > macros!  This is OK, as this touches your CAF code.
> >
> > Otherwise, I'd recommend to add an explaining comment in the
> > code or code such that mere mortals have a better chance to
> > follow...
> >
> > Thanks,
> > Harald
> >
> >> Regards,
> >>Andre
> >> --
> >> Andre Vehreschild * Email: vehre ad gmx dot de
>


--
Andre Vehreschild * Email: vehre ad gmx dot de
From cecfd1aeb37d65b2880dcbdd55afacb2dde63de0 Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Wed, 19 Feb 2025 09:04:47 +0100
Subject: [PATCH 1/2] Fortran: Use correct size when transferring between
 images [PR107635]

gcc/fortran/ChangeLog:

	PR fortran/107635

	* trans-intrinsic.cc (conv_caf_sendget): Use the size of data
	transferred between the two images and not the descritor's size.
---
 gcc/fortran/trans-intrinsic.cc | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index 2c4c47816c8..80e98dc3c20 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -1658,20 +1658,23 @@ conv_caf_sendget (gfc_code *code)
   gfc_init_se (&rhs_se, NULL);
   if (rhs_expr->rank == 0)
 {
-  gfc_conv_expr (&rhs_se, rhs_expr);
-  gfc_add_block_to_block (&block, &rhs_se.pre);
   opt_rhs_desc = null_pointer_node;
   if (rhs_expr->ts.type == BT_CHARACTER)
 	{
+	  gfc_conv_expr (&rhs_se, rhs_expr);
+	  gfc_add_block_to_block (&block, &rhs_se.pre);
 	  opt_rhs_charlen = gfc_build_addr_expr (
 	NULL_TREE, gfc_trans_force_lval (&block, rhs_se.string_length));
 	  rhs_size = build_int_cstu (size_type_node, rhs_expr->ts.kind);
 	}
   else
 	{
+	  gfc_typespec *ts
+	= &sender_fn_expr->symtree->n.sym->formal->next->next->sym->ts;
+
 	  opt_rhs_charlen
 	= build_zero_cst (build_pointer_type (size_type_node));
-	  rhs_size = TREE_TYPE (rhs_se.expr)->type_common.size_unit;
+	  rhs_size = gfc_typenode_for_spec (ts)->type_common.size_unit;
 	}
 }
   else if (!TYPE_LANG_SPECIFIC (TREE_TYPE (rhs_caf_decl))->rank
--
2.48.1

Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes

2025-02-24 Thread Michael Matz

Hello,

On Mon, 24 Feb 2025, Jeff Law wrote:

>  The pass rejects the transformation when there are instructions in the
>  sequence that might throw an exception. This was added due to having
>  cases that the load instruction contains a REG_EH_REGION note and
>  moving it before the store instructions caused an error, as it was
>  no longer the last instruction in the basic block.
> 
>  This patch handles those cases by moving a possible REG_EH_REGION
>  note from the load instruction of the store-load sequence to the
>  last instruction of the basic block.
> >>>
> >>> But that’s not a correct transform and will lead to bogus exception
> >>> handling?  You’d need to move the note and split the block, possibly
> >>> updating the EH info on the side.
> >>
> >> I had originally thought about splitting the block, but tried to get
> >> away with this solution. In case that we are splitting the block after
> >> the load, we wouldn't need to also move the note, right?
> > 
> > I don't think it makes much sense in trying to handle loads/stores with EH,
> > you generally can't re-order stuff around EH since that changes what is
> > observable from the EH handler and the EH might also guard exceptions
> > on later stmts.
> But if all the old statements and all the new/adjusted statements are in the
> same EH region is this still a problem?

It depends, but even if that were no problem in some specific cases 
(perhaps by ensuring that such sequence isn't intermingled with insns that 
are in different EH regions) that doesn't seem to be what the proposal is 
doing?  From the description it moves a EH note from a load (where it 
presumably was for a reason) that occurs after a store to something that 
so happens to be the (new) last instruction of a bb.  Because otherwise 
the load couldn't be moved earlier into the middle of a BB.  In essence 
the load would lose the EH note and a new EH note would be generated for a 
different insn out of thin air.  I don't see how that can ever be a 
correct transformation in general.  What _if_ the very load traps then?  
What if the new insn cannot trap?

(Note that loads and stores may trap differently even on the same address, 
e.g. write to read-only mem, or an misaligned load but an aligned (or 
smaller-sized) write; though I'll say that GCC assumes that loads cannot 
trap if shadowed by an equivalent write, in some other passes)

Ciao,
Michael.

Re: The COBOL front end, version 3, now in 14 easy pieces

2025-02-24 Thread James K. Lowden

On Mon, 24 Feb 2025 14:51:27 +0100
Richard Biener  wrote:

> gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1,
> but have 3.0.4
>  %require "3.5.1"  //3.8.2 also works, but not 3.8.0
>   ^^^
> 
> this requirement isn't documented, neither is a version requirement
> for flex.  The
> place for this is in gcc/doc/install.texi

I supplied a set of documentation patches separately on Friday, subject "info 
updates for gcobol".  

IIUC, you would like that patch applied to cobol-patched, too.  I can do that. 

--jkl

65 matches

Mail list logo