Re: The COBOL front end, version 3, now in 14 easy pieces
On Tue, Feb 25, 2025 at 2:48 AM James K. Lowden wrote: > > On Mon, 24 Feb 2025 14:51:27 +0100 > Richard Biener wrote: > > > On Wed, Feb 19, 2025 at 12:38?AM James K. Lowden > > wrote: > > > > > > The following 14 patches constitute 105,720 lines of code in 83 > > > files to build and document the COBOL front end. > [...] > > > I tested these patches using "git apply" to an unpublished branch > > > "cobol-patched". > > > > I have now built the compiler from the (now published) cobol-patched > > branch. > > I rebuilt cobol-patched today. It now includes changes to gcc/doc/, and > a new gcc/Makefile.in (previously omitted because I thought it was > generated). That deals with some but not all the issues below. > > > On x86_64-linux and noticed the following issues: > > > > The toplevel configure script on the branch wasn't re-generated (duh), > > so it isn't > > ready for use. > > Since we're not mailing patches, cobol-patched is now built by applying > patches that include the configure scripts. good. > > gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1, > > but have 3.0.4 > > %require "3.5.1" //3.8.2 also works, but not 3.8.0 > > ^^^ > > > > this requirement isn't documented, neither is a version requirement > > for flex. The > > place for this is in gcc/doc/install.texi > > I think you will find the updated .texi files comprehesively (!) > reflect the presence of gcobol and its requirements. > > There is a hitch. The patches are against master branch as of 5 > February. Many changes occurred in the time since. Now that I know > that, I *expect* a conflict extravaganza as we pull that together. > > Best IMO is for us to update our repository, i.e. to merge master into > our master+cobol, then to cobol-stage, then to cobol-patched. > Questions: > > 1. How often should we do that, would you say? As it suits you, best when you think you fixed an important set of issues. My idea is that cobol-patched should reflect the tree as if it were post merge to GCC master (so we could basically merge from that branch instead of re-applying patches). > 2. Is cobol-patched more useful to you right now as-is, or should I > regenerate it tomorrow? It was useful as it is for discovering the issues I reported. I will re-try once you declare them fixed. > > During the build we write to > > > > gcc/cobol/charmaps-dupe.cc > > gcc/cobol/valconv-dupe.cc > > Not yet fixed. Top of the list, now. > > > Installing the result via make install DESTDIR=/foo I see both a > > 'gcobol' and a 'gcobc' program > > being installed - is that intentional? > > Yes, as explained prior, gcobc is an emulation script. > > > I also see the gcobol.3 > > manpage reside directly in /foo/gcobol.3 rather > > than in the expected /foo/usr/local/man/.. > > gcc/Makefile.in didn't appear in the patches. I now have installed: > > $ find /usr/local/cobol-patched/ -name '*cobol*' > /usr/local/cobol-patched/ > /usr/local/cobol-patched/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/cobol1 > /usr/local/cobol-patched/bin/gcobol > /usr/local/cobol-patched/share/man/man1/gcobol.1 > /usr/local/cobol-patched/share/man/man3/gcobol.3 > /usr/local/cobol-patched/share/gcobol > > > Installing of libgcobol fails for me: > > Hmm, a little worse for me. As you see above, the installed cobol files > don't include the library. This was from "make install". Checking. That's what happened to me before I re-generated the toplevel configure, it didn't build libgcobol at all. > > so the suppression seems incomplete? > > Or too complete. :-/ > > > Doing > > > > > ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32 > > > > only fails during linking as we do build a 32bit binary but do not > > find (the not built) 32bit runtime. I suspect when writing a program > > that would need 128bit integer or float we'd eventually ICE. > > I think the cobol-patched you were using was missing our patch of > Friday. Our intention is to prevent any 32-bit binary from being > generated, in addition to not attempting to build the library. > > But we're not there yet. On my system, configured as > > $ ../configure --prefix=/usr/local/gcc-cobol --disable-bootstrap > --disable-generated-files-in-srcdir --disable-multilib > --enable-checking --enable-languages=c,cobol > > I see failure to link [eliding the "skipping" messages]: > > $ gcobol -oo -ffixed-form -m32 gcc/cobol/nist/NC/NC101A.cbl > /usr/bin/ld: cannot find -lgcobol: No such file or directory > /usr/bin/ld: cannot find -lstdc++: No such file or directory > /usr/bin/ld: cannot find -lgcc: No such file or directory > /usr/bin/ld: cannot find -lgcc_s: No such file or directory > > If compiled with -c, > > $ gcobol -oo -c -m32 -ffixed-form ../parser/gcc/cobol/nist/NC/NC101A.cbl > $ file o o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not > stripped > > Not our intended result. :-( But, fixable. > > > The cobol frontend should attempt to intercept the selection of
Re: [PATCH v2 5/5][STAGE1] doc: document btf_type_tag and btf_decl_tag attributes
On 2/6/25 11:54 AM, David Faust wrote: gcc/ * doc/extend.texi (Common Variable Attributes): Document btf_decl_tag attribute. (Common Type Attributes): Document btf_type_tag attribute. --- gcc/doc/extend.texi | 68 + 1 file changed, 68 insertions(+) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 2764597a479..4536cfd7c68 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -8037,6 +8037,41 @@ align them on any target. The @code{aligned} attribute can also be used for functions (@pxref{Common Function Attributes}.) +@cindex @code{btf_decl_tag} variable attribute +@item btf_decl_tag (@var{argument}) +The @code{btf_decl_tag} attribute may be used to associate variable +declarations, struct or union member declarations, function +declarations, and function parameter declarations with arbitrary strings. Currently we add this in the @node Common Variable Attributes. The btf_decl_tag attribute is unique in that the effect is the same, be it associated with a variable decl or function. But I wonder if some placeholder text also be included in the documentation for Common Function Attributes for clarity ? +These strings are not interpreted by the compiler in any way, and have +no effect on code generation. Instead, these user-provided strings +are recorded in DWARF (via @code{DW_AT_GNU_annotation} and +@code{DW_TAG_GNU_annotation} extensions) and BTF information (via +@code{BTF_KIND_DECL_TAG} records), and associated to the attributed +declaration. If neither DWARF nor BTF information is generated, the +attribute has no effect. + +The argument is treated as an ordinary string in the source language +with no additional special rules. + +The attribute may be supplied multiple times for a single declaration, +in which case each distinct argument string will be recorded in a +separate DIE or BTF record, each associated to the declaration. For +a single declaration with multiple @code{btf_decl_tag} attributes, +the order of the @code{DW_TAG_GNU_annotation} DIEs produced is not +guaranteed to maintain the order of attributes in the source code. + +For example: + +@smallexample +int *foo __attribute__ ((btf_decl_tag ("__percpu"))); +@end smallexample + +@noindent +when compiled with @code{-gbtf} results in an additional +@code{BTF_KIND_DECL_TAG} BTF record to be emitted in the BTF info, +associating the string ``__percpu'' with the normal @code{BTF_KIND_VAR} +record for the variable ``foo''. + Remove "normal" or do you intend to say something specific ? @cindex @code{counted_by} variable attribute @item counted_by (@var{count}) The @code{counted_by} attribute may be attached to the C99 flexible array @@ -9226,6 +9261,39 @@ is given by the product of arguments 1 and 2, and that @code{malloc_type}, like the standard C function @code{malloc}, returns an object whose size is given by argument 1 to the function. +@cindex @code{btf_type_tag} type attribute +@item btf_type_tag (@var{argument}) +The @code{btf_type_tag} attribute may be used to associate (to ``tag'') +particular types with arbitrary string annotations. These annotations +are recorded in debugging info by supported debug formats, currently +DWARF (via @code{DW_AT_GNU_annotation} and @code{DW_TAG_GNU_annotation} +extensions) and BTF (via @code{BTF_KIND_TYPE_TAG} records). These +annotation strings are not interpreted by the compiler in any way, and +have no effect on code generation. If neither DWARF nor BTF +information is generated, the attribute has no effect. + +The argument is treated as an ordinary string in the source language +with no additional special rules. + +The attribute may be supplied multiple times for a single declaration, +in which case each distinct argument string will be recorded in a +separate DIE or BTF record, each associated to the type. For a single +type with multiple @code{btf_type_tag} attributes, the order of the +@code{DW_TAG_GNU_annotation} DIEs produced is not guaranteed to +maintain the order of attributes in the source code. +> +For example the following code: + +@smallexample +int * __attribute__ ((btf_type_tag ("__user"))) foo; +@end smallexample + +@noindent +when compiled with @code{-gbtf} results in an additional +@code{BTF_KIND_TYPE_TAG} BTF record to be emitted in the BTF info, +associating the string ``__user'' with the normal @code{BTF_KIND_PTR} +record for the pointer-to-integer type used in the declaration. + @cindex @code{copy} type attribute @item copy @itemx copy (@var{expression})
Re: [PATCH][_Hashtable] Fix hash code cache usage
On Mon, 24 Feb 2025 at 20:52, François Dumont wrote: > > All good remarks as usual, here is a new version. > > I took the time to leverage on the new method to replace a usage of > _M_src_hash_code. > > libstdc++: [_Hashtable] Fix hash code cache usage when stateful > hash functor > > It is wrong to reuse a cached hash code from another container when > this code depends > on the state of the container's Hash functor. > > Add checks that Hash functor is stateless before reusing the cached > hash code. > > libstdc++-v3/ChangeLog: > > * include/bits/hashtable_policy.h > (_Hash_code_base::_M_copy_code, > _Hash_code_base::_M_stored_code): Remove. "_M_store_code" not "_M_stored_code". > * include/bits/hashtable.h (_M_hash_code_ext): New. > (_M_copy_code): New. > (_M_assign): Use latter. > (_M_bucket_index_ex): New. > (_M_equals): Use latter. > (_M_store_code): New. The changes to _M_src_hash_code and _M_merge_multi should be in the ChangeLog too. OK for trunk with those changes, thanks. > * testsuite/23_containers/unordered_map/modifiers/merge.cc > (test10): New > test case. > > Tested under Linux x64. > > Ok to commit ? > > François > > > On 19/02/2025 14:33, Jonathan Wakely wrote: > > On 20/01/25 22:12 +0100, François Dumont wrote: > >> Hi > >> > >> In my work on fancy pointer support I've decided to always cache the > >> hash code. > >> > >> Doing so I spotted a bug in the management of this cache when hash > >> functor is stateful. > >> > >> libstdc++: [_Hashtable] Fix hash code cache usage when hash > >> functor is stateful > >> > >> It is wrong to reuse a cached hash code when this code depends > >> then on the state > >> of the Hash functor. > >> > >> Add checks that Hash functor is stateless before reusing the > >> cached hash code. > >> > >> libstdc++-v3/ChangeLog: > >> > >> * include/bits/hashtable_policy.h > >> (_Hash_code_base::_M_copy_code): Remove. > >> * include/bits/hashtable.h (_M_copy_code): New. > > > > I like replacing the _M_copy_code overloads with one function, thanks. > > We could do the same for _M_store_code too. > > > >> (_M_assign): Use latter. > >> (_M_bucket_index_ex): New. > > > > I assume "ex" means "external"? That doesn't seem very clear to me > > though. Maybe "ext" would be better. > > > >> (_M_equals): Use latter. > >> * > >> testsuite/23_containers/unordered_map/modifiers/merge.cc (test10): New > >> test case. > >> > >> Tested under Linux x64, ok to commit ? > >> > >> François > > > >> diff --git a/libstdc++-v3/include/bits/hashtable.h > >> b/libstdc++-v3/include/bits/hashtable.h > >> index d6d76a743bb..b3c1d7aac24 100644 > >> --- a/libstdc++-v3/include/bits/hashtable.h > >> +++ b/libstdc++-v3/include/bits/hashtable.h > >> @@ -808,6 +808,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _M_bucket_index(__hash_code __c) const > >> { return __hash_code_base::_M_bucket_index(__c, > >> _M_bucket_count); } > >> > >> +#pragma GCC diagnostic push > >> +#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr > >> + // Like _M_bucket_index but when the node is coming from another > >> + // container instance. > >> + size_type > >> + _M_bucket_index_ex(const __node_value_type& __n) const > >> + { > >> +if constexpr (__hash_cached::value) > >> + if constexpr (std::is_empty<_Hash>::value) > >> +return _RangeHash{}(__n._M_hash_code, _M_bucket_count); > >> + > >> +return _RangeHash{} > >> + (this->_M_hash_code(_ExtractKey{}(__n._M_v())), _M_bucket_count); > >> + } > >> + > >> + void > >> + _M_copy_code(__node_value_type& __to, > >> + const __node_value_type& __from) const > >> + { > >> +if constexpr (__hash_cached::value) > >> + { > >> +if constexpr (std::is_empty<_Hash>::value) > >> + __to._M_hash_code = __from._M_hash_code; > >> +else > >> + __to._M_hash_code = > >> +this->_M_hash_code(_ExtractKey{}(__from._M_v())); > >> + } > >> + } > > > > > > These two functions are doing similar work, would it make sense to > > factor out the common part to a new function: > > > > // Get hash code for a node that comes from another _Hashtable. > > // Reuse a cached hash code if the hash function is stateless, > > // otherwise recalculate it using our own hash function. > > size_t > > _M_hash_code_ext(const __node_value_type& __from) const > > { > > if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value) > > return __from._M_hash_code; > > else > > this->_M_hash_code(_ExtractKey{}(__from._M_v())); > > } > > > > // Like _M_bucket_index but when the node is coming from another > > // container instance. > > size_type > > _M_bucket_index_
Re: The COBOL front end, version 3, now in 14 easy pieces (+NIST)
"James K. Lowden" writes: >> Having a minimal harness in GCCs testsuite is critical - I'd expect a >> gcc/testsuite/gcobol.dg/dg.exp supporting execution tests. I assume >> Cobol has a way to exit OK or fatally and this should be >> distinguished as testsuite PASS or FAIL. > > Yes, a COBOL program exits with a return status. And we rigged up NIST > to do that. What that means requires a long explanation, sorry. Regardless of the NIST usage you will need a frame work of GCC specific COBOL tests. You are expected that for every bug that you fix that wasn't detected by NIST you add a test case. Surely you have some of those already? -Andi
Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]
Hi Andre, Am 24.02.25 um 16:44 schrieb Andre Vehreschild: Hi Harald, I have added some comment(s). Can you take another look? assuming that you refer to the attachment of the other submission: that one is perfect! This one is also OK. Thanks for the patch(es)! Harald Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline? Regards, Andre On Sat, 22 Feb 2025 17:36:55 +0100 Andre Vehreschild wrote: Hi Harald, thanks for the review. Silently I'd hoped that there is some macro to get the i-th argument, that I just haven't found and someone could point me to. I will add a comment, when ko one comes up with the macro by Monday. Thanks, Andre Andre Vehreschild * ve...@gmx.de Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf : Hi Andre, Am 21.02.25 um 14:35 schrieb Andre Vehreschild: Hi all, during testing and compiling some larger coarray codes, I found a few deficiencies. One was with handling class types when splitting the coarray expression and the other was that the backend_decl of a formal argument in a function's symbol was not the same as the one the function was compiled to. So looking at the function-decl's tree n-th formal argument is the way to go there. Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? I am amazed that you do not get lost handling 9-fold nested macros! This is OK, as this touches your CAF code. Otherwise, I'd recommend to add an explaining comment in the code or code such that mere mortals have a better chance to follow... Thanks, Harald Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]
Hi Harald, Oops, I've mixed up the two attachments. Sorry for that and thank you for detecting it and ok'ing. I will merge tomorrow morning. Thanks again, Andre Andre Vehreschild * ve...@gmx.de Am 24. Februar 2025 20:22:25 schrieb Harald Anlauf : Hi Andre, Am 24.02.25 um 16:44 schrieb Andre Vehreschild: Hi Harald, I have added some comment(s). Can you take another look? assuming that you refer to the attachment of the other submission: that one is perfect! This one is also OK. Thanks for the patch(es)! Harald Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline? Regards, Andre On Sat, 22 Feb 2025 17:36:55 +0100 Andre Vehreschild wrote: Hi Harald, thanks for the review. Silently I'd hoped that there is some macro to get the i-th argument, that I just haven't found and someone could point me to. I will add a comment, when ko one comes up with the macro by Monday. Thanks, Andre Andre Vehreschild * ve...@gmx.de Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf : Hi Andre, Am 21.02.25 um 14:35 schrieb Andre Vehreschild: Hi all, during testing and compiling some larger coarray codes, I found a few deficiencies. One was with handling class types when splitting the coarray expression and the other was that the backend_decl of a formal argument in a function's symbol was not the same as the one the function was compiled to. So looking at the function-decl's tree n-th formal argument is the way to go there. Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? I am amazed that you do not get lost handling 9-fold nested macros! This is OK, as this touches your CAF code. Otherwise, I'd recommend to add an explaining comment in the code or code such that mere mortals have a better chance to follow... Thanks, Harald Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de
New German PO file for 'gcc' (version 15-b20250216)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the German team of translators. The file is available at: https://translationproject.org/latest/gcc/de.po (This file, 'gcc-15-b20250216.de.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PATCH][_Hashtable] Fix hash code cache usage
All good remarks as usual, here is a new version. I took the time to leverage on the new method to replace a usage of _M_src_hash_code. libstdc++: [_Hashtable] Fix hash code cache usage when stateful hash functor It is wrong to reuse a cached hash code from another container when this code depends on the state of the container's Hash functor. Add checks that Hash functor is stateless before reusing the cached hash code. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Hash_code_base::_M_copy_code, _Hash_code_base::_M_stored_code): Remove. * include/bits/hashtable.h (_M_hash_code_ext): New. (_M_copy_code): New. (_M_assign): Use latter. (_M_bucket_index_ex): New. (_M_equals): Use latter. (_M_store_code): New. * testsuite/23_containers/unordered_map/modifiers/merge.cc (test10): New test case. Tested under Linux x64. Ok to commit ? François On 19/02/2025 14:33, Jonathan Wakely wrote: On 20/01/25 22:12 +0100, François Dumont wrote: Hi In my work on fancy pointer support I've decided to always cache the hash code. Doing so I spotted a bug in the management of this cache when hash functor is stateful. libstdc++: [_Hashtable] Fix hash code cache usage when hash functor is stateful It is wrong to reuse a cached hash code when this code depends then on the state of the Hash functor. Add checks that Hash functor is stateless before reusing the cached hash code. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Hash_code_base::_M_copy_code): Remove. * include/bits/hashtable.h (_M_copy_code): New. I like replacing the _M_copy_code overloads with one function, thanks. We could do the same for _M_store_code too. (_M_assign): Use latter. (_M_bucket_index_ex): New. I assume "ex" means "external"? That doesn't seem very clear to me though. Maybe "ext" would be better. (_M_equals): Use latter. * testsuite/23_containers/unordered_map/modifiers/merge.cc (test10): New test case. Tested under Linux x64, ok to commit ? François diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h index d6d76a743bb..b3c1d7aac24 100644 --- a/libstdc++-v3/include/bits/hashtable.h +++ b/libstdc++-v3/include/bits/hashtable.h @@ -808,6 +808,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_bucket_index(__hash_code __c) const { return __hash_code_base::_M_bucket_index(__c, _M_bucket_count); } +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wc++17-extensions" // if constexpr + // Like _M_bucket_index but when the node is coming from another + // container instance. + size_type + _M_bucket_index_ex(const __node_value_type& __n) const + { + if constexpr (__hash_cached::value) + if constexpr (std::is_empty<_Hash>::value) + return _RangeHash{}(__n._M_hash_code, _M_bucket_count); + + return _RangeHash{} + (this->_M_hash_code(_ExtractKey{}(__n._M_v())), _M_bucket_count); + } + + void + _M_copy_code(__node_value_type& __to, + const __node_value_type& __from) const + { + if constexpr (__hash_cached::value) + { + if constexpr (std::is_empty<_Hash>::value) + __to._M_hash_code = __from._M_hash_code; + else + __to._M_hash_code = + this->_M_hash_code(_ExtractKey{}(__from._M_v())); + } + } These two functions are doing similar work, would it make sense to factor out the common part to a new function: // Get hash code for a node that comes from another _Hashtable. // Reuse a cached hash code if the hash function is stateless, // otherwise recalculate it using our own hash function. size_t _M_hash_code_ext(const __node_value_type& __from) const { if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value) return __from._M_hash_code; else this->_M_hash_code(_ExtractKey{}(__from._M_v())); } // Like _M_bucket_index but when the node is coming from another // container instance. size_type _M_bucket_index_ext(const __node_value_type& __n) const { return _M_bucket_index(_M_hash_code_ext(__n)); } void _M_copy_code(__node_value_type& __to, const __node_value_type& __from) const { if constexpr (__hash_cached::value) __to._M_hash_code = _M_hash_code_ext(__n); } +#pragma GCC diagnostic pop + // Find and insert helper functions and types // Find the node before the one matching the criteria. @@ -1587,7 +1617,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __node_ptr __ht_n = __ht._M_begin(); __node_ptr __this_n = __node_gen(static_cast<_FromVal>(__ht_n->_M_v())); - this->_M_copy_code(*__this_n, *__ht_n); + _M_copy_code(*__this_n, *__ht_n); _M_update_bbegin(__
Re: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range [PR118083]
On Tue, 4 Feb 2025, Patrick Palka wrote: > On Tue, 4 Feb 2025, Patrick Palka wrote: > > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > > 14 (after a long while)? > > > > -- >8 -- > > > > LWG 4027 effectively makes the const range access CPOs ranges::cfoo behave > > more consistently across C++23 and C++20 (pre-P2278R4) and more > > consistently with the std::cfoo range accessors, as the below testcase > > adjustments demonstrate (which mostly consist of reverting workarounds > > added by r14-3771-gf12e26f3496275 and r13-7186-g0d94c6df183375). > > > > In passing fix PR118083 which reports that the input_range constraint on > > possibly-const-range is missing in our implementation. A consequence of > > this is that the const range access CPOs now consistently require the > > type to be a range, and so in some our of tests we need to introduce > > otherwise unused begin/end members. > > ... now with the LWG 4027 testcases added (to cbegin.cc, so that it runs > in both C++20 and C++23 mode): Ping. > > -- >8 -- > > Subject: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range > [PR118083] > > PR libstdc++/118083 > > libstdc++-v3/ChangeLog: > > * include/bits/ranges_base.h > (ranges::__access::__possibly_const_range): Adjust logic as per > LWG 4027. Add missing input_range constraint. > * testsuite/std/ranges/access/cbegin.cc (test05): Verify LWG > 4027 testcases. > * testsuite/std/ranges/access/cdata.cc: Adjust, simplify and > consolidate some tests after the above. > * testsuite/std/ranges/access/cend.cc: Likewise. > * testsuite/std/ranges/access/crbegin.cc: Likewise. > * testsuite/std/ranges/access/crend.cc: Likewise. > * testsuite/std/ranges/adaptors/join.cc: Likewise. > * testsuite/std/ranges/adaptors/take_while.cc: Likewise. > * testsuite/std/ranges/adaptors/transform.cc: Likewise. > --- > libstdc++-v3/include/bits/ranges_base.h | 4 +- > .../testsuite/std/ranges/access/cbegin.cc | 17 > .../testsuite/std/ranges/access/cdata.cc | 21 - > .../testsuite/std/ranges/access/cend.cc | 30 ++--- > .../testsuite/std/ranges/access/crbegin.cc| 43 +-- > .../testsuite/std/ranges/access/crend.cc | 20 + > .../testsuite/std/ranges/adaptors/join.cc | 8 ++-- > .../std/ranges/adaptors/take_while.cc | 2 - > .../std/ranges/adaptors/transform.cc | 4 -- > 9 files changed, 59 insertions(+), 90 deletions(-) > > diff --git a/libstdc++-v3/include/bits/ranges_base.h > b/libstdc++-v3/include/bits/ranges_base.h > index 4dcfbf66d51..28fe64a9e9d 100644 > --- a/libstdc++-v3/include/bits/ranges_base.h > +++ b/libstdc++-v3/include/bits/ranges_base.h > @@ -642,11 +642,11 @@ namespace ranges >namespace __access >{ > #if __glibcxx_ranges_as_const // >= C++23 > -template > +template >constexpr auto& >__possibly_const_range(_Range& __r) noexcept >{ > - if constexpr (constant_range && !constant_range<_Range>) > + if constexpr (input_range) > return const_cast(__r); > else > return __r; > diff --git a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > index 5423e782428..66b28f41877 100644 > --- a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > +++ b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > @@ -116,10 +116,27 @@ test04() >VERIFY(std::ranges::cbegin(std::move(c)) == std::ranges::begin(c)); > } > > +void > +test05() > +{ > + // LWG 4027 - possibly-const-range should prefer returning const R& > + auto r = std::views::single(0) > +| std::views::transform([](int) { return 0; }); > + using C1 = decltype(std::ranges::cbegin(r)); > + using C1 = decltype(std::cbegin(r)); > + > + [] (auto x) { > +auto r = std::views::single(x) | std::views::lazy_split(0); > +static_assert( ! requires { (*std::ranges::cbegin(r)).front() = 42; }); > +static_assert( ! requires { (*std::cbegin(r)).front() = 42; }); > + }(0); > +} > + > int > main() > { >test01(); >test03(); >test04(); > + test05(); > } > diff --git a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > index 62c347be43d..f474ab7ec99 100644 > --- a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > +++ b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > @@ -34,20 +34,21 @@ test01() >{ > int i = 0; > int j = 0; > + > +#if __cpp_lib_ranges_as_const > +// These overloads mean that range and range are satisfied. > +const int* begin() const { throw; } > +const int* end() const { throw; } > +#endif > + > int* data() { return &j; } > const R* data() const noexcept { return nullptr; } >}; >static_assert( has_cdata ); >static_assert( has_cdata ); >R r; > -#if ! __cpp_lib_ran
Re: [RFA][PR tree-optimization/98028] Use relationship between operands to simplify SUB_OVERFLOW
On Sat, Feb 15, 2025 at 04:52:58PM -0700, Jeff Law wrote: > On 2/12/25 2:22 PM, Jakub Jelinek wrote: > > > > > Or just go with that even for GCC 15 (completely untested and dunno if > > something needs to be done about s = NULL passed to query or not) for > > now, with the advantage that it can do something even for the cases where > > type is not compatible with types of arguments, and perhaps add additional > > cases later? > I added the check for s being non-null and made a trivial fix to your patch > (IIRC you used "code" when it should have been "subcode"). > > Do you think there's still value in opening a new non-regression bug for > additional cases? I actually can't think of any further cases right now, so I think no need to open a PR, just commit the patch ;) Jakub
Re: [PATCH v2 4/5][STAGE1] btf: generate and output DECL_TAG and TYPE_TAG records
On 2/24/25 11:03, Indu Bhagat wrote: > On 2/6/25 11:54 AM, David Faust wrote: >> Support the btf_decl_tag and btf_type_tag attributes in BTF by creating >> and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records, >> respectively, for them. >> >> Some care is required when -gprune-btf is in effect to avoid emitting >> decl or type tags for declarations or types which have been pruned and >> will not be emitted in BTF. >> > > The patch itself looks OK to me. Two comments below. > >> gcc/ >> * btfout.cc (get_btf_kind): Handle DECL_TAG and TYPE_TAG kinds. >> (btf_calc_num_vbytes): Likewise. >> (btf_asm_type): Likewise. >> (output_asm_btf_vlen_bytes): Likewise. >> (output_btf_tags): New. >> (btf_output): Call it here. >> (btf_add_used_type): Replace with simple wrapper around... >> (btf_add_used_type_1): ...the implementation. Handle >> BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. >> (btf_add_vars): Update btf_add_used_type call. >> (btf_assign_tag_ids): New. >> (btf_mark_type_used): Update btf_add_used_type call. >> (btf_collect_pruned_types): Likewise. Handle type and decl tags. >> (btf_finish): Call btf_assign_tag_ids. >> >> gcc/testsuite/ >> * gcc.dg/debug/btf/btf-decl-tag-1.c: New test. >> * gcc.dg/debug/btf/btf-decl-tag-2.c: New test. >> * gcc.dg/debug/btf/btf-decl-tag-3.c: New test. >> * gcc.dg/debug/btf/btf-decl-tag-4.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-1.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-2.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-3.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-4.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-5.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-6.c: New test. >> * gcc.dg/debug/btf/btf-type-tag-c2x-1.c: New test. >> > > I think its worth adding a testcase in the BPF backend tests for testing > (no) interaction between CO-RE relocs with tags. E.g, when > __attribute__((preserve_access_index)) is used on a struct with type > tags, the type id in the CO-RE relocation records is the type and not > the type tag. Or other tests as you see fit. > > WDYT ? That is a very good idea. I will add a few tests for this. > >> include/ >> * btf.h (BTF_KIND_DECL_TAG, BTF_KIND_TYPE_TAG) New defines. >> (struct btf_decl_tag): New. >> --- >> gcc/btfout.cc | 171 +++--- >> .../gcc.dg/debug/btf/btf-decl-tag-1.c | 14 ++ >> .../gcc.dg/debug/btf/btf-decl-tag-2.c | 22 +++ >> .../gcc.dg/debug/btf/btf-decl-tag-3.c | 22 +++ >> .../gcc.dg/debug/btf/btf-decl-tag-4.c | 34 >> .../gcc.dg/debug/btf/btf-type-tag-1.c | 27 +++ >> .../gcc.dg/debug/btf/btf-type-tag-2.c | 15 ++ >> .../gcc.dg/debug/btf/btf-type-tag-3.c | 21 +++ >> .../gcc.dg/debug/btf/btf-type-tag-4.c | 25 +++ >> .../gcc.dg/debug/btf/btf-type-tag-5.c | 35 >> .../gcc.dg/debug/btf/btf-type-tag-6.c | 15 ++ >> .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c | 23 +++ >> include/btf.h | 14 ++ >> 13 files changed, 414 insertions(+), 24 deletions(-) >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c >> >> diff --git a/gcc/btfout.cc b/gcc/btfout.cc >> index ff7ea42a961..c00e0c98015 100644 >> --- a/gcc/btfout.cc >> +++ b/gcc/btfout.cc >> @@ -141,6 +141,8 @@ get_btf_kind (uint32_t ctf_kind) >> case CTF_K_VOLATILE: return BTF_KIND_VOLATILE; >> case CTF_K_CONST:return BTF_KIND_CONST; >> case CTF_K_RESTRICT: return BTF_KIND_RESTRICT; >> +case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG; >> +case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG; >> default:; >> } >> return BTF_KIND_UNKN; >> @@ -217,6 +219,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd) >> case BTF_KIND_CONST: >> case BTF_KIND_RESTRICT: >> case BTF_KIND_FUNC: >> +case BTF_KIND_TYPE_TAG: >> /* These kinds have no vlen data. */ >> break; >> >> @@ -256,6 +259,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd) >> vlen_bytes += vlen * sizeof (struct btf_var_secinfo); >> break; >> >> +case BTF_KIND_D
Re: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range [PR118083]
On Tue, 4 Feb 2025 at 22:27, Patrick Palka wrote: > > On Tue, 4 Feb 2025, Patrick Palka wrote: > > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > > 14 (after a long while)? Yes, and yes, thanks. > > > > -- >8 -- > > > > LWG 4027 effectively makes the const range access CPOs ranges::cfoo behave > > more consistently across C++23 and C++20 (pre-P2278R4) and more > > consistently with the std::cfoo range accessors, as the below testcase > > adjustments demonstrate (which mostly consist of reverting workarounds > > added by r14-3771-gf12e26f3496275 and r13-7186-g0d94c6df183375). > > > > In passing fix PR118083 which reports that the input_range constraint on > > possibly-const-range is missing in our implementation. A consequence of > > this is that the const range access CPOs now consistently require the > > type to be a range, and so in some our of tests we need to introduce > > otherwise unused begin/end members. > > ... now with the LWG 4027 testcases added (to cbegin.cc, so that it runs > in both C++20 and C++23 mode): > > -- >8 -- > > Subject: [PATCH] libstdc++: Implement LWG 4027 change to possibly-const-range > [PR118083] > > PR libstdc++/118083 > > libstdc++-v3/ChangeLog: > > * include/bits/ranges_base.h > (ranges::__access::__possibly_const_range): Adjust logic as per > LWG 4027. Add missing input_range constraint. > * testsuite/std/ranges/access/cbegin.cc (test05): Verify LWG > 4027 testcases. > * testsuite/std/ranges/access/cdata.cc: Adjust, simplify and > consolidate some tests after the above. > * testsuite/std/ranges/access/cend.cc: Likewise. > * testsuite/std/ranges/access/crbegin.cc: Likewise. > * testsuite/std/ranges/access/crend.cc: Likewise. > * testsuite/std/ranges/adaptors/join.cc: Likewise. > * testsuite/std/ranges/adaptors/take_while.cc: Likewise. > * testsuite/std/ranges/adaptors/transform.cc: Likewise. > --- > libstdc++-v3/include/bits/ranges_base.h | 4 +- > .../testsuite/std/ranges/access/cbegin.cc | 17 > .../testsuite/std/ranges/access/cdata.cc | 21 - > .../testsuite/std/ranges/access/cend.cc | 30 ++--- > .../testsuite/std/ranges/access/crbegin.cc| 43 +-- > .../testsuite/std/ranges/access/crend.cc | 20 + > .../testsuite/std/ranges/adaptors/join.cc | 8 ++-- > .../std/ranges/adaptors/take_while.cc | 2 - > .../std/ranges/adaptors/transform.cc | 4 -- > 9 files changed, 59 insertions(+), 90 deletions(-) > > diff --git a/libstdc++-v3/include/bits/ranges_base.h > b/libstdc++-v3/include/bits/ranges_base.h > index 4dcfbf66d51..28fe64a9e9d 100644 > --- a/libstdc++-v3/include/bits/ranges_base.h > +++ b/libstdc++-v3/include/bits/ranges_base.h > @@ -642,11 +642,11 @@ namespace ranges >namespace __access >{ > #if __glibcxx_ranges_as_const // >= C++23 > -template > +template >constexpr auto& >__possibly_const_range(_Range& __r) noexcept >{ > - if constexpr (constant_range && !constant_range<_Range>) > + if constexpr (input_range) > return const_cast(__r); > else > return __r; > diff --git a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > index 5423e782428..66b28f41877 100644 > --- a/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > +++ b/libstdc++-v3/testsuite/std/ranges/access/cbegin.cc > @@ -116,10 +116,27 @@ test04() >VERIFY(std::ranges::cbegin(std::move(c)) == std::ranges::begin(c)); > } > > +void > +test05() > +{ > + // LWG 4027 - possibly-const-range should prefer returning const R& > + auto r = std::views::single(0) > +| std::views::transform([](int) { return 0; }); > + using C1 = decltype(std::ranges::cbegin(r)); > + using C1 = decltype(std::cbegin(r)); > + > + [] (auto x) { > +auto r = std::views::single(x) | std::views::lazy_split(0); > +static_assert( ! requires { (*std::ranges::cbegin(r)).front() = 42; }); > +static_assert( ! requires { (*std::cbegin(r)).front() = 42; }); > + }(0); > +} > + > int > main() > { >test01(); >test03(); >test04(); > + test05(); > } > diff --git a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > index 62c347be43d..f474ab7ec99 100644 > --- a/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > +++ b/libstdc++-v3/testsuite/std/ranges/access/cdata.cc > @@ -34,20 +34,21 @@ test01() >{ > int i = 0; > int j = 0; > + > +#if __cpp_lib_ranges_as_const > +// These overloads mean that range and range are satisfied. > +const int* begin() const { throw; } > +const int* end() const { throw; } > +#endif > + > int* data() { return &j; } > const R* data() const noexcept { return nullptr; } >}; >static_assert( has_cdata ); >static_
Re: [PATCH v2 3/5][STAGE1] ctf: translate annotation DIEs to internal ctf
On 2/24/25 11:04, Indu Bhagat wrote: > On 2/6/25 11:54 AM, David Faust wrote: >> Translate DW_TAG_GNU_annotation DIEs created for C attributes >> btf_decl_tag and btf_type_tag into an in-memory representation in the >> CTF/BTF container. They will be output in BTF as BTF_KIND_DECL_TAG and >> BTF_KIND_TYPE_TAG records. >> >> The new CTF kinds used to represent these annotations, CTF_K_DECL_TAG >> and CTF_K_TYPE_TAG, are expected to be formalized in the next version of >> the CTF specification. For now they only exist in memory as a >> translation step to BTF, and are not emitted when generating CTF >> information. >> > > The patch overall looks OK to me. Some comments inlined below. > >> gcc/ >> * ctfc.cc (ctf_dtu_d_union_selector): Handle CTF_K_DECL_TAG and >> CTF_K_TYPE_TAG. >> (ctf_add_type_tag, ctf_add_decl_tag): New. >> (ctf_add_variable): Return the new ctf_dvdef_ref rather than zero. >> (new_ctf_container): Initialize new members. >> (ctfc_delete_container): Deallocate new members. >> * ctfc.h (ctf_dvdef, ctf_dvdef_t, ctf_dvdef_ref): Move forward >> declarations earlier in file. >> (ctf_decl_tag_t): New typedef. >> (ctf_dtdef): Add ctf_decl_tag_t member to dtd_u union. >> (ctf_dtu_d_union_enum): Add new CTF_DTU_D_TAG enumerator. >> (ctf_container): Add ctfc_tags vector and ctfc_type_tags_map hash_map >> members. >> (ctf_add_type_tag, ctf_add_decl_tag): New function protos. >> (ctf_add_variable): Change prototype return type to ctf_dvdef_ref. >> * dwarf2ctf.cc (gen_ctf_type_tags, gen_ctf_decl_tags) >> (gen_ctf_decl_tags_for_var): New static functions. >> (gen_ctf_sou_type): Handle decl tags. >> (gen_ctf_function_type): Likewise. >> (gen_ctf_variable): Likewise. >> (gen_ctf_function): Likewise. >> (gen_ctf_type): Handle type tags. >> >> gcc/testsuite >> * gcc.dg/debug/ctf/ctf-decl-tag-1.c: New test. >> * gcc.dg/debug/ctf/ctf-type-tag-1.c: New test. >> >> include/ >> * ctf.h (CTF_K_DECL_TAG, CTF_K_TYPE_TAG): New defines. >> --- >> gcc/ctfc.cc | 70 +++- >> gcc/ctfc.h| 41 - >> gcc/dwarf2ctf.cc | 152 +- >> .../gcc.dg/debug/ctf/ctf-decl-tag-1.c | 31 >> .../gcc.dg/debug/ctf/ctf-type-tag-1.c | 19 +++ >> include/ctf.h | 4 + >> 6 files changed, 303 insertions(+), 14 deletions(-) >> create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c >> >> diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc >> index 51511d69baa..c478fa0a136 100644 >> --- a/gcc/ctfc.cc >> +++ b/gcc/ctfc.cc >> @@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype) >> return CTF_DTU_D_ARGUMENTS; >> case CTF_K_SLICE: >> return CTF_DTU_D_SLICE; >> +case CTF_K_DECL_TAG: >> +case CTF_K_TYPE_TAG: >> + return CTF_DTU_D_TAG; >> default: >> /* The largest member as default. */ >> return CTF_DTU_D_ARRAY; >> @@ -445,6 +448,58 @@ ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, >> ctf_dtdef_ref ref, >> return dtd; >> } >> >> +ctf_dtdef_ref >> +ctf_add_type_tag (ctf_container_ref ctfc, uint32_t flag, const char *value, >> + ctf_dtdef_ref ref_dtd) >> +{ >> + ctf_dtdef_ref dtd; >> + /* Create a DTD for the tag, but do not place it in the regular types >> list; >> + CTF format does not (yet) encode tags. */ >> + dtd = ggc_cleared_alloc (); >> + >> + dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name), >> + CTF_AUX_STRTAB); >> + /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs, >> + e.g. when multiple distinct types specify the same type tag. We will >> + synthesize multiple CTF DTD records in that case, so we cannot tie them >> + all to the same key (the DW_TAG_GNU_annotation DIE) in ctfc_types. */ >> + dtd->dtd_key = NULL; >> + dtd->ref_type = ref_dtd; >> + dtd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_TYPE_TAG, flag, 0); >> + dtd->dtd_u.dtu_tag.ref_var = NULL; /* Not used for type tags. */ >> + dtd->dtd_u.dtu_tag.component_idx = 0; /* Not used for type tags. */ >> + >> + /* Insert tag directly into the tag list. Type ID will be assigned >> later. */ >> + vec_safe_push (ctfc->ctfc_tags, dtd); >> + return dtd; >> +} >> + >> +ctf_dtdef_ref >> +ctf_add_decl_tag (ctf_container_ref ctfc, uint32_t flag, const char *value, >> + ctf_dtdef_ref ref_dtd, uint32_t comp_idx) >> +{ >> + ctf_dtdef_ref dtd; >> + /* Create a DTD for the tag, but do not place it in the regular types >> list; >> + ctf format does not (yet) encode tags. */ >> + dtd = ggc_cleared_alloc (); >> + >> + dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_
Re: The COBOL front end, version 3, now in 14 easy pieces
On Mon, 24 Feb 2025 14:51:27 +0100 Richard Biener wrote: > Installing the result via make install DESTDIR=/foo I see both a > 'gcobol' and a 'gcobc' program > being installed - is that intentional? Yes, that is intentional. gcobol is the compiler driver, as you know. gcobc is a shell script that emulates GnuCOBOL's cobc executable. It is being used to facilitate compatibility with GnuCOBOL. I see it's not documented. There is quite extensive documentation for it in a directory we chose not to supply among our patches, for simplicity. I will look into adapting that page, gcc/cobol/scripts/rocky8/migration-guide.1 to something more specific to gcobc. > I also see the gcobol.3 manpage reside directly in /foo/gcobol.3 > rather than in the expected /foo/usr/local/man/.. There may be something missing somewhere higher up. I had to add the man3 directory to the install machinery. I will investigate. > sources might reside on a R/O mounted volume working on it. Perhaps tomorrow. --jkl
Re: [PATCH v2 3/5][STAGE1] ctf: translate annotation DIEs to internal ctf
On 2/6/25 11:54 AM, David Faust wrote: Translate DW_TAG_GNU_annotation DIEs created for C attributes btf_decl_tag and btf_type_tag into an in-memory representation in the CTF/BTF container. They will be output in BTF as BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records. The new CTF kinds used to represent these annotations, CTF_K_DECL_TAG and CTF_K_TYPE_TAG, are expected to be formalized in the next version of the CTF specification. For now they only exist in memory as a translation step to BTF, and are not emitted when generating CTF information. The patch overall looks OK to me. Some comments inlined below. gcc/ * ctfc.cc (ctf_dtu_d_union_selector): Handle CTF_K_DECL_TAG and CTF_K_TYPE_TAG. (ctf_add_type_tag, ctf_add_decl_tag): New. (ctf_add_variable): Return the new ctf_dvdef_ref rather than zero. (new_ctf_container): Initialize new members. (ctfc_delete_container): Deallocate new members. * ctfc.h (ctf_dvdef, ctf_dvdef_t, ctf_dvdef_ref): Move forward declarations earlier in file. (ctf_decl_tag_t): New typedef. (ctf_dtdef): Add ctf_decl_tag_t member to dtd_u union. (ctf_dtu_d_union_enum): Add new CTF_DTU_D_TAG enumerator. (ctf_container): Add ctfc_tags vector and ctfc_type_tags_map hash_map members. (ctf_add_type_tag, ctf_add_decl_tag): New function protos. (ctf_add_variable): Change prototype return type to ctf_dvdef_ref. * dwarf2ctf.cc (gen_ctf_type_tags, gen_ctf_decl_tags) (gen_ctf_decl_tags_for_var): New static functions. (gen_ctf_sou_type): Handle decl tags. (gen_ctf_function_type): Likewise. (gen_ctf_variable): Likewise. (gen_ctf_function): Likewise. (gen_ctf_type): Handle type tags. gcc/testsuite * gcc.dg/debug/ctf/ctf-decl-tag-1.c: New test. * gcc.dg/debug/ctf/ctf-type-tag-1.c: New test. include/ * ctf.h (CTF_K_DECL_TAG, CTF_K_TYPE_TAG): New defines. --- gcc/ctfc.cc | 70 +++- gcc/ctfc.h| 41 - gcc/dwarf2ctf.cc | 152 +- .../gcc.dg/debug/ctf/ctf-decl-tag-1.c | 31 .../gcc.dg/debug/ctf/ctf-type-tag-1.c | 19 +++ include/ctf.h | 4 + 6 files changed, 303 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-decl-tag-1.c create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-type-tag-1.c diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc index 51511d69baa..c478fa0a136 100644 --- a/gcc/ctfc.cc +++ b/gcc/ctfc.cc @@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype) return CTF_DTU_D_ARGUMENTS; case CTF_K_SLICE: return CTF_DTU_D_SLICE; +case CTF_K_DECL_TAG: +case CTF_K_TYPE_TAG: + return CTF_DTU_D_TAG; default: /* The largest member as default. */ return CTF_DTU_D_ARRAY; @@ -445,6 +448,58 @@ ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, ctf_dtdef_ref ref, return dtd; } +ctf_dtdef_ref +ctf_add_type_tag (ctf_container_ref ctfc, uint32_t flag, const char *value, + ctf_dtdef_ref ref_dtd) +{ + ctf_dtdef_ref dtd; + /* Create a DTD for the tag, but do not place it in the regular types list; + CTF format does not (yet) encode tags. */ + dtd = ggc_cleared_alloc (); + + dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name), + CTF_AUX_STRTAB); + /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs, + e.g. when multiple distinct types specify the same type tag. We will + synthesize multiple CTF DTD records in that case, so we cannot tie them + all to the same key (the DW_TAG_GNU_annotation DIE) in ctfc_types. */ + dtd->dtd_key = NULL; + dtd->ref_type = ref_dtd; + dtd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_TYPE_TAG, flag, 0); + dtd->dtd_u.dtu_tag.ref_var = NULL; /* Not used for type tags. */ + dtd->dtd_u.dtu_tag.component_idx = 0; /* Not used for type tags. */ + + /* Insert tag directly into the tag list. Type ID will be assigned later. */ + vec_safe_push (ctfc->ctfc_tags, dtd); + return dtd; +} + +ctf_dtdef_ref +ctf_add_decl_tag (ctf_container_ref ctfc, uint32_t flag, const char *value, + ctf_dtdef_ref ref_dtd, uint32_t comp_idx) +{ + ctf_dtdef_ref dtd; + /* Create a DTD for the tag, but do not place it in the regular types list; + ctf format does not (yet) encode tags. */ + dtd = ggc_cleared_alloc (); + + dtd->dtd_name = ctf_add_string (ctfc, value, &(dtd->dtd_data.ctti_name), + CTF_AUX_STRTAB); + /* A single DW_TAG_GNU_annotation DIE may be referenced by multiple DIEs, + e.g. when multiple distinct declarations specify the same decl tag. + We will synthesize multiple CTF DTD records in that case, so we cannot
Re: [PATCH v2 4/5][STAGE1] btf: generate and output DECL_TAG and TYPE_TAG records
On 2/6/25 11:54 AM, David Faust wrote: Support the btf_decl_tag and btf_type_tag attributes in BTF by creating and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records, respectively, for them. Some care is required when -gprune-btf is in effect to avoid emitting decl or type tags for declarations or types which have been pruned and will not be emitted in BTF. The patch itself looks OK to me. Two comments below. gcc/ * btfout.cc (get_btf_kind): Handle DECL_TAG and TYPE_TAG kinds. (btf_calc_num_vbytes): Likewise. (btf_asm_type): Likewise. (output_asm_btf_vlen_bytes): Likewise. (output_btf_tags): New. (btf_output): Call it here. (btf_add_used_type): Replace with simple wrapper around... (btf_add_used_type_1): ...the implementation. Handle BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. (btf_add_vars): Update btf_add_used_type call. (btf_assign_tag_ids): New. (btf_mark_type_used): Update btf_add_used_type call. (btf_collect_pruned_types): Likewise. Handle type and decl tags. (btf_finish): Call btf_assign_tag_ids. gcc/testsuite/ * gcc.dg/debug/btf/btf-decl-tag-1.c: New test. * gcc.dg/debug/btf/btf-decl-tag-2.c: New test. * gcc.dg/debug/btf/btf-decl-tag-3.c: New test. * gcc.dg/debug/btf/btf-decl-tag-4.c: New test. * gcc.dg/debug/btf/btf-type-tag-1.c: New test. * gcc.dg/debug/btf/btf-type-tag-2.c: New test. * gcc.dg/debug/btf/btf-type-tag-3.c: New test. * gcc.dg/debug/btf/btf-type-tag-4.c: New test. * gcc.dg/debug/btf/btf-type-tag-5.c: New test. * gcc.dg/debug/btf/btf-type-tag-6.c: New test. * gcc.dg/debug/btf/btf-type-tag-c2x-1.c: New test. I think its worth adding a testcase in the BPF backend tests for testing (no) interaction between CO-RE relocs with tags. E.g, when __attribute__((preserve_access_index)) is used on a struct with type tags, the type id in the CO-RE relocation records is the type and not the type tag. Or other tests as you see fit. WDYT ? include/ * btf.h (BTF_KIND_DECL_TAG, BTF_KIND_TYPE_TAG) New defines. (struct btf_decl_tag): New. --- gcc/btfout.cc | 171 +++--- .../gcc.dg/debug/btf/btf-decl-tag-1.c | 14 ++ .../gcc.dg/debug/btf/btf-decl-tag-2.c | 22 +++ .../gcc.dg/debug/btf/btf-decl-tag-3.c | 22 +++ .../gcc.dg/debug/btf/btf-decl-tag-4.c | 34 .../gcc.dg/debug/btf/btf-type-tag-1.c | 27 +++ .../gcc.dg/debug/btf/btf-type-tag-2.c | 15 ++ .../gcc.dg/debug/btf/btf-type-tag-3.c | 21 +++ .../gcc.dg/debug/btf/btf-type-tag-4.c | 25 +++ .../gcc.dg/debug/btf/btf-type-tag-5.c | 35 .../gcc.dg/debug/btf/btf-type-tag-6.c | 15 ++ .../gcc.dg/debug/btf/btf-type-tag-c2x-1.c | 23 +++ include/btf.h | 14 ++ 13 files changed, 414 insertions(+), 24 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-1.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-2.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-3.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decl-tag-4.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-1.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-2.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-3.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-4.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-5.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-6.c create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-type-tag-c2x-1.c diff --git a/gcc/btfout.cc b/gcc/btfout.cc index ff7ea42a961..c00e0c98015 100644 --- a/gcc/btfout.cc +++ b/gcc/btfout.cc @@ -141,6 +141,8 @@ get_btf_kind (uint32_t ctf_kind) case CTF_K_VOLATILE: return BTF_KIND_VOLATILE; case CTF_K_CONST:return BTF_KIND_CONST; case CTF_K_RESTRICT: return BTF_KIND_RESTRICT; +case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG; +case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG; default:; } return BTF_KIND_UNKN; @@ -217,6 +219,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd) case BTF_KIND_CONST: case BTF_KIND_RESTRICT: case BTF_KIND_FUNC: +case BTF_KIND_TYPE_TAG: /* These kinds have no vlen data. */ break; @@ -256,6 +259,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd) vlen_bytes += vlen * sizeof (struct btf_var_secinfo); break; +case BTF_KIND_DECL_TAG: + vlen_bytes += sizeof (struct btf_decl_tag); + break; + default: break; } @@ -452,6 +459,20 @@ btf_asm_type (ctf_dtdef_ref dtd) and should write 0. */ dw2_asm_output_data (4, 0, "(unused)"); return; +case BTF_KIND_DECL_TAG: + { +
Re: [PATCH] [testsuite] adjust expectations of x86 vect-simd-clone tests
On Feb 13, 2025, at 1:51 AM, Alexandre Oliva wrote: > > Some vect-simd-clone tests fail when targeting ancient x86 variants, > because the expected transformations only take place with -msse4 or > higher. > > So arrange for these tests to take an -msse4 option on x86, so that > the expected vectorization takes place, but decay to a compile test if > vect.exp would enable execution but the target doesn't have an sse4 > runtime. This requires the new dg-do-if to override the action on a > target while retaining the default action on others, instead of > disabling the test. > > We can count on avx512f compile-time support for these tests, because > vect_simd_clones requires that on x86, and that implies sse4 support, > so we need not complicate the scan conditionals with tests for sse4, > except on the last test. > > Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting > x86_64-elf, targeting a cpu without sse4 support. Ok to install? Ok. If you or others notice needed follow on work, ok for that as it should be easy enough to figure out once the bulk is in. I thought I saw one more needing review. I accidentally trashed the message while catching up and went to find it via name and no other comments. I didn't find it. Feel free to reping it.
Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes
On 2/24/25 9:43 AM, Michael Matz wrote: It depends, but even if that were no problem in some specific cases (perhaps by ensuring that such sequence isn't intermingled with insns that are in different EH regions) that doesn't seem to be what the proposal is doing? From the description it moves a EH note from a load (where it presumably was for a reason) that occurs after a store to something that so happens to be the (new) last instruction of a bb. Because otherwise the load couldn't be moved earlier into the middle of a BB. In essence the load would lose the EH note and a new EH note would be generated for a different insn out of thin air. I don't see how that can ever be a correct transformation in general. What _if_ the very load traps then? What if the new insn cannot trap? (Note that loads and stores may trap differently even on the same address, e.g. write to read-only mem, or an misaligned load but an aligned (or smaller-sized) write; though I'll say that GCC assumes that loads cannot trap if shadowed by an equivalent write, in some other passes) Good points. The alignment case was just starting to rattle around in my head.It may simply be the case we can't do much here. jeff
[PATCH] RISC-V: Avoid updating vl right before branching on avl
See [1] thread for original patch which spawned this one. We are currently seeing the following code where we perform a vsetvl before a branching instruction against the avl. vsetvli a5,a1,e32,m1,tu,ma vle32.v v2,0(a0) sub a1,a1,a5 <-- a1 potentially set to 0 sh2add a0,a5,a0 vfmacc.vv v1,v2,v2 vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0 beq a1,zero,.L12 <-- check if avl is 0 Since we are branching off of the avl, we don't need to update vl until after the branch is taken. Search the ready queue for vsetvls scheduled before branching instructions that branch off of the same regno and promote the branches to execute first. This can improve performancy by potentially avoiding setting VL=0 which may be expensive on some uarches. [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html PR/117974 gcc/ChangeLog: * config/riscv/riscv.cc (vsetvl_avl_regno): New helper function. (insn_increases_zeroness_p): Ditto. (riscv_promote_ready): Ditto. (riscv_sched_reorder): Implement hook. (TARGET_SCHED_REORDER): Define Hook. * config/riscv/riscv.opt: New flag. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr117974.c: New test. Signed-off-by: Edwin Lu Co-authored-by: Palmer Dabbelt --- gcc/config/riscv/riscv.cc | 103 ++ gcc/config/riscv/riscv.opt| 4 + .../gcc.target/riscv/rvv/vsetvl/pr117974.c| 15 +++ 3 files changed, 122 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr117974.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 89aa25d5da9..cf0866fa3fb 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -14035,6 +14035,106 @@ bool need_shadow_stack_push_pop_p () return is_zicfiss_p () && riscv_save_return_addr_reg_p (); } +static int +vsetvl_avl_regno(rtx_insn *insn) +{ + if (recog_memoized (insn) < 0) +return -1; + + if (get_attr_type (insn) != TYPE_VSETVL + && get_attr_type (insn) != TYPE_VSETVL_PRE) +return -1; + + extract_insn (insn); + /* From vector.md, vsetvl operands are as follows: + ;; operands[0]: VL. + ;; operands[1]: AVL. + ;; operands[2]: SEW + ;; operands[3]: LMUL + ;; operands[4]: Tail policy 0 or 1 (undisturbed/agnostic) + ;; operands[5]: Mask policy 0 or 1 (undisturbed/agnostic) + Return regno of avl operand. */ + return REGNO (recog_data.operand[1]); +} + +static bool +insn_increases_zeroness_p(rtx_insn *insn, int regno) +{ + /* Check for branching against zero. */ + if (JUMP_P (insn)) +{ + extract_insn (insn); + bool match_reg = false; + bool comp_zero = false; + for (int i = 0; i < recog_data.n_operands; i++) + { + if (REG_P (recog_data.operand[i]) + && REGNO (recog_data.operand[i]) == regno) + match_reg = true; + if (CONST_INT_P (recog_data.operand[i]) + && XINT (recog_data.operand[i], 0) == 0 + && XWINT (recog_data.operand[i], 0) == 0) + comp_zero = true; + } + return match_reg && comp_zero; +} + return false; +} + +/* Copied from MIPS. Removes the instruction at index LOWER from ready + queue READY and reinserts it in from of the instruction at index + HIGHER. LOWER must be <= HIGHER. */ +static void +riscv_promote_ready (rtx_insn **ready, int lower, int higher) +{ + rtx_insn *new_head; + int i; + + new_head = ready[lower]; + for (i = lower; i < higher; i++) +ready[i] = ready[i + 1]; + ready[i] = new_head; +} + +/* Attempt to avoid issuing VSETVL-type instructions before a branch that + ensures they are non-zero, as setting VL=0 dynamically can be slow. */ +static int +riscv_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED, +rtx_insn **ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED) +{ +if (! TARGET_AVOID_VL_EQ_0) + return riscv_issue_rate (); + +for (int i = *nreadyp - 1; i >= 0; i--) + { + /* Find the vsetvl. */ + int avl_regno = vsetvl_avl_regno (ready[i]); + if (avl_regno == -1 || i == 0) + continue; + for (int j = i - 1; j >= 0; j--) + { + /* Exit if another vsetvl is found before finding a branch insn + in the ready queue. */ + if (recog_memoized (ready[j]) >= 0 + && get_attr_type (ready[j]) == TYPE_VSETVL + && get_attr_type (ready[j]) == TYPE_VSETVL_PRE) + break; + /* Find branch. */ + if (recog_memoized (ready[j]) >= 0 + && insn_increases_zeroness_p (ready[j], avl_regno)) + { + /* Right now the only zeroness-increasing pattern we recognize + is a branch-not-zero, so there's no sense in looking for any +
Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl
On 2/24/2025 4:34 PM, Jeff Law wrote: On 2/24/25 5:07 PM, Edwin Lu wrote: See [1] thread for original patch which spawned this one. We are currently seeing the following code where we perform a vsetvl before a branching instruction against the avl. vsetvli a5,a1,e32,m1,tu,ma vle32.v v2,0(a0) sub a1,a1,a5 <-- a1 potentially set to 0 sh2add a0,a5,a0 vfmacc.vv v1,v2,v2 vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0 beq a1,zero,.L12 <-- check if avl is 0 Since we are branching off of the avl, we don't need to update vl until after the branch is taken. Search the ready queue for vsetvls scheduled before branching instructions that branch off of the same regno and promote the branches to execute first. This can improve performancy by potentially avoiding setting VL=0 which may be expensive on some uarches. [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html PR/117974 gcc/ChangeLog: * config/riscv/riscv.cc (vsetvl_avl_regno): New helper function. (insn_increases_zeroness_p): Ditto. (riscv_promote_ready): Ditto. (riscv_sched_reorder): Implement hook. (TARGET_SCHED_REORDER): Define Hook. * config/riscv/riscv.opt: New flag. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr117974.c: New test. So I preferred the earlier approach of disabling speculation of the vsetvls, though I'm guessing you're looking at this approach because that was insufficient? I don't think that it was because it was insufficient, but that it might be too constraining. I'm not exactly certain if there is the possibility where speculatively issuing some vsetvls in some situations could improve perf. I think with this patch, it targets the specific issue we're facing directly which is updating vl=0 when it's unnecessary. I don't know which patch would be better so I'm putting this out there as a potential alternative. Regardless, like the disabling of speculation I tend to think this should be controlled by an entry in the tuning structure rather than flags. Flags are fine for internal testing, but I don't see a compelling need for a flag to control this in general. Jeff That makes sense to me. I completely forgot that we had talked about putting it into the tuning structure when writing this. I'll update the tunes respectively in a later version of the patches depending on which one moves forward. Edwin
Contents of PO file 'cpplib-15-b20250216.ro.po'
cpplib-15-b20250216.ro.po.gz Description: Binary data The Translation Project robot, in the name of your translation coordinator.
New Romanian PO file for 'cpplib' (version 15-b20250216)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'cpplib' has been submitted by the Romanian team of translators. The file is available at: https://translationproject.org/latest/cpplib/ro.po (This file, 'cpplib-15-b20250216.ro.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/cpplib/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/cpplib.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl
On 2/24/25 6:10 PM, Edwin Lu wrote: So I preferred the earlier approach of disabling speculation of the vsetvls, though I'm guessing you're looking at this approach because that was insufficient? I don't think that it was because it was insufficient, but that it might be too constraining. I'm not exactly certain if there is the possibility where speculatively issuing some vsetvls in some situations could improve perf. On an out of order core I would expect the hardware to do the speculation and doing it in the compiler is of marginal value, if any. I think with this patch, it targets the specific issue we're facing directly which is updating vl=0 when it's unnecessary. I don't know which patch would be better so I'm putting this out there as a potential alternative. I suspect (but haven't verified) that the only time you'll see the conditional branch and the vsetvl in the ready queue together is in the speculative case. Now it may be the case that you can be more targeted in inhibiting speculation, but again, for an out of order core I don't think it's actually going to matter. Essentially when we aren't speculating the conditional branch will be dependent on all the instructions in the block. So the conditional branch won't go into the ready queue until after everything else has issued. Given we're not planning to move forward until stage1 reopens, you've got time to play with it internally. I've also got a mental todo to check with our designers to see if we have any sensitivity to this issue. jeff
Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl
On 2/24/25 5:07 PM, Edwin Lu wrote: See [1] thread for original patch which spawned this one. We are currently seeing the following code where we perform a vsetvl before a branching instruction against the avl. vsetvli a5,a1,e32,m1,tu,ma vle32.v v2,0(a0) sub a1,a1,a5 <-- a1 potentially set to 0 sh2add a0,a5,a0 vfmacc.vv v1,v2,v2 vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0 beq a1,zero,.L12 <-- check if avl is 0 Since we are branching off of the avl, we don't need to update vl until after the branch is taken. Search the ready queue for vsetvls scheduled before branching instructions that branch off of the same regno and promote the branches to execute first. This can improve performancy by potentially avoiding setting VL=0 which may be expensive on some uarches. [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html PR/117974 gcc/ChangeLog: * config/riscv/riscv.cc (vsetvl_avl_regno): New helper function. (insn_increases_zeroness_p): Ditto. (riscv_promote_ready): Ditto. (riscv_sched_reorder): Implement hook. (TARGET_SCHED_REORDER): Define Hook. * config/riscv/riscv.opt: New flag. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr117974.c: New test. So I preferred the earlier approach of disabling speculation of the vsetvls, though I'm guessing you're looking at this approach because that was insufficient? Regardless, like the disabling of speculation I tend to think this should be controlled by an entry in the tuning structure rather than flags. Flags are fine for internal testing, but I don't see a compelling need for a flag to control this in general. Jeff
Re: [PATCH] RISC-V: Avoid updating vl right before branching on avl
On 2/24/25 16:07, Edwin Lu wrote: > See [1] thread for original patch which spawned this one. > > We are currently seeing the following code where we perform a vsetvl > before a branching instruction against the avl. > > vsetvli a5,a1,e32,m1,tu,ma > vle32.v v2,0(a0) > sub a1,a1,a5 <-- a1 potentially set to 0 > sh2add a0,a5,a0 > vfmacc.vv v1,v2,v2 > vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0 > beq a1,zero,.L12 <-- check if avl is 0 > > Since we are branching off of the avl, we don't need to update vl until > after the branch is taken. Search the ready queue for vsetvls scheduled > before branching instructions that branch off of the same regno and > promote the branches to execute first. This can improve performancy by > potentially avoiding setting VL=0 which may be expensive on some uarches. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675622.html > > PR/117974 > > gcc/ChangeLog: > > * config/riscv/riscv.cc (vsetvl_avl_regno): New helper function. > (insn_increases_zeroness_p): Ditto. > (riscv_promote_ready): Ditto. > (riscv_sched_reorder): Implement hook. > (TARGET_SCHED_REORDER): Define Hook. > * config/riscv/riscv.opt: New flag. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/vsetvl/pr117974.c: New test. > > Signed-off-by: Edwin Lu > Co-authored-by: Palmer Dabbelt > --- > gcc/config/riscv/riscv.cc | 103 ++ > gcc/config/riscv/riscv.opt| 4 + > .../gcc.target/riscv/rvv/vsetvl/pr117974.c| 15 +++ > 3 files changed, 122 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr117974.c > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 89aa25d5da9..cf0866fa3fb 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -14035,6 +14035,106 @@ bool need_shadow_stack_push_pop_p () >return is_zicfiss_p () && riscv_save_return_addr_reg_p (); > } > > +static int > +vsetvl_avl_regno(rtx_insn *insn) > +{ > + if (recog_memoized (insn) < 0) > +return -1; > + > + if (get_attr_type (insn) != TYPE_VSETVL > + && get_attr_type (insn) != TYPE_VSETVL_PRE) > +return -1; > + > + extract_insn (insn); > + /* From vector.md, vsetvl operands are as follows: > + ;; operands[0]: VL. > + ;; operands[1]: AVL. > + ;; operands[2]: SEW > + ;; operands[3]: LMUL > + ;; operands[4]: Tail policy 0 or 1 (undisturbed/agnostic) > + ;; operands[5]: Mask policy 0 or 1 (undisturbed/agnostic) > + Return regno of avl operand. */ > + return REGNO (recog_data.operand[1]); > +} > + > +static bool > +insn_increases_zeroness_p(rtx_insn *insn, int regno) Mnir point, but this is just one case of introducing VL=0 zeroness specific to branch, not the general case. > +{ > + /* Check for branching against zero. */ > + if (JUMP_P (insn)) > +{ > + extract_insn (insn); > + bool match_reg = false; > + bool comp_zero = false; > + for (int i = 0; i < recog_data.n_operands; i++) > + { > + if (REG_P (recog_data.operand[i]) > + && REGNO (recog_data.operand[i]) == regno) > + match_reg = true; > + if (CONST_INT_P (recog_data.operand[i]) > + && XINT (recog_data.operand[i], 0) == 0 > + && XWINT (recog_data.operand[i], 0) == 0) > + comp_zero = true; > + } > + return match_reg && comp_zero; > +} > + return false; > +} > + > +/* Copied from MIPS. Removes the instruction at index LOWER from ready > + queue READY and reinserts it in from of the instruction at index > + HIGHER. LOWER must be <= HIGHER. */ > +static void > +riscv_promote_ready (rtx_insn **ready, int lower, int higher) > +{ > + rtx_insn *new_head; > + int i; > + > + new_head = ready[lower]; > + for (i = lower; i < higher; i++) > +ready[i] = ready[i + 1]; > + ready[i] = new_head; > +} > + > +/* Attempt to avoid issuing VSETVL-type instructions before a branch that > + ensures they are non-zero, as setting VL=0 dynamically can be slow. */ > +static int > +riscv_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose > ATTRIBUTE_UNUSED, > + rtx_insn **ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED) > +{ > +if (! TARGET_AVOID_VL_EQ_0) > + return riscv_issue_rate (); > + > +for (int i = *nreadyp - 1; i >= 0; i--) > + { > + /* Find the vsetvl. */ > + int avl_regno = vsetvl_avl_regno (ready[i]); > + if (avl_regno == -1 || i == 0) > + continue; > + for (int j = i - 1; j >= 0; j--) > + { > + /* Exit if another vsetvl is found before finding a branch insn > +in the ready queue. */ > + if (recog_memoized (ready[j]) >= 0 > + && get_attr_type (ready[j]) == TYPE_VSETVL > + && get_attr_type (ready[j]) == TYPE_VSETVL_PRE) > +
Reply
Hello, Is this email address still the most reliable way to get in touch with you? *Farhan Faruqui*
Re: [PATCH] c++, v2: Fix range for with PMFs [PR118923]
On 2/24/25 10:01 AM, Jakub Jelinek wrote: On Mon, Feb 24, 2025 at 08:29:31AM -0500, Jason Merrill wrote: On 2/24/25 5:52 AM, Jakub Jelinek wrote: The following testcases segfault because the new range for -frange-for-ext-temps temporary extension extends even the internal TARGET_EXPRs created by get_member_function_from_ptrfunc. The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P. I'm not calling get_internal_target_expr because I really want to force the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR internal, Hmm, good point, but that seems like a reason to use force_target_expr in gen_internal_target_expr. and get_internal_target_expr doesn't take complain argument either and always uses tf_warning_or_error. That shouldn't be an issue for any internal temporaries; complain is only relevant for building the cleanup that internal temps must not have. It looks like there are a few more internal uses of force_target_expr in cp_finish_decl and build_comparison_op. So like this if it passes full bootstrap/regtest? So far passed make check-g++ in gcc (98,11,14,17,20,23,26) and make check in libstdc++-v3. OK. 2025-02-24 Jakub Jelinek PR c++/118923 * tree.cc (get_internal_target_expr): Use force_target_expr instead of build_target_expr_with_type. * typeck.cc (get_member_function_from_ptrfunc): Use get_internal_target_expr instead of force_target_expr. * decl.cc (cp_finish_decl): Likewise. * method.cc (build_comparison_op): Likewise. * g++.dg/cpp0x/pr118923.C: New test. * g++.dg/cpp1y/pr118923.C: New test. --- gcc/cp/tree.cc.jj 2025-02-24 00:06:25.776732504 +0100 +++ gcc/cp/tree.cc 2025-02-24 15:04:28.991381102 +0100 @@ -982,8 +982,7 @@ tree get_internal_target_expr (tree init) { init = convert_bitfield_to_declared_type (init); - tree t = build_target_expr_with_type (init, TREE_TYPE (init), - tf_warning_or_error); + tree t = force_target_expr (TREE_TYPE (init), init, tf_warning_or_error); TARGET_EXPR_INTERNAL_P (t) = true; return t; } --- gcc/cp/typeck.cc.jj 2025-01-23 11:17:39.651880614 +0100 +++ gcc/cp/typeck.cc2025-02-24 15:05:27.817562229 +0100 @@ -4219,16 +4219,14 @@ get_member_function_from_ptrfunc (tree * && !DECL_P (instance_ptr) && !TREE_CONSTANT (instance_ptr))) instance_ptr = instance_save_expr - = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr, - complain); + = get_internal_target_expr (instance_ptr); /* See above comment. */ if (TREE_SIDE_EFFECTS (function) || (!nonvirtual && !DECL_P (function) && !TREE_CONSTANT (function))) - function - = force_target_expr (TREE_TYPE (function), function, complain); + function = get_internal_target_expr (function); /* Start by extracting all the information from the PMF itself. */ e3 = pfn_from_ptrmemfunc (function); --- gcc/cp/decl.cc.jj 2025-02-24 00:06:25.665734038 +0100 +++ gcc/cp/decl.cc 2025-02-24 15:08:52.027729921 +0100 @@ -9377,8 +9377,7 @@ cp_finish_decl (tree decl, tree init, bo tree guard = NULL_TREE; if (cleanups || cleanup) { - guard = force_target_expr (boolean_type_node, -boolean_false_node, tf_none); + guard = get_internal_target_expr (boolean_false_node); add_stmt (guard); guard = TARGET_EXPR_SLOT (guard); } @@ -9407,8 +9406,7 @@ cp_finish_decl (tree decl, tree init, bo popped that all, so push those extra cleanups around the whole sequence with a guard variable. */ gcc_assert (TREE_CODE (sl) == STATEMENT_LIST); - guard = force_target_expr (integer_type_node, -integer_zero_node, tf_none); + guard = get_internal_target_expr (integer_zero_node); add_stmt (guard); guard = TARGET_EXPR_SLOT (guard); for (unsigned i = 0; i < n_extra_cleanups; ++i) --- gcc/cp/method.cc.jj 2025-01-15 08:59:46.383217835 +0100 +++ gcc/cp/method.cc2025-02-24 15:10:18.674537705 +0100 @@ -1597,7 +1597,7 @@ build_comparison_op (tree fndecl, bool d /* Some other array, will need runtime loop. */ else { - idx = force_target_expr (sizetype, maxval, complain); + idx = get_internal_target_expr (maxval); loop_indexes = tree_cons (idx, NULL_TREE, loop_indexes); } expr_type = TREE_TYPE (expr_type); --- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 15:02:04.966385945 +0100 +++ gcc/testsuite/g++.dg/c
Re: The COBOL front end, version 3, now in 14 easy pieces
On Mon, 24 Feb 2025 14:51:27 +0100 Richard Biener wrote: > On Wed, Feb 19, 2025 at 12:38?AM James K. Lowden > wrote: > > > > The following 14 patches constitute 105,720 lines of code in 83 > > files to build and document the COBOL front end. [...] > > I tested these patches using "git apply" to an unpublished branch > > "cobol-patched". > > I have now built the compiler from the (now published) cobol-patched > branch. I rebuilt cobol-patched today. It now includes changes to gcc/doc/, and a new gcc/Makefile.in (previously omitted because I thought it was generated). That deals with some but not all the issues below. > On x86_64-linux and noticed the following issues: > > The toplevel configure script on the branch wasn't re-generated (duh), > so it isn't > ready for use. Since we're not mailing patches, cobol-patched is now built by applying patches that include the configure scripts. > gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1, > but have 3.0.4 > %require "3.5.1" //3.8.2 also works, but not 3.8.0 > ^^^ > > this requirement isn't documented, neither is a version requirement > for flex. The > place for this is in gcc/doc/install.texi I think you will find the updated .texi files comprehesively (!) reflect the presence of gcobol and its requirements. There is a hitch. The patches are against master branch as of 5 February. Many changes occurred in the time since. Now that I know that, I *expect* a conflict extravaganza as we pull that together. Best IMO is for us to update our repository, i.e. to merge master into our master+cobol, then to cobol-stage, then to cobol-patched. Questions: 1. How often should we do that, would you say? 2. Is cobol-patched more useful to you right now as-is, or should I regenerate it tomorrow? > During the build we write to > > gcc/cobol/charmaps-dupe.cc > gcc/cobol/valconv-dupe.cc Not yet fixed. Top of the list, now. > Installing the result via make install DESTDIR=/foo I see both a > 'gcobol' and a 'gcobc' program > being installed - is that intentional? Yes, as explained prior, gcobc is an emulation script. > I also see the gcobol.3 > manpage reside directly in /foo/gcobol.3 rather > than in the expected /foo/usr/local/man/.. gcc/Makefile.in didn't appear in the patches. I now have installed: $ find /usr/local/cobol-patched/ -name '*cobol*' /usr/local/cobol-patched/ /usr/local/cobol-patched/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/cobol1 /usr/local/cobol-patched/bin/gcobol /usr/local/cobol-patched/share/man/man1/gcobol.1 /usr/local/cobol-patched/share/man/man3/gcobol.3 /usr/local/cobol-patched/share/gcobol > Installing of libgcobol fails for me: Hmm, a little worse for me. As you see above, the installed cobol files don't include the library. This was from "make install". Checking. > so the suppression seems incomplete? Or too complete. :-/ > Doing > > > ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32 > > only fails during linking as we do build a 32bit binary but do not > find (the not built) 32bit runtime. I suspect when writing a program > that would need 128bit integer or float we'd eventually ICE. I think the cobol-patched you were using was missing our patch of Friday. Our intention is to prevent any 32-bit binary from being generated, in addition to not attempting to build the library. But we're not there yet. On my system, configured as $ ../configure --prefix=/usr/local/gcc-cobol --disable-bootstrap --disable-generated-files-in-srcdir --disable-multilib --enable-checking --enable-languages=c,cobol I see failure to link [eliding the "skipping" messages]: $ gcobol -oo -ffixed-form -m32 gcc/cobol/nist/NC/NC101A.cbl /usr/bin/ld: cannot find -lgcobol: No such file or directory /usr/bin/ld: cannot find -lstdc++: No such file or directory /usr/bin/ld: cannot find -lgcc: No such file or directory /usr/bin/ld: cannot find -lgcc_s: No such file or directory If compiled with -c, $ gcobol -oo -c -m32 -ffixed-form ../parser/gcc/cobol/nist/NC/NC101A.cbl $ file o o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped Not our intended result. :-( But, fixable. > The cobol frontend should attempt to intercept the selection of a > not supported multilib (but don't ask me how) and gracefully exit, > possibly calling > > sorry ("selected multilib not available for Cobol"); > > or so. There's the LANG_HOOKS_POST_OPTIONS langhook which > might be a place to do this. The way to check is likely whether > a mode for __int128 exists and is supported. The C fronted does > > if (targetm.scalar_mode_supported_p (TImode)) > > it also checks float128_type_node, but I didn't find who initializes > that. Thank you for the pointer. It's late here today. I will try that in the morning if Bob hasn't already solved it by then. Regards, --jkl
Re: [PATCH] RISC-V: Fix .cfi_offset directive when push pop in zc
Thanks, will push after verified on my hand :) On Mon, Feb 24, 2025 at 3:20 PM Hsing Yu Peng wrote: > > Address Fei's comment. > > Thanks > Lino > > Fei Gao 於 2025年2月24日 週一 下午2:20寫道: > > > > The case you hit is s2 set in frame mask but s1 not. So you're trying to > > set s1 bit manually. > > If so, riscv_adjust_multi_push_cfi_prologue seems the better place for your > > change. > > > > static rtx > > riscv_adjust_multi_push_cfi_prologue (int saved_size) > > { > > rtx dwarf = NULL_RTX; > > rtx adjust_sp_rtx, reg, mem, insn; > > unsigned int mask = cfun->machine->frame.mask; > > int offset; > > int saved_cnt = 0; > > > > - if (mask & S10_MASK) > > -mask |= S11_MASK; /* this corner case is covered in your for loop > > below. */ > > > > + unsigned num_multi_push = riscv_multi_push_regs_count > > (frame->mask); > > + for (int i = 0; i < num_multi_push; i++) > > + { > > + gcc_assert (zcmp_save_reg_order[i] != INVALID_REGNUM); > > + mask |= 1 << (zcmp_save_reg_order[i] - GP_REG_FIRST); /* here > > mask is used > > > > instead of frame->mask*/ > > + } > > ... > > > > BR > > Fei
Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
> This patch would like to perform the overflow to smode check before IOR > the base2 series, and perform the clean highest bit if the const_vector > overflow to smode occurs. If no overflow, there will do nothing here. I agree with the general idea but I'm a bit wary fiddling with the coefficients directly. I think for a fixed-size, non VLA vector it should be sufficient to check whether the last step overflows. For VLA we actually know the largest possible runtime vector length and could use it the check? But maybe it's probably easier to bail with VLA anyway. In addition: if we're already checking the step anyway we could get rid of the negative step restriction as well. -- Regards Robin
Re: [PATCH] analyzer: Handle nonnull_if_nonzero attribute [PR117023]
On Fri, Feb 14, 2025 at 09:14:48AM -0500, David Malcolm wrote: > I haven't used ranger yet within the analyzer, and I wonder if there is > a philosophical divide here between the goals of optimization versus > bug finding: the optimizer makes use of undefined behavior in order to > add assumptions about the state of the code ("X can't happen"), whereas > the analyzer ought to report on places where undefined behavior could > happen ("if X happened here, it would be bad, let's warn the user"). You're right, that is why also using the ranger in the __builtin_*object_size handling and sanitizers is problematic. Maybe we'll need a ranger mode which only looks at clearly guaranteed things rather than exploiting undefined behavior (including whatever is saved in SSA_NAME_RANGE_INFO). > One way of expressing the conditional nature of this property in the > analyzer would be to "bifurcate" the analysis so that there are two > out-edges in the exploded graph, covering the ptr == NULL vs ptr != > NULL conditions explicitly and separately. But that's fiddly, alas, > and not something to be doing for GCC 15 at this stage (sorry). Indeed. > I see that the code from >state_t state = sm_ctxt.get_state (stmt, arg); > onwards seems to be a copy-and-paste of the existing code from within > the >/* Handle "__attribute__((nonnull))". */ > suite. malloc_state_machine::on_stmt is already longer than I'm > comfortable with, so please can you introduce a subroutine to avoid the > copy-and-paste (probably a private member function of > malloc_state_machine). > > OK for trunk with that change (assuming usual testing, of course) Here is what I've committed after another bootstrap/regtest on x86_64-linux and i686-linux. 2025-02-24 Jakub Jelinek PR c/117023 gcc/analyzer/ * sm-malloc.cc (malloc_state_machine::handle_nonnull): New private method. (malloc_state_machine::on_stmt): Use it for nonnull attribute arguments. Handle also nonnull_if_nonzero attributes. gcc/testsuite/ * c-c++-common/analyzer/call-summaries-malloc.c (test_use_without_check): Pass 4 rather than sz to memset. * c-c++-common/analyzer/strncpy-1.c (test_null_dst, test_null_src): Pass 42 rather than count to strncpy. --- gcc/analyzer/sm-malloc.cc.jj2024-10-24 22:56:14.58433 +0200 +++ gcc/analyzer/sm-malloc.cc 2025-02-24 00:28:51.55033 +0100 @@ -501,6 +501,12 @@ private: void on_zero_assignment (sm_context &sm_ctxt, const gimple *stmt, tree lhs) const; + void handle_nonnull (sm_context &sm_ctx, + const supernode *node, + const gimple *stmt, + tree fndecl, + tree arg, + unsigned i) const; /* A map for consolidating deallocators so that they are unique per deallocator FUNCTION_DECL. */ @@ -2004,6 +2010,43 @@ malloc_state_machine::maybe_assume_non_n } } +/* Helper method for malloc_state_machine::on_stmt. Handle a single + argument (Ith argument ARG) if it is nonnull or nonnull_if_nonzero + and size is nonzero. */ + +void +malloc_state_machine::handle_nonnull (sm_context &sm_ctxt, + const supernode *node, + const gimple *stmt, + tree fndecl, + tree arg, + unsigned i) const +{ + state_t state = sm_ctxt.get_state (stmt, arg); + /* Can't use a switch as the states are non-const. */ + /* Do use the fndecl that caused the warning so that the + misused attributes are printed and the user not confused. */ + if (unchecked_p (state)) +{ + tree diag_arg = sm_ctxt.get_diagnostic_tree (arg); + sm_ctxt.warn (node, stmt, arg, + make_unique (*this, diag_arg, fndecl, + i)); + const allocation_state *astate + = as_a_allocation_state (state); + sm_ctxt.set_next_state (stmt, arg, astate->get_nonnull ()); +} + else if (state == m_null) +{ + tree diag_arg = sm_ctxt.get_diagnostic_tree (arg); + sm_ctxt.warn (node, stmt, arg, + make_unique (*this, diag_arg, fndecl, i)); + sm_ctxt.set_next_state (stmt, arg, m_stop); +} + else if (state == m_start) +maybe_assume_non_null (sm_ctxt, arg, stmt); +} + /* Implementation of state_machine::on_stmt vfunc for malloc_state_machine. */ bool @@ -2118,37 +2161,37 @@ malloc_state_machine::on_stmt (sm_contex just the specified pointers. */ if (bitmap_empty_p (nonnull_args) || bitmap_bit_p (nonnull_args, i)) - { - state_t state = sm_ctxt.get_state (stmt, arg); - /* Can't use a switch as the states a
[committed] openmp: Fix diagnostics typo [PR118993]
Hi! There is a typo in one of the OpenMP gimplification diagnostics messages. The following patch fixes that and adjusts tests which just copied that message including typo to dg-warning regexps in 2 tests. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2025-02-24 Jakub Jelinek PR middle-end/118993 * gimplify.cc (gimplify_scan_omp_clauses): Fix diagnostics typo, undfined -> undefined. * c-c++-common/gomp/allocate-18.c: Adjust dg-warning regex for diagnostics typo fix. * gfortran.dg/gomp/allocate-clause.f90: Likewise. --- gcc/gimplify.cc.jj 2025-01-30 22:09:17.523382838 +0100 +++ gcc/gimplify.cc 2025-02-23 23:59:56.021122542 +0100 @@ -14007,7 +14007,7 @@ gimplify_scan_omp_clauses (tree *list_p, && (code == OMP_TARGET || code == OMP_TASK || code == OMP_TASKLOOP)) warning_at (OMP_CLAUSE_LOCATION (c), OPT_Wopenmp, "allocator with access trait set to % " - "results in undfined behavior for %qs directive", + "results in undefined behavior for %qs directive", code == OMP_TARGET ? "target" : (code == OMP_TASK ? "task" : "taskloop")); --- gcc/testsuite/c-c++-common/gomp/allocate-18.c.jj2024-12-04 00:45:08.606203630 +0100 +++ gcc/testsuite/c-c++-common/gomp/allocate-18.c 2025-02-24 00:00:08.043956210 +0100 @@ -36,16 +36,16 @@ test1 () x[0] = 1; #pragma omp target allocate(omp_thread_mem_alloc: x) firstprivate(x) /* uses_allocators(omp_thread_mem_alloc) */ - /* { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ + /* { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ x[0] = 1; #pragma omp taskloop allocate(omp_thread_mem_alloc: x) firstprivate(x) - /* { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ + /* { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ for (int i = 0; i < 5; i++) x[i] = i; #pragma omp parallel master taskloop simd allocate(omp_thread_mem_alloc: x) firstprivate(x) - /* { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ + /* { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ for (int i = 0; i < 5; i++) x[i] = i; @@ -53,7 +53,7 @@ test1 () #pragma omp masked { #pragma omp task allocate(omp_thread_mem_alloc: x) firstprivate(x) - /* { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'task' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ + /* { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'task' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } */ x[0] = 1; } } --- gcc/testsuite/gfortran.dg/gomp/allocate-clause.f90.jj 2024-10-07 10:49:58.098362880 +0200 +++ gcc/testsuite/gfortran.dg/gomp/allocate-clause.f90 2025-02-24 00:00:12.407895839 +0100 @@ -34,18 +34,18 @@ subroutine test1 () !$omp end parallel !$omp target allocate(omp_thread_mem_alloc: x) firstprivate(x) ! uses_allocators(omp_thread_mem_alloc) -! { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } +! { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'target' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } x(1) = 1 !$omp end target !$omp taskloop allocate(omp_thread_mem_alloc: x) firstprivate(x) - ! { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } + ! { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } do i = 1, 5 x(i) = i end do !$omp parallel master taskloop simd allocate(omp_thread_mem_alloc: x) firstprivate(x) - ! { dg-warning "allocator with access trait set to 'thread' results in undfined behavior for 'taskloop' directive \\\[-Wopenmp\\\]" "" { target *-*-* } .-1 } + ! { dg-warning "allocator with access trait set to 'thread' results in undefined behavior for 'taskloop'
[PATCH] reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]
Hi! The following testcase is miscompiled due to a bug in optimize_range_tests_to_bit_test. It is trying to optimize check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges. Another reassoc optimization folds the the test for the first two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask actually has code to virtually undo it and treat that again as test for a being -34 or -26. The problem is that optimize_range_tests_to_bit_test remembers in the type variable TREE_TYPE (ranges[i].exp); from the first range. If extract_bit_test_mask doesn't do that virtual undoing of the BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp. But if the first range is BIT_AND_EXPR, the type could be different, the BIT_AND_EXPR form has the optional cast to corresponding unsigned type in order to avoid introducing UB. Now, type was used to fill in the max value if ranges[j].high was missing in subsequently tested range, and so in this particular testcase the [-4,inf] range which was signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead. And we were subtracting values of 2 different types and trying to make sense out of that. The following patch fixes this by using the type of the low bound (which is always non-NULL) for the max value of the high bound instead. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2025-02-24 Jakub Jelinek PR tree-optimization/118915 * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather than TYPE_MAX_VALUE (type). * gcc.c-torture/execute/pr118915.c: New test. --- gcc/tree-ssa-reassoc.cc.jj 2025-02-13 19:59:56.157572421 +0100 +++ gcc/tree-ssa-reassoc.cc 2025-02-20 22:47:42.646500373 +0100 @@ -3362,7 +3362,7 @@ optimize_range_tests_to_bit_test (enum t continue; highj = ranges[j].high; if (highj == NULL_TREE) - highj = TYPE_MAX_VALUE (type); + highj = TYPE_MAX_VALUE (TREE_TYPE (lowj)); wide_int mask2; exp2 = extract_bit_test_mask (ranges[j].exp, prec, lowi, lowj, highj, &mask2, NULL); --- gcc/testsuite/gcc.c-torture/execute/pr118915.c.jj 2025-02-20 22:46:21.960212578 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr118915.c 2025-02-20 22:46:01.019137888 +0100 @@ -0,0 +1,22 @@ +/* PR tree-optimization/118915 */ + +int a; + +int +foo (int c, int d, int e, int f) +{ + if (!d || !e) +return -22; + if (c > 16) +return -22; + if (!f) +return -22; + return 2; +} + +int +main () +{ + if (foo (a + 21, a + 6, a + 34, a + 26) != -22) +__builtin_abort (); +} Jakub
[PATCH PING] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]
I would like to ping for the following patch that fixes P1 regression: gcc/ChangeLog: * combine.cc (distribute_notes) : Remove REG_UNUSED note from i2 when the register is also mentioned in i3. gcc/testsuite/ChangeLog: * gcc.target/i386/pr118739.c: New test. https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675585.html Thanks, Uros.
Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes
On Fri, Feb 21, 2025 at 4:55 PM Konstantinos Eleftheriou wrote: > > Hi Richard, thanks for the feedback. > > On Tue, Feb 18, 2025 at 9:17 PM Richard Biener > wrote: > > > > > > > > > Am 18.02.2025 um 17:04 schrieb Konstantinos Eleftheriou > > > : > > > > > > From: kelefth > > > > > > The pass rejects the transformation when there are instructions in the > > > sequence that might throw an exception. This was added due to having > > > cases that the load instruction contains a REG_EH_REGION note and > > > moving it before the store instructions caused an error, as it was > > > no longer the last instruction in the basic block. > > > > > > This patch handles those cases by moving a possible REG_EH_REGION > > > note from the load instruction of the store-load sequence to the > > > last instruction of the basic block. > > > > But that’s not a correct transform and will lead to bogus exception > > handling? You’d need to move the note and split the block, possibly > > updating the EH info on the side. > > I had originally thought about splitting the block, but tried to get > away with this solution. In case that we are splitting the block after > the load, we wouldn't need to also move the note, right? I don't think it makes much sense in trying to handle loads/stores with EH, you generally can't re-order stuff around EH since that changes what is observable from the EH handler and the EH might also guard exceptions on later stmts. Richard. > > Thanks, > Konstantinos > > > Richard > > > > > gcc/ChangeLog: > > > > > >* avoid-store-forwarding.cc (process_store_forwarding): > > >(store_forwarding_analyzer::avoid_store_forwarding): > > >Move a possible REG_EH_REGION note from the load instruction > > >to the last instruction of the basic block. > > > --- > > > gcc/avoid-store-forwarding.cc | 13 - > > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > > > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc > > > index 34a7bba4043..05c91bb1a82 100644 > > > --- a/gcc/avoid-store-forwarding.cc > > > +++ b/gcc/avoid-store-forwarding.cc > > > @@ -400,6 +400,17 @@ process_store_forwarding (vec > > > &stores, rtx_insn *load_insn, > > > if (load_elim) > > > delete_insn (load_insn); > > > > > > + /* Find possible REG_EH_REGION note in the load instruction and move it > > > + into the last instruction of the basic block. */ > > > + rtx reg_eh_region_note = find_reg_note (load_insn, REG_EH_REGION, > > > NULL_RTX); > > > + if (reg_eh_region_note != NULL_RTX) > > > +{ > > > + remove_note (load_insn, reg_eh_region_note); > > > + basic_block load_bb = BLOCK_FOR_INSN (load_insn); > > > + add_reg_note (BB_END (load_bb), REG_EH_REGION, > > > +XEXP (reg_eh_region_note, 0)); > > > +} > > > + > > > return true; > > > } > > > > > > @@ -425,7 +436,7 @@ store_forwarding_analyzer::avoid_store_forwarding > > > (basic_block bb) > > > > > > rtx set = single_set (insn); > > > > > > - if (!set || insn_could_throw_p (insn)) > > > + if (!set) > > >{ > > > store_exprs.truncate (0); > > > continue; > > > -- > > > 2.47.0 > > >
Re: [PATCH] Further use of mod_scope in modified_type_die
On Thu, Oct 3, 2024 at 6:39 PM Tom Tromey wrote: > > I am working on some changes to GNAT to emit hierarchical DWARF -- > i.e., where entities will have simple names nested in a DW_TAG_module. > > While working on this I found a couple of paths in modified_type_die > where "mod_scope" should be used, but is not. I suspect these cases > are only reachable by Ada code, as in both spots (subrange types and > base types), I believe that other languages don't generally have named > types in a non-top-level scope, and in these other situations, > mod_scope will still be correct. > > gcc > > * dwarf2out.cc (modified_type_die): Use mod_scope for > ranged types and base types. > > Issue: eng/toolchain/gcc#241 > --- > gcc/dwarf2out.cc | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc > index 38aedb64470..67d2827c279 100644 > --- a/gcc/dwarf2out.cc > +++ b/gcc/dwarf2out.cc > @@ -13927,7 +13927,7 @@ modified_type_die (tree type, int cv_quals, bool > reverse, >tree bias = NULL_TREE; >if (lang_hooks.types.get_type_bias) > bias = lang_hooks.types.get_type_bias (type); > - mod_type_die = subrange_type_die (type, low, high, bias, context_die); > + mod_type_die = subrange_type_die (type, low, high, bias, mod_scope); that looks good. But why not for the ARRAY_TYPE case dircetly above? >item_type = TREE_TYPE (type); > } >else if (is_base_type (type)) > @@ -13964,10 +13964,10 @@ modified_type_die (tree type, int cv_quals, bool > reverse, > { > dw_die_ref after_die > = modified_type_die (type, cv_quals, false, context_die); > - add_child_die_after (comp_unit_die (), mod_type_die, after_die); > + add_child_die_after (mod_scope, mod_type_die, after_die); For the next DW_AT_endianity there's an assert for the correct placement but not here So I'm not positive this change is according to the comment. In fact we're realing with base-type DIEs here, and those are usually directly at comp_unit_die (), no? > } >else > - add_child_die (comp_unit_die (), mod_type_die); > + add_child_die (mod_scope, mod_type_die); > >add_pubtype (type, mod_type_die); > } > -- > 2.46.2 >
[PATCH] c++: Adjust #embed support for P1967R14
Hi! Now that the #embed paper has been voted in, the following patch removes the pedwarn for C++26 on it (and adjusts pedwarn warning for older C++ versions) and predefines __cpp_pp_embed FTM. I believe we otherwise implement everything in the paper already, except I'm really confused by the [Example: #embed limit(__has_include("a.h")) #if __has_embed( limit(__has_include("a.h"))) // ill-formed: __has_include [cpp.cond] cannot appear here #endif — end example] part. My reading of both C23 and C++ with the P1967R14 paper in is that the first case (#embed with __has_include or __has_embed in its clauses) is what is clearly invalid and so the ill-formed note should be for #embed. And the __has_include/__has_embed in __has_embed is actually questionable. Both C and C++ have something like "The identifiers __has_include, __has_embed, and __has_c_attribute shall not appear in any context not mentioned in this subclause." or "The identifiers __has_include and __has_cpp_attribute shall not appear in any context not mentioned in this subclause." (into which P1967R14 adds __has_embed) in the conditional inclusion subclause. #embed is defined in a different one, so using those in there is invalid (unless "using the rules specified for conditional inclusion" wording e.g. in limit clause overrides that). The reason why I think it is fuzzy for __has_embed is that __has_embed is actually defined in the Conditional inclusion subclause (so that would mean one can use __has_include, __has_embed and __has_*attribute in there) but its clauses are described in a different one. GCC currently accepts #embed __FILE__ limit (__has_include ()) #if __has_embed (__FILE__ limit (__has_include ())) #endif #embed __FILE__ limit (__has_embed ("a.c")) #if __has_embed (__FILE__ limit (__has_embed ("a.c"))) #endif with the exception of __has_embed in #embed which results in a strange message. Note, it isn't just about limit clause, but also about prefix/suffix/if_empty, except that in those cases the "using the rules specified for conditional inclusion" doesn't apply. In any case, I'd hope that can be dealt with incrementally (and should be handled the same for both C and C++). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2025-02-24 Jakub Jelinek libcpp/ * init.cc (lang_defaults): Set embed for GNUCXX26 and CXX26. * directives.cc (do_embed): Adjust pedwarn wording for embed in C++. gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_pp_embed=202502 for C++26. gcc/testsuite/ * g++.dg/cpp/embed-1.C: Adjust for pedwarn wording change and don't expect any error for C++26. * g++.dg/cpp/embed-2.C: Adjust for pedwarn wording change and don't expect any warning for C++26. * g++.dg/cpp26/feat-cxx26.C: Test __cpp_pp_embed value. --- libcpp/init.cc.jj 2025-02-13 19:59:56.204572159 +0100 +++ libcpp/init.cc 2025-02-20 21:28:07.340156536 +0100 @@ -149,8 +149,8 @@ static const struct lang_flags lang_defa /* CXX20*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,0,0,0,0,0,0,1,0,0,1 }, /* GNUCXX23 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 }, /* CXX23*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 }, - /* GNUCXX26 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 }, - /* CXX26*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,0,0,1 }, + /* GNUCXX26 */ { 1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,1,0,1 }, + /* CXX26*/ { 1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,1,0,1 }, /* ASM */ { 0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 } }; --- libcpp/directives.cc.jj 2025-02-13 19:59:56.202572170 +0100 +++ libcpp/directives.cc2025-02-20 21:40:56.379899457 +0100 @@ -1367,7 +1367,7 @@ do_embed (cpp_reader *pfile) { if (CPP_OPTION (pfile, cplusplus)) cpp_error (pfile, CPP_DL_PEDWARN, - "%<#%s%> is a GCC extension", "embed"); + "%<#%s%> before C++26 is a GCC extension", "embed"); else cpp_error (pfile, CPP_DL_PEDWARN, "%<#%s%> before C23 is a GCC extension", "embed"); --- gcc/c-family/c-cppbuiltin.cc.jj 2025-02-13 19:59:55.144578075 +0100 +++ gcc/c-family/c-cppbuiltin.cc2025-02-20 21:34:18.626480792 +0100 @@ -1093,6 +1093,7 @@ c_cpp_builtins (cpp_reader *pfile) cpp_define (pfile, "__cpp_deleted_function=202403L"); cpp_define (pfile, "__cpp_variadic_friend=202403L"); cpp_define (pfile, "__cpp_pack_indexing=202311L"); + cpp_define (pfile, "__cpp_pp_embed=202502L"); } if (flag_concepts && cxx_dialect > cxx14) cpp_define (pfile, "__cpp_concepts=202002L"); --- gcc/testsuite/g++.dg/cpp/embed-1.C.jj 2024-10-13 18:47:45.508432900 +0200 +++ gcc/testsuite/g++.dg/cpp/embed-1.C 2025-02-20 21:38:23.306353505 +0100 @@ -6,9 +6,9 @@ #endif int a = -#embed __FILE__ limit (
Re: [PATCH] RISC-V: Fix .cfi_offset directive when push pop in zc
Pushed with minor coding style fix: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4dcd3c7749734133f7f59509b1a118f3a13de4ee On Mon, Feb 24, 2025 at 4:00 PM Kito Cheng wrote: > > Thanks, will push after verified on my hand :) > > On Mon, Feb 24, 2025 at 3:20 PM Hsing Yu Peng wrote: > > > > Address Fei's comment. > > > > Thanks > > Lino > > > > Fei Gao 於 2025年2月24日 週一 下午2:20寫道: > > > > > > The case you hit is s2 set in frame mask but s1 not. So you're trying to > > > set s1 bit manually. > > > If so, riscv_adjust_multi_push_cfi_prologue seems the better place for > > > your change. > > > > > > static rtx > > > riscv_adjust_multi_push_cfi_prologue (int saved_size) > > > { > > > rtx dwarf = NULL_RTX; > > > rtx adjust_sp_rtx, reg, mem, insn; > > > unsigned int mask = cfun->machine->frame.mask; > > > int offset; > > > int saved_cnt = 0; > > > > > > - if (mask & S10_MASK) > > > -mask |= S11_MASK; /* this corner case is covered in your for loop > > > below. */ > > > > > > + unsigned num_multi_push = riscv_multi_push_regs_count > > > (frame->mask); > > > + for (int i = 0; i < num_multi_push; i++) > > > + { > > > + gcc_assert (zcmp_save_reg_order[i] != INVALID_REGNUM); > > > + mask |= 1 << (zcmp_save_reg_order[i] - GP_REG_FIRST); /* > > > here mask is used > > > > > >instead of frame->mask*/ > > > + } > > > ... > > > > > > BR > > > Fei
Re: [PATCH] reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]
On Mon, 24 Feb 2025, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled due to a bug in > optimize_range_tests_to_bit_test. It is trying to optimize > check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges. > Another reassoc optimization folds the the test for the first > two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask > actually has code to virtually undo it and treat that again as test > for a being -34 or -26. The problem is that optimize_range_tests_to_bit_test > remembers in the type variable TREE_TYPE (ranges[i].exp); from the first > range. If extract_bit_test_mask doesn't do that virtual undoing of the > BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp. > But if the first range is BIT_AND_EXPR, the type could be different, the > BIT_AND_EXPR form has the optional cast to corresponding unsigned type > in order to avoid introducing UB. Now, type was used to fill in the > max value if ranges[j].high was missing in subsequently tested range, > and so in this particular testcase the [-4,inf] range which was > signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead. > And we were subtracting values of 2 different types and trying to make > sense out of that. > > The following patch fixes this by using the type of the low bound > (which is always non-NULL) for the max value of the high bound instead. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Thanks, Richard. > 2025-02-24 Jakub Jelinek > > PR tree-optimization/118915 > * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For > highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather > than TYPE_MAX_VALUE (type). > > * gcc.c-torture/execute/pr118915.c: New test. > > --- gcc/tree-ssa-reassoc.cc.jj2025-02-13 19:59:56.157572421 +0100 > +++ gcc/tree-ssa-reassoc.cc 2025-02-20 22:47:42.646500373 +0100 > @@ -3362,7 +3362,7 @@ optimize_range_tests_to_bit_test (enum t > continue; > highj = ranges[j].high; > if (highj == NULL_TREE) > - highj = TYPE_MAX_VALUE (type); > + highj = TYPE_MAX_VALUE (TREE_TYPE (lowj)); > wide_int mask2; > exp2 = extract_bit_test_mask (ranges[j].exp, prec, lowi, lowj, > highj, &mask2, NULL); > --- gcc/testsuite/gcc.c-torture/execute/pr118915.c.jj 2025-02-20 > 22:46:21.960212578 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr118915.c2025-02-20 > 22:46:01.019137888 +0100 > @@ -0,0 +1,22 @@ > +/* PR tree-optimization/118915 */ > + > +int a; > + > +int > +foo (int c, int d, int e, int f) > +{ > + if (!d || !e) > +return -22; > + if (c > 16) > +return -22; > + if (!f) > +return -22; > + return 2; > +} > + > +int > +main () > +{ > + if (foo (a + 21, a + 6, a + 34, a + 26) != -22) > +__builtin_abort (); > +} > Jakub > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
Re: [PATCH] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]
On Wed, Feb 12, 2025 at 1:16 PM Uros Bizjak wrote: > > The combine pass is trying to combine: > > Trying 16, 22, 21 -> 23: >16: r104:QI=flags:CCNO>0 >22: {r120:QI=r104:QI^0x1;clobber flags:CC;} > REG_UNUSED flags:CC >21: r119:QI=flags:CCNO<=0 > REG_DEAD flags:CCNO >23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;} > REG_DEAD r120:QI > REG_DEAD r119:QI > REG_UNUSED flags:CC > > and creates the following two insn sequence: > > modifying insn i222: r104:QI=flags:CCNO>0 > REG_DEAD flags:CC > deferring rescan insn with uid = 22. > modifying insn i323: r110:QI=flags:CCNO<=0 > REG_DEAD flags:CC > deferring rescan insn with uid = 23. > > where the REG_DEAD note in i2 is not correct, because the flags > register is still referenced in i3. In try_combine() megafunction, we > have this part: > > --cut here-- > /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3. */ > if (i3notes) > distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL, > elim_i2, elim_i1, elim_i0); > if (i2notes) > distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL, > elim_i2, elim_i1, elim_i0); > if (i1notes) > distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL, > elim_i2, local_elim_i1, local_elim_i0); > if (i0notes) > distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL, > elim_i2, elim_i1, local_elim_i0); > if (midnotes) > distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL, > elim_i2, elim_i1, elim_i0); > --cut here-- > > where the compiler distributes REG_UNUSED note from i2: > >22: {r120:QI=r104:QI^0x1;clobber flags:CC;} > REG_UNUSED flags:CC > > via distribute_notes() using the following: > > --cut here-- > /* Otherwise, if this register is now referenced in i2 > then the register used to be modified in one of the > original insns. If it was i3 (say, in an unused > parallel), it's now completely gone, so the note can > be discarded. But if it was modified in i2, i1 or i0 > and we still reference it in i2, then we're > referencing the previous value, and since the > register was modified and REG_UNUSED, we know that > the previous value is now dead. So, if we only > reference the register in i2, we change the note to > REG_DEAD, to reflect the previous value. However, if > we're also setting or clobbering the register as > scratch, we know (because the register was not > referenced in i3) that it's unused, just as it was > unused before, and we place the note in i2. */ > if (from_insn != i3 && i2 && INSN_P (i2) > && reg_referenced_p (XEXP (note, 0), PATTERN (i2))) > { > if (!reg_set_p (XEXP (note, 0), PATTERN (i2))) > PUT_REG_NOTE_KIND (note, REG_DEAD); > if (! (REG_P (XEXP (note, 0)) > ? find_regno_note (i2, REG_NOTE_KIND (note), > REGNO (XEXP (note, 0))) > : find_reg_note (i2, REG_NOTE_KIND (note), > XEXP (note, 0 > place = i2; > } > --cut here-- > > However, the flags register is not UNUSED (or DEAD), because it is > used in i3. The proposed solution is to remove the REG_UNUSED note > from i2 when the register is also mentioned in i3. But there is /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) && ! (REG_P (XEXP (note, 0)) ? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) : find_reg_note (i3, REG_DEAD, XEXP (note, 0 { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; } which of course isn't taken as there is a NOTE in i3 already. Still the note is supposed to be moved there (just not emitted, it's already there. So a better fix is to refactor the above to /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) { if (! (REG_P (XEXP (note, 0)) ? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) : find_reg_note (i3, REG_DEAD, XEXP (note, 0 { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; } } ? At least the else { case seems to assume the reg isn't refe
Re: Fix 'libstdc++-v3/src/c++20/tzdb.cc' build for '__GTHREADS && !__GTHREADS_CXX0X' configurations
On Mon, 24 Feb 2025 at 06:23, François Dumont wrote: > > > On 20/02/2025 18:28, Thomas Schwinge wrote: > > Hi! > > > > On 2025-02-20T16:36:56+, Jonathan Wakely wrote: > >> On 20/02/25 15:50 +0100, Thomas Schwinge wrote: > >> >From 820e015494e25187c9a5ffbd69911ec6ce612789 Mon Sep 17 00:00:00 2001 > >>> From: Jonathan Wakely > >>> Date: Thu, 20 Feb 2025 14:08:11 + > >>> Subject: [PATCH] Fix 'libstdc++-v3/src/c++20/tzdb.cc' build for > >>> '__GTHREADS && > >>> !__GTHREADS_CXX0X' configurations > >>> > >>> libstdc++-v3/ > >>> * src/c++20/tzdb.cc [__GTHREADS && !__GTHREADS_CXX0X]: Use > >>> '__gnu_cxx::__mutex'. > >>> > >>> Co-authored-by: Thomas Schwinge > >>> --- > >>> libstdc++-v3/src/c++20/tzdb.cc | 8 +++- > >>> 1 file changed, 7 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/libstdc++-v3/src/c++20/tzdb.cc > >>> b/libstdc++-v3/src/c++20/tzdb.cc > >>> index 7e8cce7ce8c..70ba7b0ef53 100644 > >>> --- a/libstdc++-v3/src/c++20/tzdb.cc > >>> +++ b/libstdc++-v3/src/c++20/tzdb.cc > >>> @@ -35,6 +35,9 @@ > >>> #include // atomic, atomic > >>> #include // atomic> > >>> #include // mutex > >>> +#if defined __GTHREADS && ! defined _GLIBCXX_HAS_GHTREADS > >>> +# include // __gnu_cxx::__mutex > >>> +#endif > >>> #include // filesystem::read_symlink > >>> > >>> #ifndef _AIX > >>> @@ -97,11 +100,14 @@ namespace std::chrono > >>> { > >>>namespace > >>>{ > >>> -#if ! USE_ATOMIC_SHARED_PTR > >>> #ifndef __GTHREADS > >>> // Dummy no-op mutex type for single-threaded targets. > >>> struct mutex { void lock() { } void unlock() { } }; > >>> +#elif ! defined _GLIBCXX_HAS_GHTREADS > >> This still has my GHTREADS typo. > > Eh. I had fixed that, but apparently lost it. Re-fixed. > > > >> A comment here would be good too: > >> > >> // Use __gnu_cxx::__mutex is std::mutex isn't available. > > Added. > > Minor but comment was added with alas the small typo to say 'if' rather > than 'is': > > // Use __gnu_cxx::__mutex if std::mutex isn't available. Ah yes, thanks. It looks like Thomas hasn't pushed it yet, so I hope he can fix that before pushing.
[PATCH] tree-optimization/118973 - stray abnormal edge after DCE
DCE preserves stmts performing abnormal control flow transfer but currently has an exception for replaceable allocations and cxa_atexit calls. That results in a broken CFG since DCE isn't set up to prune abnormal edges possibly hanging off those. While we could try to add this handling, the following is the safe fix at this point and more suitable for backporting. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/118973 * tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Calls that alter control flow in unpredictable ways need to be preserved. * g++.dg/torture/pr118973.C: New testcase. --- gcc/testsuite/g++.dg/torture/pr118973.C | 10 ++ gcc/tree-ssa-dce.cc | 8 2 files changed, 14 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/pr118973.C diff --git a/gcc/testsuite/g++.dg/torture/pr118973.C b/gcc/testsuite/g++.dg/torture/pr118973.C new file mode 100644 index 000..8cb9e4b023a --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr118973.C @@ -0,0 +1,10 @@ +// { dg-do compile } + +int foo() __attribute__((returns_twice)); + +void a() +{ + int a; + if(foo()) new int; + &a; +} diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc index 461283ba858..ba9cd6536ae 100644 --- a/gcc/tree-ssa-dce.cc +++ b/gcc/tree-ssa-dce.cc @@ -391,15 +391,15 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive) { gcall *call = as_a (stmt); - /* Never elide a noreturn call we pruned control-flow for. */ - if ((gimple_call_flags (call) & ECF_NORETURN) - && gimple_call_ctrl_altering_p (call)) + /* Never elide a noreturn call we pruned control-flow for. + Same for statements that can alter control flow in unpredictable + ways. */ + if (gimple_call_ctrl_altering_p (call)) { mark_stmt_necessary (call, true); return; } - if (is_removable_allocation_p (call, false)) return; -- 2.43.0
[PATCH v2] RISC-V: Fix a typo in zce to zcf implication
zce must imply zcf but this rule was corrupted after refactoring in 9e12010b5e724277ea. This may be observed ater generating an .s file from any source code file with -mriscv-attribute -march=rv32if_zce -mabi=ilp32 -S options. A full march will be presented in arch attribute: rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0 As you see, zcf is not presented here though f_zce pair is passed in -march. According to The RISC-V Instruction Set Manual: Specifying Zce on RV32 with F includes Zca, Zcb, Zcmp, Zcmt and Zcf. PR target/118906 gcc/ChangeLog: * common/config/riscv/riscv-common.cc: fix zce to zcf implication. gcc/testsuite/ChangeLog: * gcc.target/riscv/attribute-zce-1.c: New test. * gcc.target/riscv/attribute-zce-2.c: New test. * gcc.target/riscv/attribute-zce-3.c: New test. * gcc.target/riscv/attribute-zce-4.c: New test. Signed-off-by: Yuriy Kolerov --- gcc/common/config/riscv/riscv-common.cc | 2 +- gcc/testsuite/gcc.target/riscv/attribute-zce-1.c | 6 ++ gcc/testsuite/gcc.target/riscv/attribute-zce-2.c | 6 ++ gcc/testsuite/gcc.target/riscv/attribute-zce-3.c | 6 ++ gcc/testsuite/gcc.target/riscv/attribute-zce-4.c | 6 ++ 5 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-zce-4.c diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 5038f0eb959..b34409adf39 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -213,7 +213,7 @@ static const riscv_implied_info_t riscv_implied_info[] = {"zcmp", "zca"}, {"zcmt", "zca"}, {"zcmt", "zicsr"}, - {"zcf", "f", + {"zce", "zcf", [] (const riscv_subset_list *subset_list) -> bool { return subset_list->xlen () == 32 && subset_list->lookup ("f"); diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c b/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c new file mode 100644 index 000..e477414d4d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-1.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-mriscv-attribute -march=rv32i_zce -mabi=ilp32" } */ + +void foo(){} + +/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c b/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c new file mode 100644 index 000..7008eb5ea1f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-mriscv-attribute -march=rv32if_zce -mabi=ilp32" } */ + +void foo(){} + +/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcf1p0_zcmp1p0_zcmt1p0\"" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c b/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c new file mode 100644 index 000..89ebaaf4063 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-3.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-mriscv-attribute -march=rv64i_zce -mabi=lp64" } */ + +void foo(){} + +/* { dg-final { scan-assembler ".attribute arch, \"rv64i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c b/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c new file mode 100644 index 000..cacbcaac35f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/attribute-zce-4.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-mriscv-attribute -march=rv64if_zce -mabi=lp64" } */ + +void foo(){} + +/* { dg-final { scan-assembler ".attribute arch, \"rv64i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0\"" } } */ -- 2.34.1
[PATCH v3 0/3] [AArch64, OpenMP] Support SVE types with various OpenMP clauses and constructs
This series is based on a previous thread and review comments from RichardS and Jakub upstream: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674698.html https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674219.html https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674434.html This series removes all the tests earlier patches covered as per review comments and only includes the code changes. I'm reworking through all the tests, so are not included in this series - I will send a separate thread once I incorporate all the changes Jakub has requested for. This patch also includes Richard's patch that the rest of the series is based on https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606741.html Patches have been rebased, retested and bootstrapped on trunk. OK for trunk? Richard Sandiford (1): gomp: Various fixes for SVE types [PR101018] Tejas Belagod (2): Add function to strip pointer type and get down to the actual pointee type. AArch64: Diagnose OpenMP offloading when SVE types involved. gcc/config/aarch64/aarch64-sve-builtins.cc | 34 +++- gcc/fold-const.cc | 7 +++ gcc/gimplify.cc| 62 ++ gcc/omp-low.cc | 2 +- gcc/poly-int.h | 19 +++ gcc/target.h | 37 - gcc/tree.h | 9 7 files changed, 156 insertions(+), 14 deletions(-) -- 2.25.1
[PATCH v3 2/3] Add function to strip pointer type and get down to the actual pointee type.
Add a function to traverse down the pointer layers to the pointee type. gcc/ChangeLog: * tree.h (strip_pointer_types): New. --- gcc/tree.h | 9 + 1 file changed, 9 insertions(+) diff --git a/gcc/tree.h b/gcc/tree.h index 21f3cd5525c..580997b4e5d 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -5048,6 +5048,15 @@ strip_array_types (tree type) return type; } +inline const_tree +strip_pointer_types (const_tree type) +{ + while (POINTER_TYPE_P (type)) +type = TREE_TYPE (type); + + return type; +} + /* Desription of the reason why the argument of valid_constant_size_p is not a valid size. */ enum cst_size_error { -- 2.25.1
[PATCH v3 3/3] AArch64: Diagnose OpenMP offloading when SVE types involved.
The target clause in OpenMP is used to offload loop kernels to accelarator peripeherals. target's 'map' clause is used to move data from and to the accelarator. When the data is SVE type, it may not be suitable because of various reasons i.e. the two SVE targets may not agree on vector size or some targets don't support variable vector size. This makes SVE unsuitable for use in OMP's 'map' clause. This patch diagnoses all such cases and issues an error where SVE types are not suitable. Co-authored-by: Andrea Corallo gcc/ChangeLog: * target.h (type_context_kind): Add new context kinds for target clauses. (omp_type_context): Query if the context is of OMP kind. * config/aarch64/aarch64-sve-builtins.cc (verify_type_context): Diagnose SVE types for a given OpenMP context. (omp_type_context): New. * gimplify.cc (omp_notice_variable): Diagnose implicitly-mapped SVE objects in OpenMP regions. (gimplify_scan_omp_clauses): Diagnose SVE types for various target clauses. --- gcc/config/aarch64/aarch64-sve-builtins.cc | 34 - gcc/gimplify.cc| 43 +- gcc/target.h | 37 ++- 3 files changed, 111 insertions(+), 3 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 5d2062726d6..e145689c355 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -5182,7 +5182,11 @@ bool verify_type_context (location_t loc, type_context_kind context, const_tree type, bool silent_p) { - if (!sizeless_type_p (type)) + const_tree tmp = type; + if (omp_type_context (context) && POINTER_TYPE_P (type)) +tmp = strip_pointer_types (tmp); + + if (!sizeless_type_p (tmp)) return true; switch (context) @@ -5242,6 +5246,34 @@ verify_type_context (location_t loc, type_context_kind context, if (!silent_p) error_at (loc, "capture by copy of SVE type %qT", type); return false; + +case TCTX_OMP_MAP: + if (!silent_p) + error_at (loc, "SVE type %qT not allowed in map clause", type); + return false; + +case TCTX_OMP_MAP_IMP_REF: + if (!silent_p) + error ("cannot reference %qT object types in target region", type); + return false; + +case TCTX_OMP_PRIVATE: + if (!silent_p) + error_at (loc, "SVE type %qT not allowed in target private clause", type); + return false; + +case TCTX_OMP_FIRSTPRIVATE: + if (!silent_p) + error_at (loc, "SVE type %qT not allowed in target firstprivate clause", type); + return false; + +case TCTX_OMP_DEVICE_ADDR: + if (!silent_p) + error_at (loc, "SVE type %qT not allowed in target device clauses", type); + return false; + +default: + break; } gcc_unreachable (); } diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index a03ca8cf4ee..47635bb3119 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -9081,11 +9081,20 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) | GOVD_MAP_ALLOC_ONLY)) == flags) { tree type = TREE_TYPE (decl); + location_t loc = DECL_SOURCE_LOCATION (decl); if (gimplify_omp_ctxp->target_firstprivatize_array_bases && omp_privatize_by_reference (decl)) type = TREE_TYPE (type); - if (!omp_mappable_type (type)) + + if (!verify_type_context (loc, TCTX_OMP_MAP_IMP_REF, type)) + { + /* Check if TYPE can appear in a target region. +verify_type_context has already issued an error if it +can't. */ + nflags |= GOVD_MAP | GOVD_EXPLICIT; + } + else if (!omp_mappable_type (type)) { error ("%qD referenced in target region does not have " "a mappable type", decl); @@ -12815,6 +12824,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, unsigned int flags; tree decl; auto_vec addr_tokens; + tree op = NULL_TREE; + location_t loc = OMP_CLAUSE_LOCATION (c); if (grp_end && c == OMP_CLAUSE_CHAIN (grp_end)) { @@ -12822,6 +12833,36 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, grp_end = NULL_TREE; } + if (code == OMP_TARGET + || code == OMP_TARGET_DATA + || code == OMP_TARGET_ENTER_DATA + || code == OMP_TARGET_EXIT_DATA) + /* Do some target-specific type checks for map operands. */ + switch (OMP_CLAUSE_CODE (c)) + { + case OMP_CLAUSE_MAP: + op = OMP_CLAUSE_OPERAND (c, 0); + verify_type_context (loc, TCTX_OMP_MAP, TREE_TYPE (op)); + break; +
[PATCH] c++: Fix range for with PMFs [PR118923]
Hi! The following testcases segfault because the new range for -frange-for-ext-temps temporary extension extends even the internal TARGET_EXPRs created by get_member_function_from_ptrfunc. The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P. I'm not calling get_internal_target_expr because I really want to force the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR internal, and get_internal_target_expr doesn't take complain argument either and always uses tf_warning_or_error. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2025-02-24 Jakub Jelinek PR c++/118923 * typeck.cc (get_member_function_from_ptrfunc): Set TARGET_EXPR_INTERNAL_P flag on force_target_expr returned TARGET_EXPRs. * g++.dg/cpp0x/pr118923.C: New test. * g++.dg/cpp1y/pr118923.C: New test. --- gcc/cp/typeck.cc.jj 2025-02-13 19:59:55.463576294 +0100 +++ gcc/cp/typeck.cc2025-02-20 21:13:51.420841171 +0100 @@ -4218,17 +4218,23 @@ get_member_function_from_ptrfunc (tree * || (!nonvirtual && !DECL_P (instance_ptr) && !TREE_CONSTANT (instance_ptr))) - instance_ptr = instance_save_expr - = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr, - complain); + { + instance_ptr = instance_save_expr + = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr, +complain); + TARGET_EXPR_INTERNAL_P (instance_ptr) = true; + } /* See above comment. */ if (TREE_SIDE_EFFECTS (function) || (!nonvirtual && !DECL_P (function) && !TREE_CONSTANT (function))) - function - = force_target_expr (TREE_TYPE (function), function, complain); + { + function + = force_target_expr (TREE_TYPE (function), function, complain); + TARGET_EXPR_INTERNAL_P (function) = true; + } /* Start by extracting all the information from the PMF itself. */ e3 = pfn_from_ptrmemfunc (function); --- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 11:37:54.638601995 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr118923.C 2025-02-24 11:39:33.940212633 +0100 @@ -0,0 +1,66 @@ +// PR c++/118923 +// { dg-do run { target c++11 } } +// { dg-additional-options "-frange-for-ext-temps" { target c++23 } } +// { dg-additional-options "-fno-range-for-ext-temps" { target c++20_down } } + +int g; + +struct A { + int a[3]; + A (int x, int y, int z) : a{x, y, z} { if ((g++ & 7) != 4) __builtin_abort (); } + A (const A &x) = delete; + ~A () { if ((g++ & 7) != 7 - 2 * (__cpp_range_based_for >= 202211)) __builtin_abort (); } + int *begin () { return a; } + int *end () { return a + 3; } +}; + +struct B { + B () { if ((g++ & 7) != 3) __builtin_abort (); } + B (const B &) = delete; + ~B () { if ((g++ & 7) != 5 + (__cpp_range_based_for >= 202211)) __builtin_abort (); } +}; + +struct C { + A foo (const B &) { return { 1, 2, 3 }; } + A bar (const B &) { return { 4, 5, 6 }; } + bool baz () { return b; } + bool b = false; + static C c; +}; + +C C::c; + +struct D { + D () { if ((g++ & 5) != 0) __builtin_abort (); } + D (const D &) = delete; + ~D () { if ((g & 7) != 1 && (g & 7) != 6 + (__cpp_range_based_for >= 202211)) __builtin_abort (); g++; } +}; + +inline C * +qux (const D &) +{ + return &C::c; +} + +void +foo () +{ + int z = 1; + auto d = qux (D {})->baz () ? &C::bar : &C::foo; + for (const int &r : (qux (D {})->*d) (B {})) +if (z++ != r) + __builtin_abort (); + C::c.b = true; + d = qux (D {})->baz () ? &C::bar : &C::foo; + for (const int &r : (qux (D {})->*d) (B {})) +if (z++ != r) + __builtin_abort (); +} + +int +main () +{ + foo (); + if (g != 16) +__builtin_abort (); +} --- gcc/testsuite/g++.dg/cpp1y/pr118923.C.jj2025-02-20 21:17:37.852648770 +0100 +++ gcc/testsuite/g++.dg/cpp1y/pr118923.C 2025-02-20 21:19:46.662108221 +0100 @@ -0,0 +1,38 @@ +// PR c++/118923 +// { dg-do run { target c++14 } } + +struct A { + int a[3] = { 0, 0, 0 }; + int *begin () { return a; } + int *end () { return a + 3; } +}; + +struct B { + A foo () { return { 1, 2, 3 }; } + A bar () { return { 1, 2, 3 }; } + bool baz () { return b; } + bool b = false; + static B c; +}; + +B B::c; + +inline B * +qux () +{ + return &B::c; +} + +void +foo () +{ + auto d = qux ()->baz () ? &B::foo : &B::bar; + for (const int &r : (qux ()->*d) ()) +; +} + +int +main () +{ + foo (); +} Jakub
[PATCH v3 1/3] gomp: Various fixes for SVE types [PR101018]
From: Richard Sandiford Various parts of the omp code checked whether the size of a decl was an INTEGER_CST in order to determine whether the decl was variable-sized or not. If it was variable-sized, it was expected to have a DECL_VALUE_EXPR replacement, as for VLAs. This patch uses poly_int_tree_p instead, so that variable-length SVE vectors are treated like constant-length vectors. This means that some structures become poly_int-sized, with some fields at poly_int offsets, but we already have code to handle that. An alternative would have been to handle the data via indirection instead. However, that's likely to be more complicated, and it would contradict is_variable_sized, which already uses a check for TREE_CONSTANT rather than INTEGER_CST. gimple_add_tmp_var should probably not add a safelen of 1 for SVE vectors, but that's really a separate thing and might be hard to test. Co-authored-by: Tejas Belagod gcc/ PR middle-end/101018 * poly-int.h (can_and_p): New function. * fold-const.cc (poly_int_binop): Use it to optimize BIT_AND_EXPRs involving POLY_INT_CSTs. * gimplify.cc (gimplify_bind_expr): Use poly_int_tree_p instead of INTEGER_CST when checking for constant-sized omp data. (omp_add_variable): Likewise. (omp_notice_variable): Likewise. (gimplify_adjust_omp_clauses_1): Likewise. (gimplify_adjust_omp_clauses): Likewise. * omp-low.cc (scan_sharing_clauses): Likewise. (lower_omp_target): Likewise. --- gcc/fold-const.cc | 7 +++ gcc/gimplify.cc | 19 +-- gcc/omp-low.cc| 2 +- gcc/poly-int.h| 19 +++ 4 files changed, 36 insertions(+), 11 deletions(-) diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index f9f7f4d2f91..072a0b62e13 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -1284,6 +1284,13 @@ poly_int_binop (poly_wide_int &res, enum tree_code code, return false; break; +case BIT_AND_EXPR: + if (TREE_CODE (arg2) != INTEGER_CST + || !can_and_p (wi::to_poly_wide (arg1), wi::to_wide (arg2), +&res)) + return false; + break; + case BIT_IOR_EXPR: if (TREE_CODE (arg2) != INTEGER_CST || !can_ior_p (wi::to_poly_wide (arg1), wi::to_wide (arg2), diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index cc0172cf96e..a03ca8cf4ee 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -9160,7 +9160,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) && (flags & (GOVD_SEEN | GOVD_LOCAL)) == GOVD_SEEN && DECL_SIZE (decl)) { - if (TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST) + tree size; + if (!poly_int_tree_p (DECL_SIZE (decl))) { splay_tree_node n2; tree t = DECL_VALUE_EXPR (decl); @@ -9171,16 +9172,14 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) n2->value |= GOVD_SEEN; } else if (omp_privatize_by_reference (decl) - && TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))) - && (TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl - != INTEGER_CST)) + && (size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl + && !poly_int_tree_p (size)) { splay_tree_node n2; - tree t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))); - gcc_assert (DECL_P (t)); - n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) t); + gcc_assert (DECL_P (size)); + n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) size); if (n2) - omp_notice_variable (ctx, t, true); + omp_notice_variable (ctx, size, true); } } @@ -14437,7 +14436,7 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data) if ((gimplify_omp_ctxp->region_type & ORT_ACC) == 0) OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (clause) = 1; if (DECL_SIZE (decl) - && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST) + && !poly_int_tree_p (DECL_SIZE (decl))) { tree decl2 = DECL_VALUE_EXPR (decl); gcc_assert (INDIRECT_REF_P (decl2)); @@ -15178,7 +15177,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, if (!DECL_P (decl)) break; if (DECL_SIZE (decl) - && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST) + && !poly_int_tree_p (DECL_SIZE (decl))) { tree decl2 = DECL_VALUE_EXPR (decl); gcc_assert (INDIRECT_REF_P (decl2)); diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index c36ae38cf8e..00bcf27fa47 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -1461,7 +1461,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) else install_var_field (decl, false, 11, ctx); if (DECL_SIZE (decl) - && TREE_CODE (
Re: [PATCH] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)
Hello, and ping please. Thanks, Martin On Mon, Feb 10 2025, Martin Jambor wrote: > Hello, > > among other things, IPA-SRA checks whether splitting out a bit of an > aggregate or something passed by reference would lead into a clash > with an already known IPA-CP constant a way which would cause problems > later on. Unfortunately the test is done only in > adjust_parameter_descriptions and is missing when accesses are > propagated from callees to callers, which leads to miscompilation > reported as PR 118243 (where the callee is a function created by > ipa-split). > > The matter is then further complicated by the fact that we consider > complex numbers as scalars even though they can be modified piecemeal > (IPA-CP can detect and propagate the pieces separately too) which then > confuses the parameter manipulation machinery furter. > > This patch simply adds the missing check to avoid the IPA-SRA > transform in these cases too, which should be suitable for backporting > to all affected release branches. It is a bit of a shame as in the PR > testcase we do propagate both components of the complex number in > question and the transformation phase could recover. I have some > prototype patches in this direction but that is something for (a) > stage 1. > > Bootstrapped and tested on x86_64-linux. OK for master and (assuming it > applies cleanly and passes the checks there too) to all active release > branches? > > Thanks, > > Martin > > > gcc/ChangeLog: > > 2025-02-10 Martin Jambor > > PR ipa/118243 > * ipa-sra.cc (pull_accesses_from_callee): New parameters > caller_ipcp_ts and param_idx. Check that scalar pulled accesses would > not clash with a known IPA-CP aggregate constant. > (param_splitting_across_edge): Pass IPA-CP transformation summary and > caller parameter index to pull_accesses_from_callee. > > gcc/testsuite/ChangeLog: > > 2025-02-10 Martin Jambor > > PR ipa/118243 > * g++.dg/ipa/pr118243.C: New test. > --- > gcc/ipa-sra.cc | 38 +++ > gcc/testsuite/g++.dg/ipa/pr118243.C | 40 + > 2 files changed, 68 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/ipa/pr118243.C > > diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc > index ad80d22f8ce..5d1703ed394 100644 > --- a/gcc/ipa-sra.cc > +++ b/gcc/ipa-sra.cc > @@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, > ACC_PROP_CERTAIN}; > > /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, > (which belongs to CALLER) if they would not violate some constraint there. > - If successful, return NULL, otherwise return the string reason for failure > - (which can be written to the dump file). DELTA_OFFSET is the known offset > - of the actual argument withing the formal parameter (so of ARG_DESCS > within > - PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not > - known. In case of success, set *CHANGE_P to true if propagation actually > - changed anything. */ > + CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the > parameter > + described by PARAM_DESC. If successful, return NULL, otherwise return the > + string reason for failure (which can be written to the dump file). > + DELTA_OFFSET is the known offset of the actual argument withing the formal > + parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of > the > + actual argument or zero, if not known. In case of success, set *CHANGE_P > to > + true if propagation actually changed anything. */ > > static const char * > -pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, > +pull_accesses_from_callee (cgraph_node *caller, > +ipcp_transformation *caller_ipcp_ts, > +int param_idx, > +isra_param_desc *param_desc, > isra_param_desc *arg_desc, > unsigned delta_offset, unsigned arg_size, > bool *change_p) > @@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, > isra_param_desc *param_desc, > continue; > >unsigned offset = argacc->unit_offset + delta_offset; > + > + if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type)) > + { > + ipa_argagg_value_list avl (caller_ipcp_ts); > + tree value = avl.get_value (param_idx, offset); > + if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value))) > + / BITS_PER_UNIT) > + != argacc->unit_size)) > + return " propagated access would conflict with an IPA-CP constant"; > + } > + >/* Given that accesses are initially stored according to increasing >offset and decreasing size in case of equal offsets, the following >searches could be written more efficiently if we kept the ordering
[PATCH] vect: Use original LHS type for gather pattern [PR118950].
Hi, in PR118950 we do not zero masked elements in a gather load. While recognizing a gather/scatter pattern we do not use the original type of the LHS. This matters because the type can differ with bool patterns (e.g. _Bool vs unsigned char) and we don't notice the need for zeroing out the padding bytes. This patch just uses the original LHS's type. Bootstrapped and regtested on x86, aarch64, and power10. Regtested on rv64gcv_zvl512b. PR middle-end/118950 gcc/ChangeLog: * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Use original LHS's type. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr118950.c: New test. --- .../gcc.target/riscv/rvv/autovec/pr118950.c | 29 +++ gcc/tree-vect-patterns.cc | 3 +- 2 files changed, 31 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c new file mode 100644 index 000..604d4264eac --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-require-effective-target riscv_v_ok } */ +/* { dg-add-options riscv_v } */ +/* { dg-additional-options "-std=gnu99 -Wno-pedantic" } */ + +unsigned char a; +long long r; +_Bool h = 1; +short j[23]; +_Bool k[3][23]; + +void b(_Bool h, short j[], _Bool k[][23]) { + for (int m = 0; m < 23; m += 3) +for (short n = 0; n < 22; n += 4) + a = ({ +unsigned char o = a; +unsigned char p = j[n] ? h : k[m][n]; +o > p ? o : p; + }); +} + +int main() { + for (int m = 0; m < 23; ++m) +j[m] = 10; + b(h, j, k); + r = a; + if (r != 1) +__builtin_abort (); +} diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 6fc97d1b6ef..4f0a7ea162b 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -6022,7 +6022,8 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo, else pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base, offset, scale, zero); - tree load_lhs = vect_recog_temp_ssa_var (gs_info.element_type, NULL); + tree lhs = gimple_get_lhs (stmt_info->stmt); + tree load_lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); gimple_call_set_lhs (pattern_stmt, load_lhs); } else -- 2.48.1 -- Regards Robin
[PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516].
Hi, when scanning for program points, i.e. vector statements, we're missing pattern statements. In PR114516 this becomes obvious as we choose LMUL=8 assuming there are only three statements but the divmod pattern adds another three. Those push us beyond four registers so we need to switch to LMUL=4. This patch adds pattern statements to the program points which helps calculate a better register pressure estimate. Regtested on rv64gcv_zvl512b. PR target/114516 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (compute_estimated_lmul): Add pattern statements to program points. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test. --- gcc/config/riscv/riscv-vector-costs.cc| 29 +++ .../vect/costmodel/riscv/rvv/pr114516.c | 29 +++ 2 files changed, 58 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index 5149751ae76..8aad05c6180 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -215,6 +215,35 @@ compute_local_program_points ( "program point %d: %G", info.point, gsi_stmt (si)); } + + /* If the statement is part of a pattern, also add the other +pattern statements. */ + gimple_seq pattern_def_seq; + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info))) + { + gimple_stmt_iterator si2; + + for (si2 = gsi_start (pattern_def_seq); + !gsi_end_p (si2); + gsi_next (&si2)) + { + stmt_vec_info pattern_def_stmt_info + = vinfo->lookup_stmt (gsi_stmt (si2)); + if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info) + || STMT_VINFO_LIVE_P (pattern_def_stmt_info)) + { + stmt_point info = {point, gsi_stmt (si2), + pattern_def_stmt_info}; + program_points.safe_push (info); + point++; + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, +"program point %d: %G", +info.point, gsi_stmt (si2)); + } + } + } } program_points_per_bb.put (bb, program_points); } diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c new file mode 100644 index 000..5a514d1bbf6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zba_zbb -mabi=lp64d -mrvv-max-lmul=dynamic -O3 -fdump-tree-vect-details -Wno-pedantic" } */ + +typedef float real_t; +__attribute__((aligned(64))) real_t a[32000]; +real_t s315() +{ +for (int i = 0; i < 32000; i++) +a[i] = (i * 7) % 32000; +real_t x, chksum; +int index; +for (int nl = 0; nl < 256; nl++) { +x = a[0]; +index = 0; +for (int i = 0; i < 32000; ++i) { +if (a[i] > x) { +x = a[i]; +index = i; +} +} +chksum = x + (real_t) index; +} +return index + x + 1; +} + +/* { dg-final { scan-assembler {e32,m4} } } */ +/* { dg-final { scan-assembler-not {e32,m8} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Preferring smaller LMUL loop because it has unexpected spills" 1 "vect" } } */ -- 2.48.1 -- Regards Robin
RE: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
Thanks Robin for comments. > I agree with the general idea but I'm a bit wary fiddling with the > coefficients > directly. I think for a fixed-size, non VLA vector it should be sufficient to > check whether the last step overflows. Initial idea I would like to take care of VLS and VLA in the same chunk of code, given the expand_const_vector is sort of complicated currently(I don't want to make it worse by add more if VLA, else if VLS ... etc). I will try to leverage poly_int mult, shift, and compare if possible, instead of touch the coefficients directly in v2. > For VLA we actually know the largest possible runtime vector length and could > use it the check? But maybe it's probably easier to bail with VLA anyway. > In addition: if we're already checking the step anyway we could get rid of the > negative step restriction as well. I don't explore more cases here consider we are in stage 4. I think the expand_const_vector need some refactor up to a point. Pan -Original Message- From: Robin Dapp Sent: Monday, February 24, 2025 4:29 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin Dapp Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931] > This patch would like to perform the overflow to smode check before IOR > the base2 series, and perform the clean highest bit if the const_vector > overflow to smode occurs. If no overflow, there will do nothing here. I agree with the general idea but I'm a bit wary fiddling with the coefficients directly. I think for a fixed-size, non VLA vector it should be sufficient to check whether the last step overflows. For VLA we actually know the largest possible runtime vector length and could use it the check? But maybe it's probably easier to bail with VLA anyway. In addition: if we're already checking the step anyway we could get rid of the negative step restriction as well. -- Regards Robin
Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
> I don't explore more cases here consider we are in stage 4. I think the > expand_const_vector need some refactor up to a point. I added the negative step check just some weeks ago and I'd see it as simplification to remove the restriction again if you're touching the actual step anyway. So I wouldn't worry about stage 4 in that regard. -- Regards Robin
RE: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
> I added the negative step check just some weeks ago and I'd see it as > simplification to remove the restriction again if you're touching the actual > step anyway. So I wouldn't worry about stage 4 in that regard. Oh, I see. I'll have a try after this bug fix. Pan -Original Message- From: Robin Dapp Sent: Monday, February 24, 2025 7:44 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin Dapp Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931] > I don't explore more cases here consider we are in stage 4. I think the > expand_const_vector need some refactor up to a point. I added the negative step check just some weeks ago and I'd see it as simplification to remove the restriction again if you're touching the actual step anyway. So I wouldn't worry about stage 4 in that regard. -- Regards Robin
Re: [PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516].
LGTM juzhe.zh...@rivai.ai From: Robin Dapp Date: 2025-02-24 19:14 To: gcc-patches CC: pal...@dabbelt.com; kito.ch...@gmail.com; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; pan2...@intel.com; rdapp@gmail.com Subject: [PATCH] RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516]. Hi, when scanning for program points, i.e. vector statements, we're missing pattern statements. In PR114516 this becomes obvious as we choose LMUL=8 assuming there are only three statements but the divmod pattern adds another three. Those push us beyond four registers so we need to switch to LMUL=4. This patch adds pattern statements to the program points which helps calculate a better register pressure estimate. Regtested on rv64gcv_zvl512b. PR target/114516 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (compute_estimated_lmul): Add pattern statements to program points. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test. --- gcc/config/riscv/riscv-vector-costs.cc| 29 +++ .../vect/costmodel/riscv/rvv/pr114516.c | 29 +++ 2 files changed, 58 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index 5149751ae76..8aad05c6180 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -215,6 +215,35 @@ compute_local_program_points ( "program point %d: %G", info.point, gsi_stmt (si)); } + + /* If the statement is part of a pattern, also add the other + pattern statements. */ + gimple_seq pattern_def_seq; + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info))) + { + gimple_stmt_iterator si2; + + for (si2 = gsi_start (pattern_def_seq); +!gsi_end_p (si2); +gsi_next (&si2)) + { + stmt_vec_info pattern_def_stmt_info + = vinfo->lookup_stmt (gsi_stmt (si2)); + if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info) + || STMT_VINFO_LIVE_P (pattern_def_stmt_info)) + { + stmt_point info = {point, gsi_stmt (si2), + pattern_def_stmt_info}; + program_points.safe_push (info); + point++; + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "program point %d: %G", + info.point, gsi_stmt (si2)); + } + } + } } program_points_per_bb.put (bb, program_points); } diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c new file mode 100644 index 000..5a514d1bbf6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114516.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zba_zbb -mabi=lp64d -mrvv-max-lmul=dynamic -O3 -fdump-tree-vect-details -Wno-pedantic" } */ + +typedef float real_t; +__attribute__((aligned(64))) real_t a[32000]; +real_t s315() +{ +for (int i = 0; i < 32000; i++) +a[i] = (i * 7) % 32000; +real_t x, chksum; +int index; +for (int nl = 0; nl < 256; nl++) { +x = a[0]; +index = 0; +for (int i = 0; i < 32000; ++i) { +if (a[i] > x) { +x = a[i]; +index = i; +} +} +chksum = x + (real_t) index; +} +return index + x + 1; +} + +/* { dg-final { scan-assembler {e32,m4} } } */ +/* { dg-final { scan-assembler-not {e32,m8} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Preferring smaller LMUL loop because it has unexpected spills" 1 "vect" } } */ -- 2.48.1 -- Regards Robin
Re: [PATCH] vect: Use original LHS type for gather pattern [PR118950].
On Mon, 24 Feb 2025, Robin Dapp wrote: > Hi, > > in PR118950 we do not zero masked elements in a gather load. > While recognizing a gather/scatter pattern we do not use the original > type of the LHS. This matters because the type can differ with bool > patterns (e.g. _Bool vs unsigned char) and we don't notice the need > for zeroing out the padding bytes. > > This patch just uses the original LHS's type. > > Bootstrapped and regtested on x86, aarch64, and power10. > Regtested on rv64gcv_zvl512b. OK. Thanks, Richard. > > PR middle-end/118950 > > gcc/ChangeLog: > > * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Use > original LHS's type. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/autovec/pr118950.c: New test. > --- > .../gcc.target/riscv/rvv/autovec/pr118950.c | 29 +++ > gcc/tree-vect-patterns.cc | 3 +- > 2 files changed, 31 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c > > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c > new file mode 100644 > index 000..604d4264eac > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118950.c > @@ -0,0 +1,29 @@ > +/* { dg-do run } */ > +/* { dg-require-effective-target riscv_v_ok } */ > +/* { dg-add-options riscv_v } */ > +/* { dg-additional-options "-std=gnu99 -Wno-pedantic" } */ > + > +unsigned char a; > +long long r; > +_Bool h = 1; > +short j[23]; > +_Bool k[3][23]; > + > +void b(_Bool h, short j[], _Bool k[][23]) { > + for (int m = 0; m < 23; m += 3) > +for (short n = 0; n < 22; n += 4) > + a = ({ > +unsigned char o = a; > +unsigned char p = j[n] ? h : k[m][n]; > +o > p ? o : p; > + }); > +} > + > +int main() { > + for (int m = 0; m < 23; ++m) > +j[m] = 10; > + b(h, j, k); > + r = a; > + if (r != 1) > +__builtin_abort (); > +} > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc > index 6fc97d1b6ef..4f0a7ea162b 100644 > --- a/gcc/tree-vect-patterns.cc > +++ b/gcc/tree-vect-patterns.cc > @@ -6022,7 +6022,8 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo, >else > pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base, > offset, scale, zero); > - tree load_lhs = vect_recog_temp_ssa_var (gs_info.element_type, NULL); > + tree lhs = gimple_get_lhs (stmt_info->stmt); > + tree load_lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); >gimple_call_set_lhs (pattern_stmt, load_lhs); > } >else > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
[PING] [PATCH v3] get source line for diagnostic from preprocessed file [PR preprocessor/79106]
Gentle ping for https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675450.html
Re: [PATCH] c++: Fix range for with PMFs [PR118923]
On 2/24/25 5:52 AM, Jakub Jelinek wrote: Hi! The following testcases segfault because the new range for -frange-for-ext-temps temporary extension extends even the internal TARGET_EXPRs created by get_member_function_from_ptrfunc. The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P. I'm not calling get_internal_target_expr because I really want to force the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR internal, Hmm, good point, but that seems like a reason to use force_target_expr in gen_internal_target_expr. and get_internal_target_expr doesn't take complain argument either and always uses tf_warning_or_error. That shouldn't be an issue for any internal temporaries; complain is only relevant for building the cleanup that internal temps must not have. It looks like there are a few more internal uses of force_target_expr in cp_finish_decl and build_comparison_op. Jason
Re: The COBOL front end, version 3, now in 14 easy pieces
On Wed, Feb 19, 2025 at 12:38 AM James K. Lowden wrote: > > The following 14 patches constitute 105,720 lines of code in 83 files > to build and document the COBOL front end. The messages are > in a more or less logical order. We have: > > 1/14 4K dir: create gcc/cobol and libgcobol directories > 2/14 8K pre: introduce ChangeLog files > 3/14 80K bld: config and build machinery > 4/14 376K hdr: header files > 5/14 152K lex: lexer > 6/14 476K par: parser > 7/14 344K cbl: parser support > 8/14 516K api: GENERIC interface > 9/14 244K gen: GENERIC interface support > 10/14 72K doc: man pages and GnuCOBOL emulation > 11/14 84K lhd: libgcobol header files > 12/14 320K lib: libgcobol support > 13/14 372K lcc: libgcobol, main file > 14/14 148K fun: libgcobol, intrinsic functions > > To slide under the 400 KB limit, the intrinsic functions now have > their own patch. The configure files are removed, as is the Posix > adapter framework. > > They are still against the master branch as of > > commit 3e08a4ecea27c54fda90e8f58641b1986ad957e1 > Date: Wed Feb 5 14:22:33 2025 -0700 > > Our repository is > > https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/ > > using branch > > cobol-stage > > I tested these patches using "git apply" to an unpublished branch > "cobol-patched". I have now built the compiler from the (now published) cobol-patched branch. On x86_64-linux and noticed the following issues: The toplevel configure script on the branch wasn't re-generated (duh), so it isn't ready for use. gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1, but have 3.0.4 %require "3.5.1" //3.8.2 also works, but not 3.8.0 ^^^ this requirement isn't documented, neither is a version requirement for flex. The place for this is in gcc/doc/install.texi During the build we write to gcc/cobol/charmaps-dupe.cc gcc/cobol/valconv-dupe.cc but sources might reside on a R/O mounted volume and the above is undesirable unless --enable-generated-files-in-srcdir. Installing the result via make install DESTDIR=/foo I see both a 'gcobol' and a 'gcobc' program being installed - is that intentional? I also see the gcobol.3 manpage reside directly in /foo/gcobol.3 rather than in the expected /foo/usr/local/man/.. Installing of libgcobol fails for me: make[4]: Entering directory '/home/rguenther/obj2/x86_64-pc-linux-gnu/32/libgcobol' Suppressing the 32-bit build because of lack of support for __int128 variables /bin/sh ../../../../gcc-cobol/libgcobol/../mkinstalldirs /home/rguenther/install/gcc-cobol/usr/local/lib /bin/sh ./libtool --mode=install /usr/bin/install -c libgcobol.la /home/rguenther/install/gcc-cobol/usr/local/lib; libtool: install: `libgcobol.la' is not a valid libtool archive libtool: install: Try `libtool --help --mode=install' for more information. make[4]: *** [Makefile:246: install-libs] Error 1 make[4]: Leaving directory '/home/rguenther/obj2/x86_64-pc-linux-gnu/32/libgcobol' make[3]: *** [Makefile:301: multi-do] Error 1 so the suppression seems incomplete? Compiling a Cobol Hello World results in > ./install/gcc-cobol/usr/local/bin/gcobol t.cob /usr/bin/ld: cannot find -lgcobol: No such file or directory collect2: error: ld returned 1 exit status possibly because the 64bit libcobol is installed in /foo/usr/local/lib/ rather than .../lib64/ and the former is not in the configured search directory. Manually fixing this up results in a working binary! Doing > ./install/gcc-cobol/usr/local/bin/gcobol t.cob -m32 only fails during linking as we do build a 32bit binary but do not find (the not built) 32bit runtime. I suspect when writing a program that would need 128bit integer or float we'd eventually ICE. The cobol frontend should attempt to intercept the selection of a not supported multilib (but don't ask me how) and gracefully exit, possibly calling sorry ("selected multilib not available for Cobol"); or so. There's the LANG_HOOKS_POST_OPTIONS langhook which might be a place to do this. The way to check is likely whether a mode for __int128 exists and is supported. The C fronted does if (targetm.scalar_mode_supported_p (TImode)) it also checks float128_type_node, but I didn't find who initializes that. Richard. > We have endeavored to address all must-fix issues raised in Round 2. > > 1. Generated files use Autoconf 2.69 > 2. Commit message matches mail Subject: line > 3. Various problems with Make-lang.in and cobol1.cc > 4. s/assert(false)/gcc_unreachable()/g > 5. Nixed range-based cases > 6. Removed Posix adapter files & generated configure scripts > 7. Explained memory-management engineering choice > 8. s/option_id/option_zero/g, for clarity > 9. GTY issues > 10. Require only C++14 (not 17) > 11. Moved #include > 12. Check regex buffer bounds outside gcc_assert > > Still to do (no particular order): > > 13. Try SARIF options > 14. Do not compose messages (I18N). > 15. Try valgrind for me
Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes
On 2/24/25 2:23 AM, Richard Biener wrote: On Fri, Feb 21, 2025 at 4:55 PM Konstantinos Eleftheriou wrote: Hi Richard, thanks for the feedback. On Tue, Feb 18, 2025 at 9:17 PM Richard Biener wrote: Am 18.02.2025 um 17:04 schrieb Konstantinos Eleftheriou : From: kelefth The pass rejects the transformation when there are instructions in the sequence that might throw an exception. This was added due to having cases that the load instruction contains a REG_EH_REGION note and moving it before the store instructions caused an error, as it was no longer the last instruction in the basic block. This patch handles those cases by moving a possible REG_EH_REGION note from the load instruction of the store-load sequence to the last instruction of the basic block. But that’s not a correct transform and will lead to bogus exception handling? You’d need to move the note and split the block, possibly updating the EH info on the side. I had originally thought about splitting the block, but tried to get away with this solution. In case that we are splitting the block after the load, we wouldn't need to also move the note, right? I don't think it makes much sense in trying to handle loads/stores with EH, you generally can't re-order stuff around EH since that changes what is observable from the EH handler and the EH might also guard exceptions on later stmts. But if all the old statements and all the new/adjusted statements are in the same EH region is this still a problem? Jeff
[PATCH] c++, v2: Fix range for with PMFs [PR118923]
On Mon, Feb 24, 2025 at 08:29:31AM -0500, Jason Merrill wrote: > On 2/24/25 5:52 AM, Jakub Jelinek wrote: > > The following testcases segfault because the new range for > > -frange-for-ext-temps > > temporary extension extends even the internal TARGET_EXPRs created by > > get_member_function_from_ptrfunc. > > > > The following patch fixes that by marking those TARGET_EXPR_INTERNAL_P. > > I'm not calling get_internal_target_expr because I really want to force > > the TARGET_EXPR creation rather than say mark some existing TARGET_EXPR > > internal, > > Hmm, good point, but that seems like a reason to use force_target_expr in > gen_internal_target_expr. > > > and get_internal_target_expr doesn't take complain argument > > either and always uses tf_warning_or_error. > > That shouldn't be an issue for any internal temporaries; complain is only > relevant for building the cleanup that internal temps must not have. > > It looks like there are a few more internal uses of force_target_expr in > cp_finish_decl and build_comparison_op. So like this if it passes full bootstrap/regtest? So far passed make check-g++ in gcc (98,11,14,17,20,23,26) and make check in libstdc++-v3. 2025-02-24 Jakub Jelinek PR c++/118923 * tree.cc (get_internal_target_expr): Use force_target_expr instead of build_target_expr_with_type. * typeck.cc (get_member_function_from_ptrfunc): Use get_internal_target_expr instead of force_target_expr. * decl.cc (cp_finish_decl): Likewise. * method.cc (build_comparison_op): Likewise. * g++.dg/cpp0x/pr118923.C: New test. * g++.dg/cpp1y/pr118923.C: New test. --- gcc/cp/tree.cc.jj 2025-02-24 00:06:25.776732504 +0100 +++ gcc/cp/tree.cc 2025-02-24 15:04:28.991381102 +0100 @@ -982,8 +982,7 @@ tree get_internal_target_expr (tree init) { init = convert_bitfield_to_declared_type (init); - tree t = build_target_expr_with_type (init, TREE_TYPE (init), - tf_warning_or_error); + tree t = force_target_expr (TREE_TYPE (init), init, tf_warning_or_error); TARGET_EXPR_INTERNAL_P (t) = true; return t; } --- gcc/cp/typeck.cc.jj 2025-01-23 11:17:39.651880614 +0100 +++ gcc/cp/typeck.cc2025-02-24 15:05:27.817562229 +0100 @@ -4219,16 +4219,14 @@ get_member_function_from_ptrfunc (tree * && !DECL_P (instance_ptr) && !TREE_CONSTANT (instance_ptr))) instance_ptr = instance_save_expr - = force_target_expr (TREE_TYPE (instance_ptr), instance_ptr, - complain); + = get_internal_target_expr (instance_ptr); /* See above comment. */ if (TREE_SIDE_EFFECTS (function) || (!nonvirtual && !DECL_P (function) && !TREE_CONSTANT (function))) - function - = force_target_expr (TREE_TYPE (function), function, complain); + function = get_internal_target_expr (function); /* Start by extracting all the information from the PMF itself. */ e3 = pfn_from_ptrmemfunc (function); --- gcc/cp/decl.cc.jj 2025-02-24 00:06:25.665734038 +0100 +++ gcc/cp/decl.cc 2025-02-24 15:08:52.027729921 +0100 @@ -9377,8 +9377,7 @@ cp_finish_decl (tree decl, tree init, bo tree guard = NULL_TREE; if (cleanups || cleanup) { - guard = force_target_expr (boolean_type_node, -boolean_false_node, tf_none); + guard = get_internal_target_expr (boolean_false_node); add_stmt (guard); guard = TARGET_EXPR_SLOT (guard); } @@ -9407,8 +9406,7 @@ cp_finish_decl (tree decl, tree init, bo popped that all, so push those extra cleanups around the whole sequence with a guard variable. */ gcc_assert (TREE_CODE (sl) == STATEMENT_LIST); - guard = force_target_expr (integer_type_node, -integer_zero_node, tf_none); + guard = get_internal_target_expr (integer_zero_node); add_stmt (guard); guard = TARGET_EXPR_SLOT (guard); for (unsigned i = 0; i < n_extra_cleanups; ++i) --- gcc/cp/method.cc.jj 2025-01-15 08:59:46.383217835 +0100 +++ gcc/cp/method.cc2025-02-24 15:10:18.674537705 +0100 @@ -1597,7 +1597,7 @@ build_comparison_op (tree fndecl, bool d /* Some other array, will need runtime loop. */ else { - idx = force_target_expr (sizetype, maxval, complain); + idx = get_internal_target_expr (maxval); loop_indexes = tree_cons (idx, NULL_TREE, loop_indexes); } expr_type = TREE_TYPE (expr_type); --- gcc/testsuite/g++.dg/cpp0x/pr118923.C.jj2025-02-24 15:02:04.966385945 +0100 +++ gcc/testsuite/g++.dg/cpp0x
[Fortran, Patch, PR107635, 5_v1] Use correct data size in caf_transfer_between_remotes
Hi all, I nearly forgot to publish this patch: When transfering data between two remote images, i.e. a third images asks image one to read some data and then asks image two to put that data into its memory, the size of the data to transfer between these two images was miscalculated. The attached patch fixes this. Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline? Btw, in theory this should be last patch for this PR. Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de From 14432031c7f247e2b4d7e76614553b5379d543b2 Mon Sep 17 00:00:00 2001 From: Andre Vehreschild Date: Fri, 21 Feb 2025 14:06:28 +0100 Subject: [PATCH 2/2] Fortran: Fix detection of descriptor arrays in coarray [PR107635] Look at the formal arguments generated type in the function declaration to figure if an argument is a descriptor arrays. Fix handling of class types while splitting coarray expressions. PR fortran/107635 gcc/fortran/ChangeLog: * coarray.cc (fixup_comp_refs): For class types set correct component (class) type. (split_expr_at_caf_ref): Provide location. * trans-intrinsic.cc (conv_caf_send_to_remote): Look at generated formal argument and not declared one to detect descriptor arrays. (conv_caf_sendget): Same. --- gcc/fortran/coarray.cc | 15 ++- gcc/fortran/trans-intrinsic.cc | 30 -- 2 files changed, 34 insertions(+), 11 deletions(-) diff --git a/gcc/fortran/coarray.cc b/gcc/fortran/coarray.cc index e5648e0d027..f53de0b20e3 100644 --- a/gcc/fortran/coarray.cc +++ b/gcc/fortran/coarray.cc @@ -295,11 +295,12 @@ move_coarray_ref (gfc_ref **from, gfc_expr *expr) static void fixup_comp_refs (gfc_expr *expr) { - gfc_symbol *type = expr->symtree->n.sym->ts.type == BT_DERIVED - ? expr->symtree->n.sym->ts.u.derived - : (expr->symtree->n.sym->ts.type == BT_CLASS - ? CLASS_DATA (expr->symtree->n.sym)->ts.u.derived - : nullptr); + bool class_ref = expr->symtree->n.sym->ts.type == BT_CLASS; + gfc_symbol *type += expr->symtree->n.sym->ts.type == BT_DERIVED + ? expr->symtree->n.sym->ts.u.derived + : (class_ref ? CLASS_DATA (expr->symtree->n.sym)->ts.u.derived + : nullptr); if (!type) return; gfc_ref **pref = &(expr->ref); @@ -317,6 +318,9 @@ fixup_comp_refs (gfc_expr *expr) ref = nullptr; break; } + if (class_ref) + /* Link to the class type to allow for derived type resolution. */ + (*pref)->u.c.sym = ref->u.c.sym; (*pref)->next = ref->next; ref->next = NULL; gfc_free_ref_list (ref); @@ -372,6 +376,7 @@ split_expr_at_caf_ref (gfc_expr *expr, gfc_namespace *ns, st->n.sym->attr.dummy = 1; st->n.sym->attr.intent = INTENT_IN; st->n.sym->ts = *caf_ts; + st->n.sym->declared_at = expr->where; *post_caf_ref_expr = gfc_get_variable_expr (st); (*post_caf_ref_expr)->where = expr->where; diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 80e98dc3c20..c97829fd8a8 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -1445,8 +1445,14 @@ conv_caf_send_to_remote (gfc_code *code) NULL_TREE, gfc_trans_force_lval (&block, lhs_se.string_length)); else opt_lhs_charlen = build_zero_cst (build_pointer_type (size_type_node)); - if (!TYPE_LANG_SPECIFIC (TREE_TYPE (caf_decl))->rank - || GFC_ARRAY_TYPE_P (TREE_TYPE (caf_decl))) + /* Get the third formal argument of the receiver function. (This is the + location where to put the data on the remote image.) Need to look at + the argument in the function decl, because in the gfc_symbol's formal + argument an array may have no descriptor while in the generated + function decl it has. */ + tmp = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (TYPE_ARG_TYPES ( + TREE_TYPE (receiver_fn_expr->symtree->n.sym->backend_decl); + if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) opt_lhs_desc = null_pointer_node; else opt_lhs_desc @@ -1635,8 +1641,14 @@ conv_caf_sendget (gfc_code *code) NULL_TREE, gfc_trans_force_lval (&block, lhs_se.string_length)); else opt_lhs_charlen = build_zero_cst (build_pointer_type (size_type_node)); - if (!TYPE_LANG_SPECIFIC (TREE_TYPE (lhs_caf_decl))->rank - || GFC_ARRAY_TYPE_P (TREE_TYPE (lhs_caf_decl))) + /* Get the third formal argument of the receiver function. (This is the + location where to put the data on the remote image.) Need to look at + the argument in the function decl, because in the gfc_symbol's formal + argument an array may have no descriptor while in the generated + function decl it has. */ + tmp = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (TYPE_ARG_TYPES ( + TREE_TYPE (receiver_fn_expr->symtree->n.sym->backend_decl); + if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) opt_lhs_desc = null_pointer_node; else opt_lhs_desc @@ -1677,8 +1689,14 @@ conv_caf_sendget (gfc_code *code) rhs_size = gfc_typenode_for_spec (ts)->type_common
Re: [Fortran, Patch, PR107635, 4_v2] Fix class type and descriptor handling for new coarray interface [PR107635]
Hi Harald, I have added some comment(s). Can you take another look? Regtested ok on x86_64-pc-linux-gnu / F41. Ok for mainline? Regards, Andre On Sat, 22 Feb 2025 17:36:55 +0100 Andre Vehreschild wrote: > Hi Harald, > > thanks for the review. Silently I'd hoped that there is some macro to get > the i-th argument, that I just haven't found and someone could point me to. > I will add a comment, when ko one comes up with the macro by Monday. > > Thanks, > Andre > Andre Vehreschild * ve...@gmx.de > Am 22. Februar 2025 15:29:20 schrieb Harald Anlauf : > > > Hi Andre, > > > > Am 21.02.25 um 14:35 schrieb Andre Vehreschild: > >> Hi all, > >> > >> during testing and compiling some larger coarray codes, I found a few > >> deficiencies. One was with handling class types when splitting the coarray > >> expression and the other was that the backend_decl of a formal argument in > >> a function's symbol was not the same as the one the function was compiled > >> to. So looking at the function-decl's tree n-th formal argument is the way > >> to go there. > >> > >> Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? > > > > I am amazed that you do not get lost handling 9-fold nested > > macros! This is OK, as this touches your CAF code. > > > > Otherwise, I'd recommend to add an explaining comment in the > > code or code such that mere mortals have a better chance to > > follow... > > > > Thanks, > > Harald > > > >> Regards, > >>Andre > >> -- > >> Andre Vehreschild * Email: vehre ad gmx dot de > -- Andre Vehreschild * Email: vehre ad gmx dot de From cecfd1aeb37d65b2880dcbdd55afacb2dde63de0 Mon Sep 17 00:00:00 2001 From: Andre Vehreschild Date: Wed, 19 Feb 2025 09:04:47 +0100 Subject: [PATCH 1/2] Fortran: Use correct size when transferring between images [PR107635] gcc/fortran/ChangeLog: PR fortran/107635 * trans-intrinsic.cc (conv_caf_sendget): Use the size of data transferred between the two images and not the descritor's size. --- gcc/fortran/trans-intrinsic.cc | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 2c4c47816c8..80e98dc3c20 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -1658,20 +1658,23 @@ conv_caf_sendget (gfc_code *code) gfc_init_se (&rhs_se, NULL); if (rhs_expr->rank == 0) { - gfc_conv_expr (&rhs_se, rhs_expr); - gfc_add_block_to_block (&block, &rhs_se.pre); opt_rhs_desc = null_pointer_node; if (rhs_expr->ts.type == BT_CHARACTER) { + gfc_conv_expr (&rhs_se, rhs_expr); + gfc_add_block_to_block (&block, &rhs_se.pre); opt_rhs_charlen = gfc_build_addr_expr ( NULL_TREE, gfc_trans_force_lval (&block, rhs_se.string_length)); rhs_size = build_int_cstu (size_type_node, rhs_expr->ts.kind); } else { + gfc_typespec *ts + = &sender_fn_expr->symtree->n.sym->formal->next->next->sym->ts; + opt_rhs_charlen = build_zero_cst (build_pointer_type (size_type_node)); - rhs_size = TREE_TYPE (rhs_se.expr)->type_common.size_unit; + rhs_size = gfc_typenode_for_spec (ts)->type_common.size_unit; } } else if (!TYPE_LANG_SPECIFIC (TREE_TYPE (rhs_caf_decl))->rank -- 2.48.1
Re: [PATCH] avoid-store-forwarding: Handle REG_EH_REGION notes
Hello, On Mon, 24 Feb 2025, Jeff Law wrote: > The pass rejects the transformation when there are instructions in the > sequence that might throw an exception. This was added due to having > cases that the load instruction contains a REG_EH_REGION note and > moving it before the store instructions caused an error, as it was > no longer the last instruction in the basic block. > > This patch handles those cases by moving a possible REG_EH_REGION > note from the load instruction of the store-load sequence to the > last instruction of the basic block. > >>> > >>> But that’s not a correct transform and will lead to bogus exception > >>> handling? You’d need to move the note and split the block, possibly > >>> updating the EH info on the side. > >> > >> I had originally thought about splitting the block, but tried to get > >> away with this solution. In case that we are splitting the block after > >> the load, we wouldn't need to also move the note, right? > > > > I don't think it makes much sense in trying to handle loads/stores with EH, > > you generally can't re-order stuff around EH since that changes what is > > observable from the EH handler and the EH might also guard exceptions > > on later stmts. > But if all the old statements and all the new/adjusted statements are in the > same EH region is this still a problem? It depends, but even if that were no problem in some specific cases (perhaps by ensuring that such sequence isn't intermingled with insns that are in different EH regions) that doesn't seem to be what the proposal is doing? From the description it moves a EH note from a load (where it presumably was for a reason) that occurs after a store to something that so happens to be the (new) last instruction of a bb. Because otherwise the load couldn't be moved earlier into the middle of a BB. In essence the load would lose the EH note and a new EH note would be generated for a different insn out of thin air. I don't see how that can ever be a correct transformation in general. What _if_ the very load traps then? What if the new insn cannot trap? (Note that loads and stores may trap differently even on the same address, e.g. write to read-only mem, or an misaligned load but an aligned (or smaller-sized) write; though I'll say that GCC assumes that loads cannot trap if shadowed by an equivalent write, in some other passes) Ciao, Michael.
Re: The COBOL front end, version 3, now in 14 easy pieces
On Mon, 24 Feb 2025 14:51:27 +0100 Richard Biener wrote: > gcc-cobol/gcc/cobol/parse.y:1361.10-16: error: require bison 3.5.1, > but have 3.0.4 > %require "3.5.1" //3.8.2 also works, but not 3.8.0 > ^^^ > > this requirement isn't documented, neither is a version requirement > for flex. The > place for this is in gcc/doc/install.texi I supplied a set of documentation patches separately on Friday, subject "info updates for gcobol". IIUC, you would like that patch applied to cobol-patched, too. I can do that. --jkl