Re: [PATCH] lto-plugin: add support for feature detection
On Sun, 15 May 2022, Rui Ueyama wrote: [snip] > > So get_symbols_v3 allows the linker to discard an LTO .o file to solve this. > > > > In absence of get_symbols_v3 mold tries to ensure correctness by restarting > > itself while appending a list of .o files to be discarded to its command > > line. > > > > I wonder if mold can invoke plugin cleanup callback to solve this without > > restarting. > > We can call the plugin cleanup callback from mold, but there are a few > problems: > > First of all, it looks like it is not clear what state the plugin cleanup > callback resets to. It may reset it to the initial state with which we > need to restart everything from calling `onload` callback, or it may not > deregister functions registered by the previous `onload` call. Since the > exact semantics is not documented, the LLVM gold plugin may behave > differently than the GCC plugin. Ack, that looks risky (and probably unnecessary, see below). > Second, if we reset a plugin's internal state, we need to register all > input files by calling the `claim_file_hook` callback, which in turn calls > the `add_symbols` callback. But we don't need any symbol information at > this point because mold already knows what are in LTO object files as it > calls `claim_file_hook` already on the same sets of files. So the > `add_symbols` invocations would be ignored, which is a waste of resources. > > So, I prefer get_symbols_v3 over calling the plugin cleanup callback. Oh, to be clear I wouldn't argue against implementing get_symbols_v3 in GCC. I was wondering if mold can solve this in another fashion without the self-restart trick. If I understood your design correctly, mold disregards the index in static archives, because it doesn't give you the dependency graph of contained objects (it only lists defined symbols, not used, I was mistaken about that in the previous email), and you wanted to let mold parse all archived objects in parallel instead of doing a multiphase scan where each phase extracts only the needed objects (in parallel). Is that correct? Is that a good tradeoff in the LTO case though? I believe you cannot assume the plugin to be thread-safe, so you're serializing its API calls, right? But the plugin is doing a lot of work, so using the index to feed it with as few LTO objects as possible should be a significant win, no? (even if it was thread-safe) And with the index, it should be rare that a file is spuriously added to the plugin, so maybe you could get away with issuing a warning or an error when the v2 API is used, but mold needs to discard a file? > > (also, hm, it seems to confirm my idea that LTO .o files should have had the > > correct symbol table so normal linker algorithms would work) > > I agree. If GCC LTO object file contains a correct ELF symbol table, we > can also eliminate the need of the special LTO-aware ar command. It looks > like it is a very common error to use an ar command that doesn't > understand the LTO object file, which results in mysterious "undefined > symbol" errors even though the object files in an archive file provide > that very symbols. Thanks. Alexander
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, May 15, 2022 at 3:53 PM Alexander Monakov wrote: > > On Sun, 15 May 2022, Rui Ueyama wrote: > > [snip] > > > So get_symbols_v3 allows the linker to discard an LTO .o file to solve > > > this. > > > > > > In absence of get_symbols_v3 mold tries to ensure correctness by > > > restarting > > > itself while appending a list of .o files to be discarded to its command > > > line. > > > > > > I wonder if mold can invoke plugin cleanup callback to solve this without > > > restarting. > > > > We can call the plugin cleanup callback from mold, but there are a few > > problems: > > > > First of all, it looks like it is not clear what state the plugin cleanup > > callback resets to. It may reset it to the initial state with which we > > need to restart everything from calling `onload` callback, or it may not > > deregister functions registered by the previous `onload` call. Since the > > exact semantics is not documented, the LLVM gold plugin may behave > > differently than the GCC plugin. > > Ack, that looks risky (and probably unnecessary, see below). > > > Second, if we reset a plugin's internal state, we need to register all > > input files by calling the `claim_file_hook` callback, which in turn calls > > the `add_symbols` callback. But we don't need any symbol information at > > this point because mold already knows what are in LTO object files as it > > calls `claim_file_hook` already on the same sets of files. So the > > `add_symbols` invocations would be ignored, which is a waste of resources. > > > > So, I prefer get_symbols_v3 over calling the plugin cleanup callback. > > Oh, to be clear I wouldn't argue against implementing get_symbols_v3 in GCC. > I was wondering if mold can solve this in another fashion without the > self-restart trick. > > If I understood your design correctly, mold disregards the index in static > archives, because it doesn't give you the dependency graph of contained > objects (it only lists defined symbols, not used, I was mistaken about that > in the previous email), and you wanted to let mold parse all archived > objects in parallel instead of doing a multiphase scan where each phase > extracts only the needed objects (in parallel). Is that correct? Correct. > Is that a good tradeoff in the LTO case though? I believe you cannot assume > the plugin to be thread-safe, so you're serializing its API calls, right? > But the plugin is doing a lot of work, so using the index to feed it with as > few LTO objects as possible should be a significant win, no? (even if it > was thread-safe) Oh, I didn't know that claim_file_hook isn't thread-safe. I need to add a lock to guard it then. But is it actually the case? As to the tradeoff, speculatively loading all object files from archives may not be beneficial if the loaded files are LTO object files. But we do this for consistency. We don't have a multi-phase name resolution pass at all in mold; all symbols are resolved at once in parallel. We don't want to implement another name resolution pass just for LTO for the following reasons: 1. It bloats up the linker's code. 2. We don't know whether an archive file contains an LTO object file or not until we actually read archive members, so there's no chance to switch to the multi-pass name resolution algorithm before we read files. 3. If we have two different name resolution algorithms, it is very hard to guarantee that both algorithms produce the same result. As a result, the output with -flto may behave differently without -flto. 4. We still have to handle --start-libs and --end-libs, so feeding an object file that will end up not being included into the output is unavoidable. > And with the index, it should be rare that a file is spuriously added to the > plugin, so maybe you could get away with issuing a warning or an error when > the v2 API is used, but mold needs to discard a file? > > > > (also, hm, it seems to confirm my idea that LTO .o files should have had > > > the > > > correct symbol table so normal linker algorithms would work) > > > > I agree. If GCC LTO object file contains a correct ELF symbol table, we > > can also eliminate the need of the special LTO-aware ar command. It looks > > like it is a very common error to use an ar command that doesn't > > understand the LTO object file, which results in mysterious "undefined > > symbol" errors even though the object files in an archive file provide > > that very symbols. > > Thanks. > Alexander
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, 15 May 2022, Rui Ueyama wrote: > > Is that a good tradeoff in the LTO case though? I believe you cannot assume > > the plugin to be thread-safe, so you're serializing its API calls, right? > > But the plugin is doing a lot of work, so using the index to feed it with as > > few LTO objects as possible should be a significant win, no? (even if it > > was thread-safe) > > Oh, I didn't know that claim_file_hook isn't thread-safe. I need to add a > lock to guard it then. But is it actually the case? You can see for yourself at https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=lto-plugin/lto-plugin.c (e.g. how claim_file_handler increments the global variable num_claimed_files) > As to the tradeoff, speculatively loading all object files from archives > may not be beneficial if the loaded files are LTO object files. But we do > this for consistency. We don't have a multi-phase name resolution pass at > all in mold; all symbols are resolved at once in parallel. We don't want > to implement another name resolution pass just for LTO for the following > reasons: > > 1. It bloats up the linker's code. > > 2. We don't know whether an archive file contains an LTO object file or > not until we actually read archive members, so there's no chance to switch > to the multi-pass name resolution algorithm before we read files. > > 3. If we have two different name resolution algorithms, it is very hard to > guarantee that both algorithms produce the same result. As a result, the > output with -flto may behave differently without -flto. Well, -flto can result in observably different results for other reasons anyway. > 4. We still have to handle --start-libs and --end-libs, so feeding an > object file that will end up not being included into the output is > unavoidable. Makes sense, but I still don't understand why mold wants to discover in advance whether the plugin is going to use get_symbols_v3. How would it help with what mold does today to handle the _v2 case? Alexander
[PATCH] rs6000: add support for sanitizers on FreeBSD
GCC's f732bf6a603721f61102a08ad2d023c7c2670870 merged LLVM's 315d792130258a9b7250494be8d002ebb427b08f, which added sanitizers support for PowerPC on FreeBSD, so this commit only enables building it. Enabled sanitizers are the same as on powerpc*-*-linux*. libsanitizer * configure.tgt: add powerpc*-*-freebsd* as supported --- libsanitizer/ChangeLog | 4 libsanitizer/configure.tgt | 2 ++ 2 files changed, 6 insertions(+) diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog index 52050be9476..17cc395aea1 100644 --- a/libsanitizer/ChangeLog +++ b/libsanitizer/ChangeLog @@ -1,3 +1,7 @@ +2022-05-15 Piotr Kubaj + + * configure.tgt: add powerpc*-*-freebsd* + 2022-05-05 Martin Liska * LOCAL_PATCHES: Update. diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt index fb89df4935c..affe8964f84 100644 --- a/libsanitizer/configure.tgt +++ b/libsanitizer/configure.tgt @@ -31,6 +31,8 @@ case "${target}" in TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo fi ;; + powerpc*-*-freebsd*) + ;; powerpc*-*-linux*) if test x$ac_cv_sizeof_void_p = x8; then TSAN_SUPPORTED=yes -- 2.36.0
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, May 15, 2022 at 4:51 PM Alexander Monakov wrote: > > On Sun, 15 May 2022, Rui Ueyama wrote: > > > > Is that a good tradeoff in the LTO case though? I believe you cannot > > > assume > > > the plugin to be thread-safe, so you're serializing its API calls, right? > > > But the plugin is doing a lot of work, so using the index to feed it with > > > as > > > few LTO objects as possible should be a significant win, no? (even if it > > > was thread-safe) > > > > Oh, I didn't know that claim_file_hook isn't thread-safe. I need to add a > > lock to guard it then. But is it actually the case? > > You can see for yourself at > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=lto-plugin/lto-plugin.c > (e.g. how claim_file_handler increments the global variable num_claimed_files) > > > As to the tradeoff, speculatively loading all object files from archives > > may not be beneficial if the loaded files are LTO object files. But we do > > this for consistency. We don't have a multi-phase name resolution pass at > > all in mold; all symbols are resolved at once in parallel. We don't want > > to implement another name resolution pass just for LTO for the following > > reasons: > > > > 1. It bloats up the linker's code. > > > > 2. We don't know whether an archive file contains an LTO object file or > > not until we actually read archive members, so there's no chance to switch > > to the multi-pass name resolution algorithm before we read files. > > > > 3. If we have two different name resolution algorithms, it is very hard to > > guarantee that both algorithms produce the same result. As a result, the > > output with -flto may behave differently without -flto. > > Well, -flto can result in observably different results for other reasons > anyway. > > > 4. We still have to handle --start-libs and --end-libs, so feeding an > > object file that will end up not being included into the output is > > unavoidable. > > Makes sense, but I still don't understand why mold wants to discover in > advance whether the plugin is going to use get_symbols_v3. How would it > help with what mold does today to handle the _v2 case? Currently, mold restarts itself to reset the internal state of the plugin. If we know in advance that get_symbols_v3 is supported, we can avoid that restart. That should make the linker a bit faster. Also, restarting the linker is a hack, so we want to avoid it if possible.
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, 15 May 2022, Rui Ueyama wrote: > > Makes sense, but I still don't understand why mold wants to discover in > > advance whether the plugin is going to use get_symbols_v3. How would it > > help with what mold does today to handle the _v2 case? > > Currently, mold restarts itself to reset the internal state of the plugin. > If we know in advance that get_symbols_v3 is supported, we can avoid that > restart. That should make the linker a bit faster. Also, restarting the > linker is a hack, so we want to avoid it if possible. Can you simply restart the linker on first call to get_symbols_v2 instead? Alexander
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, May 15, 2022 at 6:09 PM Alexander Monakov wrote: > > On Sun, 15 May 2022, Rui Ueyama wrote: > > > > Makes sense, but I still don't understand why mold wants to discover in > > > advance whether the plugin is going to use get_symbols_v3. How would it > > > help with what mold does today to handle the _v2 case? > > > > Currently, mold restarts itself to reset the internal state of the plugin. > > If we know in advance that get_symbols_v3 is supported, we can avoid that > > restart. That should make the linker a bit faster. Also, restarting the > > linker is a hack, so we want to avoid it if possible. > > Can you simply restart the linker on first call to get_symbols_v2 instead? I could, but it may not be a safe timing to call exec(2). I believe we are expected to call cleanup_hook after calling all_symbols_read_hook, and it is not clear what will happen if we abruptly terminate and restart the current process. For example, doesn't it leave temporary files on disk?
[PATCH] rs6000: add support for sanitizers on FreeBSD
GCC's f732bf6a603721f61102a08ad2d023c7c2670870 merged LLVM's 315d792130258a9b7250494be8d002ebb427b08f, which added sanitizers support for PowerPC on FreeBSD, so this commit only enables building it. Enabled sanitizers are the same as on powerpc*-*-linux*. libsanitizer * configure.tgt: add powerpc*-*-freebsd* as supported --- libsanitizer/ChangeLog | 4 libsanitizer/configure.tgt | 2 ++ 2 files changed, 6 insertions(+) diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog index 52050be9476..17cc395aea1 100644 --- a/libsanitizer/ChangeLog +++ b/libsanitizer/ChangeLog @@ -1,3 +1,7 @@ +2022-05-15 Piotr Kubaj + + * configure.tgt: add powerpc*-*-freebsd* + 2022-05-05 Martin Liska * LOCAL_PATCHES: Update. diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt index fb89df4935c..affe8964f84 100644 --- a/libsanitizer/configure.tgt +++ b/libsanitizer/configure.tgt @@ -31,6 +31,8 @@ case "${target}" in TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo fi ;; + powerpc*-*-freebsd*) + ;; powerpc*-*-linux*) if test x$ac_cv_sizeof_void_p = x8; then TSAN_SUPPORTED=yes -- 2.36.0
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, 15 May 2022, Rui Ueyama wrote: > > Can you simply restart the linker on first call to get_symbols_v2 instead? > > I could, but it may not be a safe timing to call exec(2). I believe we > are expected to call cleanup_hook after calling all_symbols_read_hook, > and it is not clear what will happen if we abruptly terminate and > restart the current process. For example, doesn't it leave temporary > files on disk? Regarding files, as far as I can tell, GCC plugin will leave a 'resolution file' on disk, but after re-exec it would recreate it anyway. Alexander
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, May 15, 2022 at 7:37 PM Alexander Monakov wrote: > > On Sun, 15 May 2022, Rui Ueyama wrote: > > > > Can you simply restart the linker on first call to get_symbols_v2 instead? > > > > I could, but it may not be a safe timing to call exec(2). I believe we > > are expected to call cleanup_hook after calling all_symbols_read_hook, > > and it is not clear what will happen if we abruptly terminate and > > restart the current process. For example, doesn't it leave temporary > > files on disk? > > Regarding files, as far as I can tell, GCC plugin will leave a 'resolution > file' > on disk, but after re-exec it would recreate it anyway. Does it recreate a temporary file with the same file name so that there's no temporary file left on the disk after the linker finishes doing LTO?
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, 15 May 2022, Rui Ueyama wrote: > > Regarding files, as far as I can tell, GCC plugin will leave a 'resolution > > file' > > on disk, but after re-exec it would recreate it anyway. > > Does it recreate a temporary file with the same file name so that > there's no temporary file left on the disk after the linker finishes > doing LTO? Resolution file name is taken from the command line option '-fresolution=', so it's a stable name (supplied by the compiler driver). Alexander
Re: [PATCH] gdc 9, 10 and 11 bug fix
Excerpts from Marc Aurèle La France's message of Mai 12, 2022 10:29 pm: > Greetings. > > No compiler has any business rejecting files for the sole crime of being > symlinked to. The following applies, modulo patch fuzz, to the 9, 10 and 11 > series of compilers. > > Given my use of shadow trees, this bug attempted to prevent me from building > 12.1.0. The D-based gdc in 12.1.0 and up does not exhibit this quirky > behaviour. > Hi Marc, Thanks, I've checked upstream and see the following change: https://github.com/dlang/dmd/pull/11836/commits/ebda81e44fd0ca4b247a1860d9bef411c41c16cb It should be fine to just backport that. Iain.
Re: libgompd: ADD OMPD support and global ICV functions
Ping في الجمعة، ١٣ مايو، ٢٠٢٢ ٩:١٩ م Mohamed Atef كتب: > Hello Jakub, >I am sorry, but should #ifdef __ELF__ put and separate file and also > the actual functions (e.g. extern ompd_dll_location_valid (void)) > I mean both in the same files or the functions should be in the > omp-tools.h but with #ifndef __ELF__ > > > Mohamed > > On Fri, May 13, 2022 at 8:27 PM Jakub Jelinek wrote: > >> On Fri, May 13, 2022 at 08:22:41PM +0200, Mohamed Atef wrote: >> > > As I've tried to explain, this #ifdef __ELF__ doesn't belong >> > > to the public header, which should contain just >> > > extern void ompd_dll_locations_valid (void) __GOMPD_NOTHROW; >> > > The #define should be in some internal header that is included >> > > after the public one. >> > > >> > I will put the #define in ompd-support.h is that okay? >> > And i will put ompd_bp_* labels there too. >> >> Ok. >> >> Jakub >> >>
Re: libgompd: ADD OMPD support and global ICV functions
في الأحد، ١٥ مايو، ٢٠٢٢ ٤:١٨ م Mohamed Atef كتب: > Ping > > في الجمعة، ١٣ مايو، ٢٠٢٢ ٩:١٩ م Mohamed Atef > كتب: > >> Hello Jakub, >>I am sorry, but should #ifdef __ELF__ put and separate file and also >> the actual functions (e.g. extern ompd_dll_location_valid (void)) >> I mean both in the same files or the functions should be in the >> omp-tools.h but with #ifndef __ELF__ >> >> >> Mohamed >> > I am pinging this not the patch. Thanks. > >> On Fri, May 13, 2022 at 8:27 PM Jakub Jelinek wrote: >> >>> On Fri, May 13, 2022 at 08:22:41PM +0200, Mohamed Atef wrote: >>> > > As I've tried to explain, this #ifdef __ELF__ doesn't belong >>> > > to the public header, which should contain just >>> > > extern void ompd_dll_locations_valid (void) __GOMPD_NOTHROW; >>> > > The #define should be in some internal header that is included >>> > > after the public one. >>> > > >>> > I will put the #define in ompd-support.h is that okay? >>> > And i will put ompd_bp_* labels there too. >>> >>> Ok. >>> >>> Jakub >>> >>>
[pushed] c++: array {}-init [PR105589]
My patch for 105191 made us use build_value_init more frequently from build_vec_init_expr, but build_value_init doesn't like to be called to initialize a class in a template. That's caused trouble in the past, and seems like a strange restriction, so let's fix it. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/105589 PR c++/105191 PR c++/92385 gcc/cp/ChangeLog: * init.cc (build_value_init): Handle class in template. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/initlist-array16.C: New test. --- gcc/cp/init.cc| 7 +++ gcc/testsuite/g++.dg/cpp0x/initlist-array16.C | 11 +++ 2 files changed, 14 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array16.C diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc index f1ed9336dc9..a4a0a0ac0c2 100644 --- a/gcc/cp/init.cc +++ b/gcc/cp/init.cc @@ -343,10 +343,6 @@ build_value_init (tree type, tsubst_flags_t complain) A program that calls for default-initialization or value-initialization of an entity of reference type is ill-formed. */ - /* The AGGR_INIT_EXPR tweaking below breaks in templates. */ - gcc_assert (!processing_template_decl - || (SCALAR_TYPE_P (type) || TREE_CODE (type) == ARRAY_TYPE)); - if (CLASS_TYPE_P (type) && type_build_ctor_call (type)) { tree ctor @@ -354,6 +350,9 @@ build_value_init (tree type, tsubst_flags_t complain) NULL, type, LOOKUP_NORMAL, complain); if (ctor == error_mark_node || TREE_CONSTANT (ctor)) return ctor; + if (processing_template_decl) + /* The AGGR_INIT_EXPR tweaking below breaks in templates. */ + return build_min (CAST_EXPR, type, NULL_TREE); tree fn = NULL_TREE; if (TREE_CODE (ctor) == CALL_EXPR) fn = get_callee_fndecl (ctor); diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-array16.C b/gcc/testsuite/g++.dg/cpp0x/initlist-array16.C new file mode 100644 index 000..bb1d8d84704 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/initlist-array16.C @@ -0,0 +1,11 @@ +// PR c++/105589 +// { dg-do compile { target c++11 } } + +struct X { X(); }; + +struct array { X m[2]; }; + +template +void f() { + array w = array{}; +} base-commit: c5397682aff4ae9ced15ddc74971b9b6e218b664 -- 2.27.0
[pushed] c++: parsing injected-class-name as template
While I was backporting the patch for PR102300, it occurred to me that it would be cleaner to look through the injected-class-name earlier in the function. I don't think this changes any test results. Tested x86_64-pc-linux-gnu, applying to trunk. gcc/cp/ChangeLog: * parser.cc (cp_parser_template_name): Look through injected-class-name. --- gcc/cp/parser.cc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 8969ed0076a..d4dab6cf84e 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -18646,6 +18646,9 @@ cp_parser_template_name (cp_parser* parser, (9.3.4), or in a type-only context other than a nested-name-specifier (13.8). */ + /* Handle injected-class-name. */ + decl = maybe_get_template_decl_from_type_decl (decl); + /* If DECL is a template, then the name was a template-name. */ if (TREE_CODE (decl) == TEMPLATE_DECL) { base-commit: ce46d6041358052dfa26f3720732f0357c5d72e7 -- 2.27.0
[pushed] c++: hidden friend access [DR1699]
It has come up several times that Clang considers hidden friends of a class to be sufficiently memberly to be covered by a friend declaration naming the class. This is somewhat unclear in the standard: [class.friend] says "Declaring a class to be a friend implies that private and protected members of the class granting friendship can be named in the base-specifiers and member declarations of the befriended class." A hidden friend is a syntactic member-declaration, but is it a "member declaration"? CWG was ambivalent, and referred the question to EWG as a design choice. But recently Patrick mentioned that the current G++ choice not to treat it as a "member declaration" was making his library work significantly more cumbersome, so let's go ahead and vote the other way. This means that the testcases for 100502 and 58993 are now accepted. Tested x86_64-pc-linux-gnu, applying to trunk. DR1699 PR c++/100502 PR c++/58993 gcc/cp/ChangeLog: * friend.cc (is_friend): Hidden friends count as members. * search.cc (friend_accessible_p): Likewise. gcc/testsuite/ChangeLog: * g++.dg/template/access37.C: Now OK. * g++.dg/template/friend69.C: Now OK. * g++.dg/lookup/friend23.C: New test. --- gcc/cp/friend.cc | 2 ++ gcc/cp/search.cc | 7 ++- gcc/testsuite/g++.dg/lookup/friend23.C | 17 + gcc/testsuite/g++.dg/template/access37.C | 8 gcc/testsuite/g++.dg/template/friend69.C | 4 ++-- 5 files changed, 27 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/g++.dg/lookup/friend23.C diff --git a/gcc/cp/friend.cc b/gcc/cp/friend.cc index 124ed4f3962..bf37dadeb62 100644 --- a/gcc/cp/friend.cc +++ b/gcc/cp/friend.cc @@ -131,6 +131,8 @@ is_friend (tree type, tree supplicant) { if (DECL_FUNCTION_MEMBER_P (supplicant)) context = DECL_CONTEXT (supplicant); + else if (tree fc = DECL_FRIEND_CONTEXT (supplicant)) + context = fc; else context = NULL_TREE; } diff --git a/gcc/cp/search.cc b/gcc/cp/search.cc index b86b3a24080..10863a40b11 100644 --- a/gcc/cp/search.cc +++ b/gcc/cp/search.cc @@ -734,12 +734,9 @@ friend_accessible_p (tree scope, tree decl, tree type, tree otype) && friend_accessible_p (DECL_CONTEXT (scope), decl, type, otype)) return 1; /* Perhaps SCOPE is a friend function defined inside a class from which -DECL is accessible. Checking this is necessary only when the class -is dependent, for otherwise add_friend will already have added the -class to SCOPE's DECL_BEFRIENDING_CLASSES. */ +DECL is accessible. */ if (tree fctx = DECL_FRIEND_CONTEXT (scope)) - if (dependent_type_p (fctx) - && protected_accessible_p (decl, fctx, type, otype)) + if (friend_accessible_p (fctx, decl, type, otype)) return 1; } diff --git a/gcc/testsuite/g++.dg/lookup/friend23.C b/gcc/testsuite/g++.dg/lookup/friend23.C new file mode 100644 index 000..f7b26c9e3ae --- /dev/null +++ b/gcc/testsuite/g++.dg/lookup/friend23.C @@ -0,0 +1,17 @@ +template +struct base { + friend void bar(Derived& d) { +d.bar(); // access in inline friend of friend, ok? + } +}; + +class derived : base { + friend class base; + void bar() {} +}; + +int main() { + derived d; + bar(d); +} + diff --git a/gcc/testsuite/g++.dg/template/access37.C b/gcc/testsuite/g++.dg/template/access37.C index 5be532c75b0..407a7dc0f2d 100644 --- a/gcc/testsuite/g++.dg/template/access37.C +++ b/gcc/testsuite/g++.dg/template/access37.C @@ -6,10 +6,10 @@ struct EnumeratorRange { EnumeratorRange range_; friend void f(Iterator i) { - i.range_.end_reached_; // { dg-error "private" } - i.range_.EnumeratorRange::end_reached_; // { dg-error "private" } - &i.range_.end_reached_; // { dg-error "private" } - &i.range_.EnumeratorRange::end_reached_; // { dg-error "private" } + i.range_.end_reached_; + i.range_.EnumeratorRange::end_reached_; + &i.range_.end_reached_; + &i.range_.EnumeratorRange::end_reached_; } }; diff --git a/gcc/testsuite/g++.dg/template/friend69.C b/gcc/testsuite/g++.dg/template/friend69.C index f3086a9f980..9bec6ba5846 100644 --- a/gcc/testsuite/g++.dg/template/friend69.C +++ b/gcc/testsuite/g++.dg/template/friend69.C @@ -12,7 +12,7 @@ protected: struct A { friend void g(A) { -B::f(); // { dg-error "private" } -B::g(); // { dg-error "protected" } +B::f(); +B::g(); } }; base-commit: ce46d6041358052dfa26f3720732f0357c5d72e7 prerequisite-patch-id: 4e382fc8327c2191fe57f2111af0c9830dc85d71 -- 2.27.0
[PATCH] i386: Remove constraints when used with constant integer predicates.
const_int_operand and other const*_operand predicates do not need constraints when the constraint is inherited from the range of constant integer predicate. Remove the constraint in case all alternatives use the same inherited constraint. 2022-05-15 Uroš Bizjak gcc/ChangeLog: * config/i386/i386.md: Remove constraints when used with const_int_operand, const0_operand, const_1_operand, constm1_operand, const8_operand, const128_operand, const248_operand, const123_operand, const2367_operand, const1248_operand, const359_operand, const_4_or_8_to_11_operand, const48_operand, const_0_to_1_operand, const_0_to_3_operand, const_0_to_4_operand, const_0_to_5_operand, const_0_to_7_operand, const_0_to_15_operand, const_0_to_31_operand, const_0_to_63_operand, const_0_to_127_operand, const_0_to_255_operand, const_0_to_255_mul_8_operand, const_1_to_31_operand, const_1_to_63_operand, const_2_to_3_operand, const_4_to_5_operand, const_4_to_7_operand, const_6_to_7_operand, const_8_to_9_operand, const_8_to_11_operand, const_8_to_15_operand, const_10_to_11_operand, const_12_to_13_operand, const_12_to_15_operand, const_14_to_15_operand, const_16_to_19_operand, const_16_to_31_operand, const_20_to_23_operand, const_24_to_27_operand and const_28_to_31_operand. * config/i386/mmx.md: Ditto. * config/i386/sse.md: Ditto. * config/i386/subst.md: Ditto. * config/i386/sync.md: Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. p.diff.txt.gz Description: application/gzip
[GCC 12][committed] d: Merge upstream dmd a53934d18, phobos 604534d7c.
Hi, Upstream dmd has now released v2.100.0, this patch merges in the latest bug fixes since the last sync-up of the release branch. D front-end changes: - Import dmd v2.100.0 release. - The new behavior of issuing deprecation messages for scope violations has been reverted until next release. Phobos changes: - Import phobos v2.100.0 release. Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to the releases/gcc-12 branch. Regards, Iain. --- gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd a53934d18. * dmd/VERSION: Update version to v2.100.0. * d-codegen.cc (d_decl_context): Use resolvedLinkage to get declaration linkage. (build_struct_literal): Track offset in bits. * d-gimplify.cc (d_gimplify_modify_expr): Check both operands for a bit-field reference. * d-lang.cc (d_post_options): Set flag_rtti and flag_exceptions if -fno-druntime was seen on command-line. (d_type_promotes_to): Use resolvedLinkage to get declaration linkage. * decl.cc (make_thunk): Likewise. * types.cc (layout_aggregate_members): Ignore anonymous fields in total count. libphobos/ChangeLog: * src/MERGE: Merge upstream phobos 604534d7c. --- gcc/d/d-codegen.cc| 17 - gcc/d/d-gimplify.cc | 3 +- gcc/d/d-lang.cc | 16 +++-- gcc/d/decl.cc | 4 +-- gcc/d/dmd/MERGE | 2 +- gcc/d/dmd/VERSION | 2 +- gcc/d/dmd/clone.d | 22 +++- gcc/d/dmd/dclass.d| 2 +- gcc/d/dmd/declaration.d | 10 -- gcc/d/dmd/declaration.h | 3 +- gcc/d/dmd/dmangle.d | 10 +++--- gcc/d/dmd/dsymbolsem.d| 20 +-- gcc/d/dmd/dtemplate.d | 2 +- gcc/d/dmd/dtoh.d | 19 ++ gcc/d/dmd/escape.d| 2 +- gcc/d/dmd/expressionsem.d | 35 +-- gcc/d/dmd/func.d | 17 - gcc/d/dmd/initsem.d | 12 +++ gcc/d/dmd/json.d | 2 +- gcc/d/dmd/mtype.d | 25 +++-- gcc/d/dmd/objc.d | 6 ++-- gcc/d/dmd/semantic2.d | 15 gcc/d/dmd/semantic3.d | 2 +- gcc/d/dmd/traits.d| 6 ++-- gcc/d/types.cc| 10 -- gcc/testsuite/gdc.test/compilable/test23097.d | 33 + .../extra-files/test23109/object.d| 17 + .../gdc.test/fail_compilation/fail12604.d | 4 +-- .../gdc.test/fail_compilation/fail23108a.d| 16 + .../gdc.test/fail_compilation/fail23108b.d| 18 ++ .../gdc.test/fail_compilation/fail23109.d | 12 +++ .../gdc.test/fail_compilation/fail3703.d | 4 +-- .../gdc.test/fail_compilation/fail_scope.d| 30 .../gdc.test/fail_compilation/ice23097.d | 28 +++ .../fail_compilation/imports/test23109a.d | 10 ++ .../fail_compilation/imports/test23109b.d | 10 ++ .../fail_compilation/imports/test23109c.d | 3 ++ .../gdc.test/fail_compilation/test9150.d | 2 +- gcc/testsuite/gdc.test/runnable/test20734.d | 28 +++ libphobos/src/MERGE | 2 +- 40 files changed, 375 insertions(+), 106 deletions(-) create mode 100644 gcc/testsuite/gdc.test/compilable/test23097.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/extra-files/test23109/object.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail23108a.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail23108b.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/fail23109.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/ice23097.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test23109a.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test23109b.d create mode 100644 gcc/testsuite/gdc.test/fail_compilation/imports/test23109c.d create mode 100644 gcc/testsuite/gdc.test/runnable/test20734.d diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc index bb96b2f8d28..22090a8c755 100644 --- a/gcc/d/d-codegen.cc +++ b/gcc/d/d-codegen.cc @@ -76,7 +76,7 @@ d_decl_context (Dsymbol *dsym) but only for extern(D) symbols. */ if (parent->isModule ()) { - if ((decl != NULL && decl->linkage != LINK::d) + if ((decl != NULL && decl->resolvedLinkage () != LINK::d) || (ad != NULL && ad->classKind != ClassKind::d)) return NULL_TREE;
Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.
ping. On Sat, May 7, 2022 at 1:05 PM liuhongt via Gcc-patches wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc (ix86_expand_branch): Use ptest > for QImode when code is EQ or NE. > * config/i386/sse.md (cbranch4): Extend to OImode. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr104610.c: New test. > --- > gcc/config/i386/i386-expand.cc | 10 +- > gcc/config/i386/sse.md | 8 ++-- > gcc/testsuite/gcc.target/i386/pr104610.c | 15 +++ > 3 files changed, 30 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr104610.c > > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc > index bc806ffa283..c2f8776102c 100644 > --- a/gcc/config/i386/i386-expand.cc > +++ b/gcc/config/i386/i386-expand.cc > @@ -2267,11 +2267,19 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx > op1, rtx label) > >/* Handle special case - vector comparsion with boolean result, transform > it using ptest instruction. */ > - if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) > + if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT > + || (mode == OImode && (code == EQ || code == NE))) > { >rtx flag = gen_rtx_REG (CCZmode, FLAGS_REG); >machine_mode p_mode = GET_MODE_SIZE (mode) == 32 ? V4DImode : V2DImode; > > + if (mode == OImode) > + { > + op0 = lowpart_subreg (p_mode, force_reg (mode, op0), mode); > + op1 = lowpart_subreg (p_mode, force_reg (mode, op1), mode); > + mode = p_mode; > + } > + >gcc_assert (code == EQ || code == NE); >/* Generate XOR since we can't check that one operand is zero vector. > */ >tmp = gen_reg_rtx (mode); > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 7b791def542..9514b8e0234 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -26034,10 +26034,14 @@ (define_expand > "maskstore" > (match_operand: 2 "register_operand")))] >"TARGET_AVX512BW") > > +(define_mode_iterator VI48_OI_AVX > + [(V8SI "TARGET_AVX") (V4DI "TARGET_AVX") (OI "TARGET_AVX") > + V4SI V2DI]) > + > (define_expand "cbranch4" >[(set (reg:CC FLAGS_REG) > - (compare:CC (match_operand:VI48_AVX 1 "register_operand") > - (match_operand:VI48_AVX 2 "nonimmediate_operand"))) > + (compare:CC (match_operand:VI48_OI_AVX 1 "register_operand") > + (match_operand:VI48_OI_AVX 2 "nonimmediate_operand"))) > (set (pc) (if_then_else >(match_operator 0 "bt_comparison_operator" > [(reg:CC FLAGS_REG) (const_int 0)]) > diff --git a/gcc/testsuite/gcc.target/i386/pr104610.c > b/gcc/testsuite/gcc.target/i386/pr104610.c > new file mode 100644 > index 000..00866238bd7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr104610.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mmove-max=256 -mstore-max=256" } */ > +/* { dg-final { scan-assembler-times {(?n)vptest.*ymm} 1 } } */ > +/* { dg-final { scan-assembler-times {sete} 1 } } */ > +/* { dg-final { scan-assembler-not {(?n)je.*L[0-9]} } } */ > +/* { dg-final { scan-assembler-not {(?n)jne.*L[0-9]} } } */ > + > + > +#include > +__attribute__((target("avx"))) > +bool f256(char *a) > +{ > + char t[] = "0123456789012345678901234567890"; > + return __builtin_memcmp(a, &t[0], sizeof(t)) == 0; > +} > -- > 2.18.1 > -- BR, Hongtao
Re: [PATCH] lto-plugin: add support for feature detection
On Sun, May 15, 2022 at 8:07 PM Alexander Monakov wrote: > > On Sun, 15 May 2022, Rui Ueyama wrote: > > > > Regarding files, as far as I can tell, GCC plugin will leave a > > > 'resolution file' > > > on disk, but after re-exec it would recreate it anyway. > > > > Does it recreate a temporary file with the same file name so that > > there's no temporary file left on the disk after the linker finishes > > doing LTO? > > Resolution file name is taken from the command line option '-fresolution=', > so it's a stable name (supplied by the compiler driver). If it is a guaranteed behavior that GCC of all versions that support only get_symbols_v2 don't leave a temporary file behind if it is suddenly disrupted during get_symbols_v2 execution, then yes, mold can restart itself when get_symbols_v2 is called for the first time. Is this what you want? I'm fine with that approach if it is guaranteed to work by GCC developers.
[PATCH v5, rs6000] Add a combine pattern for CA minus one [PR95737]
Hi, This patch adds a combine pattern for "CA minus one". As CA only has two values (0 or 1), we could convert following pattern (sign_extend:DI (plus:SI (reg:SI 98 ca) (const_int -1 [0x] to (plus:DI (reg:DI 98 ca) (const_int -1 [0x]))) With this patch, one unnecessary sign extend is eliminated. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2022-05-16 Haochen Gui gcc/ PR target/95737 * config/rs6000/rs6000.md (subfsi3_carry_in_xx_64): New. gcc/testsuite/ PR target/95737 * gcc.target/powerpc/pr95737.c: New. patch.diff diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 64049a6e521..b97ac453fc0 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -2353,6 +2353,19 @@ (define_insn "subf3_carry_in_xx" "subfe %0,%0,%0" [(set_attr "type" "add")]) +(define_insn_and_split "*subfsi3_carry_in_xx_64" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (sign_extend:DI (plus:SI (reg:SI CA_REGNO) +(const_int -1] + "TARGET_POWERPC64" + "#" + "" + [(parallel [(set (match_dup 0) + (plus:DI (reg:DI CA_REGNO) + (const_int -1))) + (clobber (reg:DI CA_REGNO))])] + "" +) (define_insn "@neg2" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c b/gcc/testsuite/gcc.target/powerpc/pr95737.c new file mode 100644 index 000..30f0f819393 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c @@ -0,0 +1,11 @@ +/* PR target/95737 */ +/* { dg-do compile } */ +/* Disable isel for P9 and later */ +/* { dg-options "-O2 -mno-isel" } */ +/* { dg-final { scan-assembler-not {\mextsw\M} } } */ + + +unsigned long negativeLessThan (unsigned long a, unsigned long b) +{ + return -(a < b); +}
Re: Supporting RISC-V Vendor Extensions in the GNU Toolchain
On Fri, May 13, 2022 at 3:38 AM Philipp Tomsich wrote: > On Fri, 13 May 2022 at 12:00, Christoph Müllner > wrote: > > > > On Wed, May 11, 2022 at 2:02 AM Palmer Dabbelt > wrote: > > > > > > [Sorry for cross-posting to a bunch of lists, I figured it'd be best to > > > have all the discussions in one thread.] > > > > > > We currently only support what is defined by official RISC-V > > > specifications in the various GNU toolchain projects. There's > certainly > > > some grey areas there, but in general that means not taking code that > > > relies on drafts or vendor defined extensions, even if that would > result > > > in higher performance or more featured systems for users. > > > > > > The original goal of these policies were to steer RISC-V implementers > > > towards a common set of specifications, but over the last year or so > > > it's become abundantly clear that this is causing more harm that good. > > > All extant RISC-V systems rely on behaviors defined outside the > official > > > specifications, and while that's technically always been the case we've > > > gotten to the point where trying to ignore that fact is impacting real > > > users on real systems. There's been consistent feedback from users > that > > > we're not meeting their needs, which can clearly be seen in the many > out > > > of tree patch sets in common use. > > > > > > There's been a handful of discussions about this, but we've yet to have > > > a proper discussion on the mailing lists. From the various discussions > > > I've had it seems that folks are broadly in favor of supporting vendor > > > extensions, but the devil's always in the details with this sort of > > > thing so I thought it'd be best to write something up so we can have a > > > concrete discussion. > > > > > > The idea is to start taking code that depends on vendor-defined > behavior > > > into the core GNU toolchain ports, as long as it meets the following > > > criteria: > > > > > > * An ISA manual is available that can be redistributed/archived, > defines > > > the behaviors in question as one or more vendor-specific extensions, > > > and is clearly versioned. The RISC-V foundation is setting various > > > guidelines around how vendor-defined extensions and instructions > > > should be named, we strongly suggest that vendors follow those > > > conventions whenever possible (this is all new, though, so exactly > > > what's necessary from vendor specifications will likely evolve as we > > > learn). > > > * There is a substantial user base that depends on the behavior in > > > question, which probably means there is hardware in the wild that > > > implements the extensions and users that require those extensions in > > > order for that hardware to be useful for common applications. This > is > > > always going to be a grey area, but it's essentially the same spot > > > everyone else is in. > > I must take exception to the "There is a substantial user base" rule, > as this conflicts with the goal of avoiding fragmentation: the support > for vendor-defined extensions should ideally have landed in an > upstream release before the silicon is widely released. The "substantial user base" standard is specious. Other recent emails have suggested that phrase is a euphemism for "widely available consumer-programmable product." It fails to account for RISC-V's most important constituency: people using open-source toolchains to program non-mass-market parts. Our existing regime of "no custom extension support in standard software" is draconian, but at least it's self-consistent. I'd support retaining it. But, if we do change it, it's preposterous to reject extensions like XVentanaCondOps on the basis that their silicon isn't available on AliExpress. This would > see these extensions being sent upstream significantly before > widespread sampling (and sometimes around the time of the announcement > of a device on the roadmap). Simply put: I want everyone defining > vendor extensions to contribute to our mainline development efforts > instead of extending their own ancient forks. > > I suspect that this rule is intended to ensure that experimental, > purely academic, or "closed" (as in: even if you have the silicon, it > will be so deeply embedded that no one can run their own software — > e.g. radio baseband controllers) extensions don't make the maintenance > work harder. If that is the case: could we use wording such as (a > native speaker might wordsmith something more accurate) "accessible to > run user-developed software" and "intended for a wider audience"? > > > > * There is a mechanism for testing the code in question without direct > > > access to hardware, which in practice means a QEMU port (or whatever > > > simulator is relevant in the space and that folks use for testing) or > > > some community commitment to long-term availability of the hardware > > > for testing (something like the GCC compile farm, for exampl
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
On Fri, 13 May 2022, Tamar Christina wrote: > Hi All, > > Some targets require function parameters to be promoted to a different > type on expand time because the target may not have native instructions > to work on such types. As an example the AArch64 port does not have native > instructions working on integer 8- or 16-bit values. As such it promotes > every parameter of these types to 32-bits. > > This promotion could be done by a target for two reasons: > > 1. For correctness. This may be an APCS requirement for instance. > 2. For efficiency. By promoting the argument at expansion time we don't have >to keep promoting the type back and forth after each operation on it. >i.e. the promotion simplies the RTL. > > This patch adds the ability for a target to decide whether during the > expansion > to use an extend to handle promotion or to use a paradoxical subreg. > > A pradoxical subreg can be used when there's no correctness issues and when > you > still want the RTL efficiency of not doing the promotion constantly. > > This also allows the target to not need to generate any code when the top bits > are not significant. > > An example is in AArch64 the following extend is unneeded: > > uint8_t fd2 (uint8_t xr){ > return xr + 1; > } > > currently generates: > > fd2: > and w0, w0, 255 > add w0, w0, 1 > ret > > instead of > > fd2: > add w0, w0, #1 > ret > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > Bootstrapped on x86_64-pc-linux-gnu and no issues > > Ok for master? Why do we need a target hook for this? Why doesn't the targets function_arg(?) hook just return (subreg:SI (reg:QI 0)) here instead when no zero-extension is required and (reg:QI 0) when it is? That said, an extra hook looks like a bad design to me, the existing cummulative args way of querying the target ABI shoud be enough, and if not, should be extended in a less hackish way. But of course I am not familiar at all with the current state, but since you specifially CCed me ... Richard. > Thanks, > Tamar > > gcc/ChangeLog: > > * cfgexpand.cc (set_rtl): Check for function promotion. > * tree-outof-ssa.cc (insert_value_copy_on_edge): Likewise. > * function.cc (assign_parm_setup_reg): Likewise. > * hooks.cc (hook_bool_mode_mode_int_tree_false, > hook_bool_mode_mode_int_tree_true): New. > * hooks.h (hook_bool_mode_mode_int_tree_false, > hook_bool_mode_mode_int_tree_true): New. > * target.def (promote_function_args_subreg_p): New. > * doc/tm.texi: Document it. > * doc/tm.texi.in: Likewise. > > --- inline copy of patch -- > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc > index > d3cc77d2ca98f620b29623fc5696410bad9bc184..df95184cfa185312c2a46cb92daa051718d9f4f3 > 100644 > --- a/gcc/cfgexpand.cc > +++ b/gcc/cfgexpand.cc > @@ -206,14 +206,20 @@ set_rtl (tree t, rtx x) > have to compute it ourselves. For RESULT_DECLs, we accept mode > mismatches too, as long as we have BLKmode or are not coalescing > across variables, so that we don't reject BLKmode PARALLELs or > - unpromoted REGs. */ > + unpromoted REGs. For any promoted types that result in a > + paradoxical subreg also accept the argument. */ >gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME > || (SSAVAR (t) > && TREE_CODE (SSAVAR (t)) == RESULT_DECL > && (promote_ssa_mode (t, NULL) == BLKmode > || !flag_tree_coalesce_vars)) > || !use_register_for_decl (t) > -|| GET_MODE (x) == promote_ssa_mode (t, NULL)); > +|| GET_MODE (x) == promote_ssa_mode (t, NULL) > +|| targetm.calls.promote_function_args_subreg_p ( > + GET_MODE (x), > + promote_ssa_mode (t, NULL), > + TYPE_UNSIGNED (TREE_TYPE (t)), > + SSAVAR (t))); > >if (x) > { > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index > 2f92d37da8c0091e9879a493cfe8a361eb1d9299..6314cd83a2488dc225d4a1a15599e8e51e639f7f > 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -3906,6 +3906,15 @@ cases of mismatch, it also makes for better code on > certain machines. > The default is to not promote prototypes. > @end deftypefn > > +@deftypefn {Target Hook} bool TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P > (machine_mode @var{mode}, machine_mode @var{promoted_mode}, int > @var{unsignedp}, tree @var{v}) > +When a function argument is promoted with @code{PROMOTE_MODE} then this > +hook is used to determine whether the bits of the promoted type are all > +significant in the expression pointed to by V. If they are an extend is > +generated, if they are not a paradoxical subreg is crea
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, 16 May 2022, Rui Ueyama wrote: > If it is a guaranteed behavior that GCC of all versions that support > only get_symbols_v2 don't leave a temporary file behind if it is > suddenly disrupted during get_symbols_v2 execution, then yes, mold can > restart itself when get_symbols_v2 is called for the first time. > > Is this what you want? I'm fine with that approach if it is guaranteed > to work by GCC developers. I cannot answer that, hopefully someone in Cc: will. This sub-thread started with Richard proposing an alternative solution for API level discovery [1] (invoking onload twice, first with only the _v3 entrypoint in the "transfer vector"), and then suggesting an onload_v2 variant that would allow to discover which entrypoints the plugin is going to use [2]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594058.html [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594059.html ... at which point I butted in, because the whole _v3 thing was shrouded in mystery. Hopefully, now it makes more sense. >From my side I want to add that thread-safety remains a major unsolved point. Compiler driver _always_ adds the plugin to linker command line, so I expect that if you add a mutex around your claim_file hook invocation, you'll find that it serializes the linker too much. Have you given some thought to that? Will you be needing a plugin API upgrade to discover thread-safe entrypoints, or do you have another solution in mind? Alexander
Re: [PATCH v5, rs6000] Add a combine pattern for CA minus one [PR95737]
Hi Haochen, on 2022/5/16 13:22, HAO CHEN GUI wrote: > Hi, >This patch adds a combine pattern for "CA minus one". As CA only has two > values (0 or 1), we could convert following pattern > (sign_extend:DI (plus:SI (reg:SI 98 ca) > (const_int -1 [0x] > to >(plus:DI (reg:DI 98 ca) > (const_int -1 [0x]))) >With this patch, one unnecessary sign extend is eliminated. > >Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > Is this okay for trunk? Any recommendations? Thanks a lot. > > ChangeLog > 2022-05-16 Haochen Gui > > gcc/ > PR target/95737 > * config/rs6000/rs6000.md (subfsi3_carry_in_xx_64): New. (subfsi3_carry_in_xx_64) -> (*subfsi3_carry_in_xx_64) Sorry for nit-picking, but they are different. > > gcc/testsuite/ > PR target/95737 > * gcc.target/powerpc/pr95737.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 64049a6e521..b97ac453fc0 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -2353,6 +2353,19 @@ (define_insn "subf3_carry_in_xx" >"subfe %0,%0,%0" >[(set_attr "type" "add")]) > > +(define_insn_and_split "*subfsi3_carry_in_xx_64" > + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") > + (sign_extend:DI (plus:SI (reg:SI CA_REGNO) > + (const_int -1] > + "TARGET_POWERPC64" > + "#" > + "" "&& 1" > + [(parallel [(set (match_dup 0) > +(plus:DI (reg:DI CA_REGNO) > + (const_int -1))) > + (clobber (reg:DI CA_REGNO))])] > + "" > +) > > (define_insn "@neg2" >[(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c > b/gcc/testsuite/gcc.target/powerpc/pr95737.c > new file mode 100644 > index 000..30f0f819393 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c > @@ -0,0 +1,11 @@ > +/* PR target/95737 */ > +/* { dg-do compile } */ > +/* Disable isel for P9 and later */ Maybe it looks better with "later. */" Test case doesn't need strict formatting, so your call. :) > +/* { dg-options "-O2 -mno-isel" } */ > +/* { dg-final { scan-assembler-not {\mextsw\M} } } */ > + > + > +unsigned long negativeLessThan (unsigned long a, unsigned long b) > +{ > + return -(a < b); > +} OK with split condition fixed, with or without those nits tweaked. BR, Kewen