Re: [RFC] ldist: Recognize rawmemchr loop patterns
On Tue, Feb 09, 2021 at 09:57:58AM +0100, Richard Biener wrote:
> On Mon, Feb 8, 2021 at 3:11 PM Stefan Schulze Frielinghaus via
> Gcc-patches wrote:
> >
> > This patch adds support for recognizing loops which mimic the behaviour
> > of function rawmemchr, and replaces those with an internal function call
> > in case a target provides them.  In contrast to the original rawmemchr
> > function, this patch also supports different instances where the memory
> > pointed to and the pattern are interpreted as 8, 16, and 32 bit sized,
> > respectively.
> >
> > This patch is not final and I'm looking for some feedback:
> >
> > Previously, only loops which mimic the behaviours of functions memset,
> > memcpy, and memmove have been detected and replaced by corresponding
> > function calls.  One characteristic of those loops/partitions is that
> > they don't have a reduction.  In contrast, loops which mimic the
> > behaviour of rawmemchr compute a result and therefore have a reduction.
> > My current attempt is to ensure that the reduction statement is not used
> > in any other partition and only in that case ignore the reduction and
> > replace the loop by a function call.  We then only need to replace the
> > reduction variable of the loop which contained the loop result by the
> > variable of the lhs of the internal function call.  This should ensure
> > that the transformation is correct independently of how partitions are
> > fused/distributed in the end.  Any thoughts about this?
>
> Currently we're forcing reduction partitions last (and force to have a single
> one by fusing all partitions containing a reduction) because code-generation
> does not properly update SSA form for the reduction results.  ISTR that
> might be just because we do not copy the LC PHI nodes or do not adjust
> them when copying.  That might not be an issue in case you replace the
> partition with a call.  I guess you can try to have a testcase with
> two rawmemchr patterns and a regular loop part that has to be scheduled
> inbetween both for correctness.

Ah ok, in that case I updated my patch by removing the constraint that
the reduction statement must be in precisely one partition.  Please find
attached the testcases I came up with so far.  Since transforming a loop
into a rawmemchr function call is backend dependent, I planned to include
those only in my backend patch.  I wasn't able to come up with any
testcase where a loop is distributed into multiple partitions and where
one is classified as a rawmemchr builtin.  The latter boils down to a for
loop with an empty body only, in which case I suspect that loop
distribution shouldn't be done anyway.

> > Furthermore, I simply added two new members (pattern, fn) to structure
> > builtin_info which I consider rather hacky.  For the long run I thought
> > about to split up structure builtin_info into a union where each member
> > is a structure for a particular builtin of a partition, i.e., something
> > like this:
> >
> > union builtin_info
> > {
> >   struct binfo_memset *memset;
> >   struct binfo_memcpymove *memcpymove;
> >   struct binfo_rawmemchr *rawmemchr;
> > };
> >
> > Such that a structure for one builtin does not get "polluted" by a
> > different one.  Any thoughts about this?
>
> Probably makes sense if the list of recognized patterns grow further.
>
> I see you use internal functions rather than builtin functions.  I guess
> that's OK.  But you use new target hooks for expansion where I think
> new optab entries similar to cmpmem would be more appropriate
> where the distinction between 8, 16 or 32 bits can be encoded in
> the modes.

The optab implementation is really nice: it allows me to use iterators
in the backend, which in the end saves me some boilerplate code compared
to the previous implementation :)

While using optabs now, I only require one additional member (pattern)
in the builtin_info struct.  Thus I didn't want to overcomplicate things
and kept the single struct approach as is.

For the long run, should I resubmit this patch once stage 1 opens or how
would you propose to proceed?

Thanks for your review so far!

Cheers,
Stefan

>
> Richard.
>
> > Cheers,
> > Stefan
> > ---
> >  gcc/internal-fn.c            |  42 ++
> >  gcc/internal-fn.def          |   3 +
> >  gcc/target-insns.def         |   3 +
> >  gcc/tree-loop-distribution.c | 257 ++-
> >  4 files changed, 272 insertions(+), 33 deletions(-)
> >
> > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> > index dd7173126fb..9cd62544a1a 100644
> > --- a/gcc/internal-fn.c
> > +++ b/gcc/internal-fn.c
> > @@ -2917,6 +2917,48 @@ expand_VEC_CONVERT (internal_fn, gcall *)
> >    gcc_unreachable ();
> >  }
> >
> > +static void
> > +expand_RAWMEMCHR8 (internal_fn, gcall *stmt)
> > +{
> > +  if (targetm.have_rawmemchr8 ())
> > +    {
> > +      rtx result = expand_expr (gimple_call_lhs (stmt), NULL_RTX,
> > VOIDmode, EXPAND_WRITE);
> > +      rtx start = expand_normal (gimple_cal
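
For readers following the thread, here is a minimal sketch in C of the kind of loop under discussion: a search with an empty body whose only result is the final pointer, i.e. the reduction. The function names find8, find16 and find32 are made up for illustration and do not appear in the patch. Whether GCC would replace any of these loops with an internal function call depends on the target providing a matching pattern, so treat this as an illustration of the source shape only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from the patch.  Each loop has an
   empty body; the pointer left over after the loop is the reduction.
   As with rawmemchr, the caller must guarantee that PATTERN occurs
   somewhere at or after P, since there is no length bound.  */

/* 8-bit elements: the shape closest to the original rawmemchr.  */
const uint8_t *
find8 (const uint8_t *p, uint8_t pattern)
{
  while (*p != pattern)
    ++p;
  return p;
}

/* 16-bit elements: memory and pattern are both read as 16 bits.  */
const uint16_t *
find16 (const uint16_t *p, uint16_t pattern)
{
  while (*p != pattern)
    ++p;
  return p;
}

/* 32-bit elements.  */
const uint32_t *
find32 (const uint32_t *p, uint32_t pattern)
{
  while (*p != pattern)
    ++p;
  return p;
}
```

The three functions differ only in element width, which is the distinction the reply above suggests encoding in the modes of an optab rather than in separate target hooks.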
New Swedish PO file for 'gcc' (version 11.1-b20210207)
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

    https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-11.1-b20210207.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

    https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

    https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

                                The Translation Project robot, in the
                                name of your translation coordinator.
[PATCH] GCC_CET_HOST_FLAGS: Check if host supports multi-byte NOPs
Check if host supports multi-byte NOPs before enabling CET on host.

config/

	PR binutils/27397
	* cet.m4 (GCC_CET_HOST_FLAGS): Check if host supports multi-byte
	NOPs.

libiberty/

	PR binutils/27397
	* configure: Regenerated.
---
 config/cet.m4       | 19 ---
 libiberty/configure | 29 +
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/config/cet.m4 b/config/cet.m4
index c67fb4f35b6..7718be1afe8 100644
--- a/config/cet.m4
+++ b/config/cet.m4
@@ -130,6 +130,18 @@ fi
 if test x$may_have_cet = xyes; then
   if test x$cross_compiling = xno; then
     AC_TRY_RUN([
+int
+main ()
+{
+  asm ("endbr32");
+  return 0;
+}
+],
+    [have_multi_byte_nop=yes],
+    [have_multi_byte_nop=no])
+    have_cet=no
+    if test x$have_multi_byte_nop = xyes; then
+      AC_TRY_RUN([
 static void
 foo (void)
 {
@@ -155,9 +167,10 @@ main ()
   bar ();
   return 0;
 }
-],
-[have_cet=no],
-[have_cet=yes])
+      ],
+      [have_cet=no],
+      [have_cet=yes])
+    fi
     if test x$enable_cet = xno -a x$have_cet = xyes; then
       AC_MSG_ERROR([Intel CET must be enabled on Intel CET enabled host])
     fi
diff --git a/libiberty/configure b/libiberty/configure
index 160b8c9e8b1..29a690d44fc 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5539,6 +5539,34 @@ else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+int
+main ()
+{
+  asm ("endbr32");
+  return 0;
+}
+
+_ACEOF
+if ac_fn_c_try_run "$LINENO"; then :
+  have_multi_byte_nop=yes
+else
+  have_multi_byte_nop=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+have_cet=no
+if test x$have_multi_byte_nop = xyes; then
+  if test "$cross_compiling" = yes; then :
+  { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "cannot run test program while cross compiling
+See \`config.log' for more details" "$LINENO" 5; }
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
 static void
 foo (void)
 {
@@ -5575,6 +5603,7 @@ rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
   conftest.$ac_objext conftest.beam conftest.$ac_ext
 fi
 
+fi
 if test x$enable_cet = xno -a x$have_cet = xyes; then
   as_fn_error $? "Intel CET must be enabled on Intel CET enabled host" "$LINENO" 5
 fi
-- 
2.29.2
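
The order of the two run-time probes in this patch is easy to misread in diff form, so here is a small C model of the decision it makes. It is only a sketch: the function and parameter names are assumptions for illustration, not Autoconf variables, and the three-valued enable_cet setting is reduced to a single "explicitly disabled" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the configure-time logic, not part of the patch.

   NOP_PROBE_RAN_OK: the endbr32 test program exited with status 0.
   CET_PROBE_RAN_OK: the second test program exited with status 0,
   which the macro maps to have_cet=no; a failing run maps to
   have_cet=yes.  */
bool
host_has_cet (bool nop_probe_ran_ok, bool cet_probe_ran_ok)
{
  /* have_cet defaults to "no"; the second probe is only attempted
     when the host can execute multi-byte NOPs at all.  */
  if (!nop_probe_ran_ok)
    return false;
  return !cet_probe_ran_ok;
}

/* Models the final check: explicitly disabling CET on a host where
   have_cet is "yes" is reported as a configure error.  */
bool
cet_config_is_error (bool cet_explicitly_disabled, bool have_cet)
{
  return cet_explicitly_disabled && have_cet;
}
```

With this model, a host on which the endbr32 probe fails can no longer reach have_cet=yes, which is the case the patch is meant to handle.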
[PATCH] gitignore: ignore generated dejagnu test files treewide
These files are never committed, and are generated by most testsuites,
so ignore them.

ChangeLog:

	* .gitignore: Ignore generated dejagnu test files.
---
 .gitignore | 5 +
 1 file changed, 5 insertions(+)

diff --git a/.gitignore b/.gitignore
index 382e2def731e..2d316e0bf881 100644
--- a/.gitignore
+++ b/.gitignore
@@ -66,3 +66,8 @@ stamp-*
 /mpc*
 /gmp*
 /isl*
+
+site.bak
+site.exp
+*.log
+*.sum
-- 
2.30.0
[committed] libstdc++: Restore <unistd.h> in testsuite_fs.h header [PR 99096]
libstdc++-v3/ChangeLog:

	PR libstdc++/99096
	* testsuite/util/testsuite_fs.h: Always include <unistd.h>.

Tested x86_64-linux. Committed to trunk.

commit 4e3590d06cf8a06fcc460ccda6150483a0311bae
Author: Jonathan Wakely
Date:   Sun Feb 14 20:38:32 2021

    libstdc++: Restore <unistd.h> in testsuite_fs.h header [PR 99096]

    libstdc++-v3/ChangeLog:

            PR libstdc++/99096
            * testsuite/util/testsuite_fs.h: Always include <unistd.h>.

diff --git a/libstdc++-v3/testsuite/util/testsuite_fs.h b/libstdc++-v3/testsuite/util/testsuite_fs.h
index e4d04dd7799..1eadf7fa767 100644
--- a/libstdc++-v3/testsuite/util/testsuite_fs.h
+++ b/libstdc++-v3/testsuite/util/testsuite_fs.h
@@ -34,10 +34,10 @@ namespace test_fs = std::experimental::filesystem;
 #include
 #include
 #include
+#include <unistd.h> // unlink, close, getpid
 
 #if defined(_GNU_SOURCE) || _XOPEN_SOURCE >= 500 || _POSIX_C_SOURCE >= 200112L
 #include <stdlib.h> // mkstemp
-#include <unistd.h> // unlink, close
 #else
 #include <random>   // std::random_device
 #endif
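
As a rough illustration of why the include has to be unconditional, the C sketch below uses getpid outside any feature-test guard and mkstemp, close and unlink inside one helper. The helper names are invented for this example and are not code from testsuite_fs.h; the sketch also assumes a POSIX host.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>	/* mkstemp */
#include <string.h>
#include <unistd.h>	/* unlink, close, getpid */

/* Hypothetical helper: build a per-process name such as "test.1234".
   It needs getpid regardless of whether mkstemp is available, which
   is why <unistd.h> cannot sit behind the mkstemp guard.
   Returns the snprintf result.  */
int
pid_based_name (char *buf, size_t size, const char *prefix)
{
  return snprintf (buf, size, "%s.%ld", prefix, (long) getpid ());
}

/* Hypothetical helper: create a scratch file from TEMPLATE_PATH
   (which must end in "XXXXXX"), then close and remove it again.
   Returns 0 on success and -1 on failure.  */
int
scratch_file_roundtrip (char *template_path)
{
  int fd = mkstemp (template_path);
  if (fd == -1)
    return -1;
  if (close (fd) != 0)
    return -1;
  return unlink (template_path);
}
```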
Re: [PATCH] ipa/97346 - fix leak of reference_vars_to_consider
> This cleans up allocation/deallocation of reference_vars_to_consider,
> specifically always releasing the vector allocated in ipa_init and
> also making sure to release it before re-allocating it in
> ipa_reference_write_optimization_summary.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> 2021-02-10  Richard Biener
>
> 	PR ipa/97346
> 	* ipa-reference.c (propagate): Always free
> 	reference_vars_to_consider.
> 	(ipa_reference_write_optimization_summary): Free
> 	reference_vars_to_consider before re-allocating it.
> 	(ipa_reference_write_optimization_summary): Use vec_free
> 	and NULL reference_vars_to_consider.

Hi,
this is the version I committed after discussion on the PR (it makes it
more explicit that reference_vars_to_consider are used during analysis
only to aid dumping).

Honza

2021-02-14  Jan Hubicka
	    Richard Biener

	PR ipa/97346
	* ipa-reference.c (ipa_init): Only conditionally initialize
	reference_vars_to_consider.
	(propagate): Conditionally deinitialize reference_vars_to_consider.
	(ipa_reference_write_optimization_summary): Sanity check that
	reference_vars_to_consider is not allocated.

diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index 2ea2a6d5327..6cf78ff94a6 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -458,8 +458,8 @@ ipa_init (void)
 
   ipa_init_p = true;
 
-  vec_alloc (reference_vars_to_consider, 10);
-
+  if (dump_file)
+    vec_alloc (reference_vars_to_consider, 10);
 
   if (ipa_ref_opt_sum_summaries != NULL)
     {
@@ -967,8 +967,12 @@ propagate (void)
     }
 
   if (dump_file)
-    vec_free (reference_vars_to_consider);
-  reference_vars_to_consider = NULL;
+    {
+      vec_free (reference_vars_to_consider);
+      reference_vars_to_consider = NULL;
+    }
+  else
+    gcc_checking_assert (!reference_vars_to_consider);
   return remove_p ? TODO_remove_functions : 0;
 }
 
@@ -1059,6 +1063,7 @@ ipa_reference_write_optimization_summary (void)
   auto_bitmap ltrans_statics;
   int i;
 
+  gcc_checking_assert (!reference_vars_to_consider);
   vec_alloc (reference_vars_to_consider, ipa_reference_vars_uids);
   reference_vars_to_consider->safe_grow (ipa_reference_vars_uids, true);
 
@@ -1117,7 +1122,8 @@ ipa_reference_write_optimization_summary (void)
 	  }
       }
   lto_destroy_simple_output_block (ob);
-  delete reference_vars_to_consider;
+  vec_free (reference_vars_to_consider);
+  reference_vars_to_consider = NULL;
 }
 
 /* Deserialize the ipa info for lto.  */
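
The ownership discipline the committed version settles on can be sketched in plain C as follows. This is an analogy under stated assumptions, not GCC code: the variable names stands in for reference_vars_to_consider, dumping stands in for dump_file, calloc and free stand in for vec_alloc and vec_free, and all function names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the GCC globals named in the patch.  */
static const char **names;
static bool dumping;

/* Like ipa_init after the patch: allocate only when dumping.  */
void
analysis_init (bool dump_enabled)
{
  dumping = dump_enabled;
  if (dumping)
    names = calloc (10, sizeof *names);
}

/* Like propagate after the patch: free and reset when dumping,
   otherwise check that nothing was ever allocated.  */
void
analysis_finish (void)
{
  if (dumping)
    {
      free (names);
      names = NULL;
    }
  else
    assert (names == NULL);
}

/* Like the summary writer: check that no earlier allocation is still
   live before allocating, then free and reset on the way out.  */
void
write_summary (size_t count)
{
  assert (names == NULL);
  names = calloc (count, sizeof *names);
  /* ... the table would be filled and streamed out here ...  */
  free (names);
  names = NULL;
}

/* Test hook for this sketch only.  */
bool
names_allocated (void)
{
  return names != NULL;
}
```

Resetting the pointer after every free is what lets the later assertion catch a stale allocation instead of silently overwriting it, which is the leak the PR is about.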
PING 3 [PATCH] improve warning suppression for inlined functions (PR 98465, 98512)
Ping 3:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564060.html

I submitted this as a fix for a fair number of false positives reported
by Fedora package maintainers.  Last week Jakub committed r11-7146,
which is an alternate workaround for the same problem, but one isolated
to libstdc++.  That might make this patch less pressing but not less
relevant since it also fixes pr98512 (a bug impacting Glibc) and adds
the infrastructure for resolving pr98871 and other bugs about the
#pragma diagnostic's inability to suppress warnings in inlined code.

Jeff (or anyone else who cares about this), if you consider this patch
too intrusive at this point let me know and I'll stop pinging it and
resubmit it for GCC 12.

On 2/6/21 10:12 AM, Martin Sebor wrote:
Ping 2:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564060.html

On 1/29/21 7:56 PM, Martin Sebor wrote:
Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564060.html

On 1/21/21 4:46 PM, Martin Sebor wrote:
The initial patch I posted is missing initialization for a couple
of locals.  I'd noticed it in testing but forgot to add the fix to
the patch before posting it.  I have corrected that in the updated
revision and also added the test case from pr98512, and retested
the whole thing on x86_64-linux.

On 1/19/21 11:58 AM, Martin Sebor wrote:
std::string tends to trigger a class of false positive out of bounds
access warnings for code GCC cannot prove is unreachable because of
missing aliasing constraints, and that ends up expanded inline into
user code.  Simply inserting the contents of a constant char array
does that.  In GCC 10 these false positives are suppressed due to
-Wno-system-headers, but in GCC 11, to help detect calls rendered
invalid by user code passing in either incorrect or insufficiently
constrained arguments, -Wno-system-headers no longer has this effect
on invalid access warnings.

To solve the problem without at least partially reverting the change
and going back to the GCC 10 way of things for the affected subset
of calls (just memcpy and memmove), the attached patch enhances
the #pragma GCC diagnostic machinery to consider not just a single
location for inlined code but all locations at which an expression
and its callers are inlined all the way up the stack.  This gives
each author of a function involved in inlining the ability to
control a warning issued for the code, not just the user into whose
code all the calls end up inlined.  To resolve PR 98465, it lets us
suppress the false positives selectively in std::string rather
than across the board in GCC.

The solution is to provide a new pair of overloads for warning
functions that, instead of taking a single location argument, take
a tree node from which the location(s) are determined.  The tree
argument is indirect because the diagnostic machinery doesn't (and
cannot without more intrusive changes) at the moment depend on
the various tree definitions.  A nice feature of these overloads
is that they do away with the need for the %K directive (and in
the future also %G, with another enhancement to accept a gimple*
argument).

This patch depends on the fix for PR 98664 (already approved but
not yet checked in).  I've tested it on x86_64-linux.

To avoid fallout I tried to keep the changes to a minimum, and
so the design isn't as robust as I'd like it ultimately to be.
I plan to enhance it in stage 1.

Martin
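
To make the "all locations up the inlining stack" idea concrete, here is a toy model in C. It is only a sketch of the lookup as described in the message, under the assumption that a suppression at any site in the chain is enough to silence the warning; the struct and function names are hypothetical and do not correspond to GCC's diagnostic data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one node per location in the inlining stack,
   starting at the expression and linking outward to each caller.  */
struct inline_site
{
  /* True if a "#pragma GCC diagnostic ignored" for the warning in
     question covers this location.  */
  bool warning_suppressed;
  /* The site this code was inlined into, or NULL for the outermost.  */
  const struct inline_site *inlined_into;
};

/* Walk the whole chain; any function along the way can silence the
   warning for code that ends up inlined through it.  */
bool
warning_enabled_for_chain (const struct inline_site *site)
{
  for (; site != NULL; site = site->inlined_into)
    if (site->warning_suppressed)
      return false;
  return true;
}
```

In this model a library header can carry the pragma around one of its own inline functions and have it take effect even though the flagged expression finally lands in user code.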
PING 3 [PATCH] correct fix to avoid false positives for vectorized stores (PR 96963, 94655)
Ping 3:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564059.html

This is a fix for two P2 bugs (false positives).

On 2/6/21 10:13 AM, Martin Sebor wrote:
Ping 2:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564059.html

On 1/29/21 10:20 AM, Martin Sebor wrote:
Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564059.html

On 1/21/21 4:38 PM, Martin Sebor wrote:
The hack I put in compute_objsize() last January for pr93200 isn't
quite correct.  It happened to suppress the false positive there
but, due to what looks like a thinko on my part, not in some other
test cases involving vectorized stores.

The attached change adjusts the hack to have compute_objsize() give
up on all MEM_REFs with a vector type.  This effectively disables
-Wstringop-{overflow,overread} for vector accesses (either written
by the user or synthesized by GCC from ordinary accesses).  It
doesn't affect -Warray-bounds because this warning doesn't use
compute_objsize() yet.  When it does (it should considerably
simplify the code) some additional changes will be needed to
preserve -Warray-bounds for out of bounds vector accesses.
The test this patch adds should serve as a reminder to make
it if we forget.

Tested on x86_64-linux.  Since PR 94655 was reported against GCC
10 I'd like to apply this fix to both the trunk and the 10 branch.

Martin
[PATCH 1/2] MIPS: unaligned load: use SImode for SUBREG if OK (PR98996)
This was found through the FTBFS of Ada's s-pack96.adb, which is caused
by a 96-bit load: 96 = 64 + 32.  The 32-bit l/r pair is marked as a
SUBREG, so it is not treated as SImode, and no suitable insn can be
found.

gcc/ChangeLog:

	* config/mips/mips.c (mips_expand_ext_as_unaligned_load):
	If TARGET_64BIT and dest is SUBREG, we check the width, if it
	equal to SImode, we use SImode operation, just like what we are
	doing for REG one.
---
 gcc/ChangeLog          | 8 
 gcc/config/mips/mips.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ddf4c7f92d7..fb12eeb971d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2021-02-15  YunQiang Su
+
+	PR target/98996
+	* config/mips/mips.c (mips_expand_ext_as_unaligned_load):
+	If TARGET_64BIT and dest is SUBREG, we check the width, if it
+	equal to SImode, we use SImode operation, just like what we are
+	doing for REG one.
+
 2021-02-11  Eric Botcazou
 
 	* config/i386/winnt.c (i386_pe_seh_unwind_emit): When switching to
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index ebb04b72b2b..b77604f935d 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -8400,7 +8400,7 @@ mips_expand_ext_as_unaligned_load (rtx dest, rtx src, HOST_WIDE_INT width,
   /* If TARGET_64BIT, the destination of a 32-bit "extz" or "extzv" will
      be a DImode, create a new temp and emit a zero extend at the end.  */
   if (GET_MODE (dest) == DImode
-      && REG_P (dest)
+      && (REG_P (dest) || SUBREG_P(dest))
       && GET_MODE_BITSIZE (SImode) == width)
     {
       dest1 = dest;
-- 
2.20.1
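
The data flow behind "96 = 64 + 32" can be pictured at the C level as below. This is an illustration only, with made-up names; it says nothing about the RTL the backend actually emits. It shows a 32-bit value read from a possibly unaligned address and zero-extended to 64 bits, the operation the comment in the patched function describes, and a 96-bit object read as one 64-bit part plus one such 32-bit part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 32 bits from a possibly unaligned address
   and zero-extend the result to 64 bits.  */
uint64_t
load_u32_zext (const unsigned char *p)
{
  uint32_t low;
  memcpy (&low, p, sizeof low);	/* byte-wise copy, no alignment needed  */
  return (uint64_t) low;	/* upper 32 bits become zero  */
}

/* Hypothetical container for a 96-bit object split as 64 + 32 bits.  */
struct parts96
{
  uint64_t first64;
  uint64_t last32;	/* 32-bit value held zero-extended in 64 bits  */
};

/* Read 12 bytes starting at P as a 64-bit part followed by a 32-bit part.  */
struct parts96
load_96 (const unsigned char *p)
{
  struct parts96 result;
  memcpy (&result.first64, p, sizeof result.first64);
  result.last32 = load_u32_zext (p + sizeof result.first64);
  return result;
}
```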
[PATCH 2/2] ada: add 128bit operation to MIPS N32 and N64
For MIPS N64 and N32:
  add GNATRTL_128BIT_PAIRS to LIBGNAT_TARGET_PAIRS
  add GNATRTL_128BIT_OBJS to EXTRA_GNATRTL_NONTASKING_OBJS

gcc/ada/ChangeLog:

	PR ada/98996
	* Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
	: add 128Bit operation file to MIPS N64 and N32.
---
 gcc/ada/ChangeLog    |  6 ++
 gcc/ada/Makefile.rtl | 12 
 2 files changed, 18 insertions(+)

diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 43973550502..32e92c55ef8 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -1,3 +1,9 @@
+2021-02-03  YunQiang Su
+
+	PR ada/98996
+	* Makefile.rtl (LIBGNAT_TARGET_PAIRS, EXTRA_GNATRTL_NONTASKING_OBJS)
+	: add 128Bit operation file to MIPS N64 and N32.
+
 2021-02-03  Eric Botcazou
 
 	* gcc-interface/decl.c (components_to_record): If the first component
diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 35faf13ea46..d86eb8acbf3 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2311,6 +2311,18 @@ ifeq ($(strip $(filter-out mips% linux%,$(target_cpu) $(target_os))),)
   s-tpopsp.adb