Re: [PATCH] real: fix encoding of negative IEEE double/quad values [PR98216]
On Thu, Sep 23, 2021 at 10:44 PM Patrick Palka via Gcc-patches wrote:
>
> In encode_ieee_double/quad, the assignment
>
>   unsigned long VAL = r->sign << 31;
>
> is intended to set the 31st bit of VAL whenever the given REAL_CST is
> negative.  But on LP64 hosts it also unintentionally sets the upper 32
> bits of VAL due to the promotion of r->sign from unsigned:1 to int and
> the subsequent sign extension of the shifted value from int to long.
>
> In the C++ frontend, this bug causes incorrect mangling of negative
> double values due to the output of real_to_target during write_real_cst
> unexpectedly having the upper 32 bits of each word set.  (I'm not sure
> if/how this bug manifests itself outside of the frontend..)
>
> This patch fixes this by avoiding the unwanted sign extension.  Note
> that r0-53976 fixed the same bug in encode_ieee_single long ago.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk and perhaps the release branches?

OK for trunk and branches.

Thanks,
Richard.

>         PR c++/98216
>         PR c++/91292
>
> gcc/ChangeLog:
>
>         * real.c (encode_ieee_double): Avoid incorrect sign extension.
>         (encode_ieee_quad): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/cpp2a/nontype-float2.C: New test.
> ---
>  gcc/real.c                                  |  6 ++++--
>  gcc/testsuite/g++.dg/cpp2a/nontype-float2.C | 13 +++++++++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
>
> diff --git a/gcc/real.c b/gcc/real.c
> index 555cf44c142..8c7a47a69e6 100644
> --- a/gcc/real.c
> +++ b/gcc/real.c
> @@ -3150,9 +3150,10 @@ encode_ieee_double (const struct real_format *fmt, long *buf,
>                      const REAL_VALUE_TYPE *r)
>  {
>    unsigned long image_lo, image_hi, sig_lo, sig_hi, exp;
> +  unsigned long sign = r->sign;
>    bool denormal = (r->sig[SIGSZ-1] & SIG_MSB) == 0;
>
> -  image_hi = r->sign << 31;
> +  image_hi = sign << 31;
>    image_lo = 0;
>
>    if (HOST_BITS_PER_LONG == 64)
> @@ -3938,10 +3939,11 @@ encode_ieee_quad (const struct real_format *fmt, long *buf,
>                    const REAL_VALUE_TYPE *r)
>  {
>    unsigned long image3, image2, image1, image0, exp;
> +  unsigned long sign = r->sign;
>    bool denormal = (r->sig[SIGSZ-1] & SIG_MSB) == 0;
>    REAL_VALUE_TYPE u;
>
> -  image3 = r->sign << 31;
> +  image3 = sign << 31;
>    image2 = 0;
>    image1 = 0;
>    image0 = 0;
> diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C b/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
> new file mode 100644
> index 000..5db208a05d1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
> @@ -0,0 +1,13 @@
> +// PR c++/98216
> +// { dg-do compile { target c++20 } }
> +
> +template<auto> void f() { }
> +
> +template void f<-1.0f>();
> +template void f<-2.0f>();
> +
> +template void f<-1.0>();
> +template void f<-2.0>();
> +
> +template void f<-1.0L>();
> +template void f<-2.0L>();
> --
> 2.33.0.514.g99c99ed825
>
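
For readers who want to see the effect in isolation, here is a minimal sketch of the promotion problem described in the patch. It is not GCC code: `fake_real`, `encode_sign_buggy` and `encode_sign_fixed` are made-up names, and `std::uint64_t` stands in for `unsigned long` on an LP64 host so that the result does not depend on the host ABI. Note that shifting 1 into the sign bit of an `int` is only fully specified from C++20 on; for earlier standards the sketch assumes the wrap-around behaviour that GCC and Clang implement.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sign field of REAL_VALUE_TYPE: a 1-bit
// unsigned bit-field, which integral promotion turns into int.
struct fake_real
{
  unsigned int sign : 1;
};

// Mirrors the old code: the shift is done in int, and the negative int
// result is sign-extended when it is widened to 64 bits.
std::uint64_t encode_sign_buggy (const fake_real &r)
{
  return r.sign << 31;
}

// Mirrors the fix: widen to an unsigned 64-bit value first, then shift,
// so no sign extension can happen.
std::uint64_t encode_sign_fixed (const fake_real &r)
{
  std::uint64_t sign = r.sign;
  return sign << 31;
}
```

Under those assumptions, a value with the sign bit set should come out as 0xFFFFFFFF80000000 from the first function and as 0x80000000 from the second, which matches the "upper 32 bits set" symptom in the report.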
[PATCH] Verify unallocated edge/BB flags are clear
This adds verification that unused auto_{edge,bb}_flag are not
remaining set but correctly cleared by consumers.  The intent
is that those flags can be cheaply used on a smaller IL region
and thus afterwards clearing can be restricted to the same
small region as well.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-09-24  Richard Biener

        * cfghooks.c (verify_flow_info): Verify unallocated BB and
        edge flags are not set.
---
 gcc/cfghooks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 50b9b177639..6446e16ca8c 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -161,6 +161,12 @@ verify_flow_info (void)
           err = 1;
         }
 
+      if (bb->flags & ~cfun->cfg->bb_flags_allocated)
+        {
+          error ("verify_flow_info: unallocated flag set on BB %d", bb->index);
+          err = 1;
+        }
+
       FOR_EACH_EDGE (e, ei, bb->succs)
         {
           if (last_visited [e->dest->index] == bb)
@@ -202,6 +208,13 @@ verify_flow_info (void)
               err = 1;
             }
 
+          if (e->flags & ~cfun->cfg->edge_flags_allocated)
+            {
+              error ("verify_flow_info: unallocated edge flag set on %d -> %d",
+                     e->src->index, e->dest->index);
+              err = 1;
+            }
+
           edge_checksum[e->dest->index] += (size_t) e;
         }
       if (n_fallthru > 1)
-- 
2.31.1
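
The check itself is a plain mask test: any bit set in a flags word that is not in the "allocated" mask is stale. A rough, self-contained sketch of that idea follows. The `flag_pool` type and its members are invented for illustration and are not GCC's auto_bb_flag/auto_edge_flag machinery; only the `flags & ~allocated` test corresponds to the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of the flag bookkeeping; the names here are
// illustrative and not GCC's real data structures.
struct flag_pool
{
  std::uint32_t allocated = 0;   // bits currently handed out

  // Hand out the lowest free bit, or 0 if no bit is left.
  std::uint32_t alloc ()
  {
    std::uint32_t free_bits = ~allocated;
    std::uint32_t bit = free_bits & (~free_bits + 1);  // isolate lowest set bit
    allocated |= bit;
    return bit;
  }

  // Return a bit to the pool.
  void release (std::uint32_t bit) { allocated &= ~bit; }

  // The verifier's test: is any bit set in FLAGS that nobody owns?
  bool has_stale_bits (std::uint32_t flags) const
  {
    return (flags & ~allocated) != 0;
  }
};
```

A consumer that releases its flag without first clearing it on every block or edge it touched would be caught by `has_stale_bits`, which is the situation the new errors in verify_flow_info report.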
Re: Fortran: Improve file-reading error diagnostic [PR55534] (was: Re: [Patch] Fortran: Improve -Wmissing-include-dirs warnings [PR55534])
On 23.09.21 23:01, Harald Anlauf via Fortran wrote:
> compiled with -cpp gives:
>
> pr55534-play.f90:4:2:
>
>     4 | type t
>       |  1~~
> Fatal Error: no/such/file.inc: No such file or directory
> compilation terminated.
>
> If you have an easy solution for that one,

David has an easy but hackish solution, cf. https://gcc.gnu.org/PR100904

That's a GCC 7 regression, which also affects C/C++ but only when
compiling with -traditional-cpp, which gfortran does by default but
gcc/g++ don't.

Tobias

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Registration court: München, HRB 106955
Re: [PATCH, Fortran] Add missing diagnostic for F2018 C711 (TS29113 C407c)
On 24.09.21 01:19, Sandra Loosemore wrote:
> Here's another missing-diagnostic patch for the Fortran front end, this
> time for PR Fortran/101333.  OK to commit?

That's
"C711 An assumed-type actual argument that corresponds to an assumed-rank
dummy argument shall be assumed-shape or assumed-rank."

LGTM. Thanks for the patch!

Tobias

commit 53171e748e28901693ca4362ff658883dab97e13
Author: Sandra Loosemore
Date:   Thu Sep 23 15:00:43 2021 -0700

    Fortran: Add missing diagnostic for F2018 C711 (TS29113 C407c)

    2021-09-23  Sandra Loosemore

            PR Fortran/101333

    gcc/fortran/
            * interface.c (compare_parameter): Enforce F2018 C711.

    gcc/testsuite/
            * gfortran.dg/c-interop/c407c-1.f90: Remove xfails.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index dae4b95..a2fea0e 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -2448,6 +2448,21 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
       return false;
     }
 
+  /* TS29113 C407c; F2018 C711.  */
+  if (actual->ts.type == BT_ASSUMED
+      && symbol_rank (formal) == -1
+      && actual->rank != -1
+      && !(actual->symtree->n.sym->as
+           && actual->symtree->n.sym->as->type == AS_ASSUMED_SHAPE))
+    {
+      if (where)
+        gfc_error ("Assumed-type actual argument at %L corresponding to "
+                   "assumed-rank dummy argument %qs must be "
+                   "assumed-shape or assumed-rank",
+                   &actual->where, formal->name);
+      return false;
+    }
+
   /* F2008, 12.5.2.5; IR F08/0073.  */
   if (formal->ts.type == BT_CLASS && formal->attr.class_ok
       && actual->expr_type != EXPR_NULL
diff --git a/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90 b/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
index e4da66a..c77e6ac 100644
--- a/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
+++ b/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
@@ -44,7 +44,7 @@ subroutine s2 (x)
   implicit none
   type(*) :: x(*)
 
-  call g (x, 1)  ! { dg-error "Assumed.type" "pr101333" { xfail *-*-* } }
+  call g (x, 1)  ! { dg-error "Assumed.type" }
 end subroutine
 
 ! Check that a scalar gives an error.
@@ -53,7 +53,7 @@ subroutine s3 (x)
   implicit none
   type(*) :: x
 
-  call g (x, 1)  ! { dg-error "Assumed.type" "pr101333" { xfail *-*-* } }
+  call g (x, 1)  ! { dg-error "Assumed.type" }
 end subroutine
 
 ! Explicit-shape assumed-type actual arguments are forbidden implicitly
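
As a reading aid, the condition enforced by the new block in compare_parameter can be modelled as a small predicate. This is only a sketch under simplifying assumptions: `array_kind`, `arg_pair` and `violates_c711` are hypothetical names, and the real front end derives these facts from gfc_expr and gfc_symbol rather than from a flat struct.

```cpp
#include <cassert>

// Hypothetical classification of how the actual argument is declared.
enum class array_kind
{
  scalar,
  explicit_shape,
  assumed_size,
  assumed_shape,
  assumed_rank
};

// Simplified description of one actual/dummy pairing.
struct arg_pair
{
  bool actual_is_assumed_type;   // actual is declared TYPE(*)
  bool dummy_is_assumed_rank;    // dummy is declared DIMENSION(..)
  array_kind actual_kind;
};

// True when the pairing breaks C711: an assumed-type actual passed to an
// assumed-rank dummy must itself be assumed-shape or assumed-rank.
bool violates_c711 (const arg_pair &p)
{
  if (!p.actual_is_assumed_type || !p.dummy_is_assumed_rank)
    return false;
  return p.actual_kind != array_kind::assumed_shape
         && p.actual_kind != array_kind::assumed_rank;
}
```

In this model the two cases un-xfailed in c407c-1.f90, an assumed-size `x(*)` and a scalar `x`, both count as violations, while assumed-shape and assumed-rank actuals pass.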
Re: [PATCH] combine: Check for paradoxical subreg
Hi,

pinging this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573509.html

It adds a check for a paradoxical subreg in combine; without the check,
combine ICEs.

Regards
 Robin
Re: [PATCH] top-level configure: setup target_configdirs based on repository
* Richard Biener [2021-09-23 10:53:16 +0200]:

> On Wed, Sep 22, 2021 at 5:47 PM Andrew Burgess
> wrote:
> >
> > The top-level configure script is shared between the gcc repository
> > and the binutils-gdb repository.
> >
> > The target_configdirs variable in the configure.ac script, defines
> > sub-directories that contain components that should be built for the
> > target using the target tools.
> >
> > Some components, e.g. zlib, are built as both host and target
> > libraries.
> >
> > This causes problems for binutils-gdb.  If we run 'make all' in the
> > binutils-gdb repository we end up trying to build a target version of
> > the zlib library, which requires the target compiler be available.
> > Often the target compiler isn't immediately available, and so the
> > build fails.
> >
> > The problem with zlib impacted a previous attempt to synchronise the
> > top-level configure scripts from gcc to binutils-gdb, see this thread:
> >
> >   https://sourceware.org/pipermail/binutils/2019-May/107094.html
> >
> > And I'm in the process of importing libbacktrace in to binutils-gdb,
> > which is also a host and target library, and triggers the same issues.
> >
> > I believe that for binutils-gdb, at least at the moment, there are no
> > target libraries that we need to build.
> >
> > My proposal then is to make the value of target_libraries change based
> > on which repository we are building in.  Specifically, if the source
> > tree has a gcc/ directory then we should set the target_libraries
> > variable, otherwise this variable is left empty.
> >
> > I think that if someone tries to create a single unified tree (gcc +
> > binutils-gdb in a single source tree) and then build, this change will
> > not have a negative impact, the tree still has gcc/ so we'd expect the
> > target compiler to be built, which means building the target_libraries
> > should work just fine.
> >
> > However, if the source tree lacks gcc/ then we assume the target
> > compiler isn't built/available, and so target_libraries shouldn't be
> > built.
> >
> > There is already precedent within configure.ac for a check on the
> > existence of gcc/ in the source tree, see the handling of
> > -enable-werror around line 3658.
> >
> > I've tested a build of gcc on x86-64, and the same set of target
> > libraries still seem to get built.  On binutils-gdb this change
> > resolves the issues with 'make all'.
> >
> > Any thoughts?
>
> Hmm, why not use make all-binutils instead?

That absolutely would work, but sucks when I have to say 'make
all-binutils all-gas all-ld all-gdb' when 'make all' used to work.

> Otherwise this does
> look like a reasonable thing to do.

Thanks.  I'm reworking things anyway based on Thomas's feedback.

Andrew

>
> Richard.
>
> > ChangeLog:
> >
> >         * configure: Regenerate.
> >         * configure.ac (target_configdirs): Only set this when building
> >         within the gcc repository.
> > ---
> >  ChangeLog    |  6 ++
> >  configure    | 12 ++--
> >  configure.ac | 12 ++--
> >  3 files changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/configure b/configure
> > index 85ab9915402..3ef5c2b553f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -2849,9 +2849,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo ${host_libs} ${host_tools}`
> > -target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +if test -d ${srcdir}/gcc; then
> > +  target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +else
> > +  target_configdirs=""
> > +fi
> >  build_configdirs=`echo ${build_libs} ${build_tools}`
> >
> >
> > diff --git a/configure.ac b/configure.ac
> > index 1df038b04f3..d1217e3f886 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -180,9 +180,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo ${hos
[PATCHv2] top-level configure: setup target_configdirs based on repository
* Thomas Schwinge [2021-09-23 11:29:05 +0200]:

> Hi!
>
> I only had a curious look here; hope that's still useful.
>
> On 2021-09-22T16:30:42+0100, Andrew Burgess
> wrote:
> > The top-level configure script is shared between the gcc repository
> > and the binutils-gdb repository.
> >
> > The target_configdirs variable in the configure.ac script, defines
> > sub-directories that contain components that should be built for the
> > target using the target tools.
> >
> > Some components, e.g. zlib, are built as both host and target
> > libraries.
> >
> > This causes problems for binutils-gdb.  If we run 'make all' in the
> > binutils-gdb repository we end up trying to build a target version of
> > the zlib library, which requires the target compiler be available.
> > Often the target compiler isn't immediately available, and so the
> > build fails.
>
> I did wonder: shouldn't normally these target libraries be masked out via
> 'noconfigdirs' (see 'Handle --disable-<component> generically' section),
> via 'enable_[...]' being set to 'no'?  But I think I now see the problem
> here: the 'enable_[...]' variables guard both the host and target library
> build!  (... if I'm quickly understanding that correctly...)
>
> ... and you do need the host zlib, thus '$enable_zlib != no'.
>
> > The problem with zlib impacted a previous attempt to synchronise the
> > top-level configure scripts from gcc to binutils-gdb, see this thread:
> >
> >   https://sourceware.org/pipermail/binutils/2019-May/107094.html
> >
> > And I'm in the process of importing libbacktrace in to binutils-gdb,
> > which is also a host and target library, and triggers the same issues.
> >
> > I believe that for binutils-gdb, at least at the moment, there are no
> > target libraries that we need to build.
> >
> > My proposal then is to make the value of target_libraries change based
> > on which repository we are building in.  Specifically, if the source
> > tree has a gcc/ directory then we should set the target_libraries
> > variable, otherwise this variable is left empty.
> >
> > I think that if someone tries to create a single unified tree (gcc +
> > binutils-gdb in a single source tree) and then build, this change will
> > not have a negative impact, the tree still has gcc/ so we'd expect the
> > target compiler to be built, which means building the target_libraries
> > should work just fine.
> >
> > However, if the source tree lacks gcc/ then we assume the target
> > compiler isn't built/available, and so target_libraries shouldn't be
> > built.
> >
> > There is already precedent within configure.ac for a check on the
> > existence of gcc/ in the source tree, see the handling of
> > -enable-werror around line 3658.
>
> (I understand that one to just guard the 'cat $srcdir/gcc/DEV-PHASE',
> though.)
>
> > I've tested a build of gcc on x86-64, and the same set of target
> > libraries still seem to get built.  On binutils-gdb this change
> > resolves the issues with 'make all'.
> >
> > Any thoughts?
>
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -180,9 +180,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo ${host_libs} ${host_tools}`
> > -target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +if test -d ${srcdir}/gcc; then
> > +  target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +else
> > +  target_configdirs=""
> > +fi
> >  build_configdirs=`echo ${build_libs} ${build_tools}`
>
> What I see is that after this, there are still occasions where inside
> 'case "${target}"', 'target_configdirs' gets amended, so those won't be
> caught by your approach?

Good point, I'd failed to spot these.

>
> Instead of erasing 'target_configdirs' as you've posted, and
> understanding that we can't just instead add all the "offending" ones to
> 'noconfigdirs' for '! test -d "$srcdir"/gcc/' (because that would also
> disable them for host usage),

Great idea, this is what I've done in the revised patch below.

>                               I wonder if it'd make sense to turn all
> existing 'target_libraries=[...]' and 'target_tools=[...]' assignments
> and later amendments into '[...]_gcc=[...]' variants, with potentially
> further variants existing -- but probably not, because won't you always
> need the target GCC to be able to build target libraries ;-) -- and then,
> where we finally evaluate '$target_libraries' and '$target_tools', only
> evaluate the '[...]_gcc' variants
Re: [PATCH v2 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses
On Thu, 2021-09-23 at 13:55 +0200, Richard Biener wrote:
> On Wed, 22 Sep 2021, Ilya Leoshkevich wrote:
>
> > PR tree-optimization/49749 introduced code that shortens dependency
> > chains containing loop accumulators by placing them last on operand
> > lists of associative operations.
> >
> > 456.hmmer benchmark on s390 could benefit from this, however, the code
> > that needs it modifies loop accumulator before using it, and since only
> > so-called loop-carried phis are treated as loop accumulators, the
> > code in the present form doesn't really help.  According to Bill
> > Schmidt - the original author - such a conservative approach was chosen
> > so as to avoid unnecessarily swapping operands, which might cause
> > unpredictable effects.  However, giving special treatment to forms of
> > loop accumulators is acceptable.
> >
> > The definition of loop-carried phi is: it's a single-use phi, which is
> > used in the same innermost loop it's defined in, at least one argument
> > of which is defined in the same innermost loop as the phi itself.
> > Given this, it seems natural to treat single uses of such phis as phis
> > themselves.
> >
> > gcc/ChangeLog:
> >
> >         * tree-ssa-reassoc.c (biased_names): New global.
> >         (propagate_bias_p): New function.
> >         (loop_carried_phi): Remove.
> >         (propagate_rank): Propagate bias along single uses.
> >         (get_rank): Update biased_names when needed.
> > ---
> >  gcc/tree-ssa-reassoc.c | 97 --
> >  1 file changed, 64 insertions(+), 33 deletions(-)
> >
> > diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> > index 420c14e8cf5..2f7a8882aac 100644
> > --- a/gcc/tree-ssa-reassoc.c
> > +++ b/gcc/tree-ssa-reassoc.c
> > @@ -211,6 +211,10 @@ static int64_t *bb_rank;
> >  /* Operand->rank hashtable.  */
> >  static hash_map<tree, int64_t> *operand_rank;
> >
> > +/* SSA_NAMEs that are forms of loop accumulators and whose ranks need to be
> > +   biased.  */
> > +static auto_bitmap biased_names;
> > +
> >  /* Vector of SSA_NAMEs on which after reassociate_bb is done with
> >     all basic blocks the CFG should be adjusted - basic blocks
> >     split right after that SSA_NAME's definition statement and before
> > @@ -256,6 +260,50 @@ reassoc_remove_stmt (gimple_stmt_iterator *gsi)
> >     the rank difference between two blocks.  */
> >  #define PHI_LOOP_BIAS (1 << 15)
> >
> > +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of the STMT's
> > +   operands to the STMT's left-hand side.  The goal is to preserve bias in code
> > +   like this:
> > +
> > +     x_1 = phi(x_0, x_2)
> > +     a = x_1 | 1
> > +     b = a ^ 2
> > +     .MEM = b
> > +     c = b + d
> > +     x_2 = c + e
> > +
> > +   That is, we need to preserve bias along single-use chains originating from
> > +   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are considered to be
> > +   uses, because only they participate in rank propagation.  */
> > +static bool
> > +propagate_bias_p (gimple *stmt)
> > +{
> > +  use_operand_p use;
> > +  imm_use_iterator use_iter;
> > +  gimple *single_use_stmt = NULL;
> > +
> > +  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt))
> > +    {
> > +      gimple *current_use_stmt = USE_STMT (use);
> > +
> > +      if (is_gimple_assign (current_use_stmt)
> > +          && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == SSA_NAME)
> > +        {
> > +          if (single_use_stmt != NULL)
>
> what if single_use_stmt == current_use_stmt?  We might have two
> uses on a stmt after all - should that still be biased?  I guess not
> and thus the check is correct?

Come to think of it, it should be ok to bias it.  Things like

  x = x + x

are fine (this particular case can be transformed into something else
earlier, but I think the overall point still holds).

> > +            return false;
> > +          single_use_stmt = current_use_stmt;
> > +        }
> > +    }
> > +
> > +  if (single_use_stmt == NULL)
> > +    return false;
> > +
> > +  if (gimple_bb (stmt)->loop_father
> > +      != gimple_bb (single_use_stmt)->loop_father)
> > +    return false;
> > +
> > +  return true;
> > +}
> > +
> >  /* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
> >     an innermost loop, and the phi has only a single use which is inside
> >     the loop, then the rank is the block rank of the loop latch plus an
> > @@ -313,46 +361,23 @@ phi_rank (gimple *stmt)
> >    return bb_rank[bb->index];
> >  }
> >
> > -/* If EXP is an SSA_NAME defined by a PHI statement that represents a
> > -   loop-carried dependence of an innermost loop, return TRUE; else
> > -   return FALSE.  */
> > -static bool
> > -loop_carried_phi (tree exp)
> > -{
> > -  gimple *phi_stmt;
> > -  int64_t block_rank;
> > -
> > -  if (TREE_CODE (exp) != SSA_NAME
> > -      || SSA_NAME_IS_DEFAULT_DEF (exp))
> > -    return fals
[PATCH] top-level: merge Makefile.def patches from binutils-gdb repository
This commit back-ports two patches to Makefile.def from the
binutils-gdb repository; these patches were committed over there
without first being merged in to the gcc repository.

These commits all relate to dependencies for binutils-gdb modules, so
should have no impact on gcc.  I tested a gcc build/install on x86-64
GNU/Linux, and everything looked OK.

The two patches being backported are binutils-gdb commits:

  commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues)
  Date:   Mon Oct 12 16:04:32 2020 +0100

      gdb/gdbserver: add dependencies for distclean-gnulib

And

  commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c
  Date:   Thu Mar 18 12:37:52 2021 +

      Add install dependencies for ld -> bfd and libctf -> bfd

OK to merge?

2021-09-07  Andrew Burgess

        Merge from binutils-gdb:

        2021-09-08  Nick Alcock

        PR libctf/27482
        * Makefile.def: Add install-bfd dependencies for install-libctf and
        install-ld, and install-strip-bfd dependencies for
        install-strip-libctf and install-strip-ld; move the install-ld
        dependency on install-libctf to join it.
        * Makefile.in: Regenerated.

        And:

        2020-10-14  Andrew Burgess

        * Makefile.in: Rebuild.
        * Makefile.def: Make distclean-gnulib depend on distclean-gdb and
        distclean-gdbserver.
---
 ChangeLog    | 19 +++
 Makefile.def | 14 ++
 Makefile.in  |  8 
 3 files changed, 41 insertions(+)

diff --git a/Makefile.def b/Makefile.def
index de3e0052106..143a6b469b2 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -471,6 +471,14 @@ dependencies = { module=all-ld; on=all-libctf; };
 dependencies = { module=install-binutils; on=install-opcodes; };
 dependencies = { module=install-strip-binutils; on=install-strip-opcodes; };
 
+// Likewise for ld, libctf, and bfd.
+dependencies = { module=install-libctf; on=install-bfd; };
+dependencies = { module=install-ld; on=install-bfd; };
+dependencies = { module=install-ld; on=install-libctf; };
+dependencies = { module=install-strip-libctf; on=install-strip-bfd; };
+dependencies = { module=install-strip-ld; on=install-strip-bfd; };
+dependencies = { module=install-strip-ld; on=install-strip-libctf; };
+
 // libopcodes depends on libbfd
 dependencies = { module=install-opcodes; on=install-bfd; };
 dependencies = { module=install-strip-opcodes; on=install-strip-bfd; };
@@ -564,6 +572,12 @@ dependencies = { module=configure-libctf; on=all-zlib; };
 dependencies = { module=configure-libctf; on=all-libiconv; };
 dependencies = { module=check-libctf; on=all-ld; };
 
+// The Makefiles in gdb and gdbserver pull in a file that configure
+// generates in the gnulib directory, so distclean gnulib only after
+// gdb and gdbserver.
+dependencies = { module=distclean-gnulib; on=distclean-gdb; };
+dependencies = { module=distclean-gnulib; on=distclean-gdbserver; };
+
 // Warning, these are not well tested.
 dependencies = { module=all-bison; on=all-intl; };
 dependencies = { module=all-bison; on=all-build-texinfo; };
diff --git a/Makefile.in b/Makefile.in
index 61af99dc75a..7613da5a378 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -60763,6 +60763,12 @@ all-stageautoprofile-ld: maybe-all-stageautoprofile-libctf
 all-stageautofeedback-ld: maybe-all-stageautofeedback-libctf
 install-binutils: maybe-install-opcodes
 install-strip-binutils: maybe-install-strip-opcodes
+install-libctf: maybe-install-bfd
+install-ld: maybe-install-bfd
+install-ld: maybe-install-libctf
+install-strip-libctf: maybe-install-strip-bfd
+install-strip-ld: maybe-install-strip-bfd
+install-strip-ld: maybe-install-strip-libctf
 install-opcodes: maybe-install-bfd
 install-strip-opcodes: maybe-install-strip-bfd
 configure-gas: maybe-configure-intl
@@ -61131,6 +61137,8 @@ check-stagetrain-libctf: maybe-all-stagetrain-ld
 check-stagefeedback-libctf: maybe-all-stagefeedback-ld
 check-stageautoprofile-libctf: maybe-all-stageautoprofile-ld
 check-stageautofeedback-libctf: maybe-all-stageautofeedback-ld
+distclean-gnulib: maybe-distclean-gdb
+distclean-gnulib: maybe-distclean-gdbserver
 all-bison: maybe-all-build-texinfo
 all-flex: maybe-all-build-bison
 all-flex: maybe-all-m4
-- 
2.25.4
Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.
ping

On Mon, Sep 13, 2021 at 11:19 PM Hongtao Liu wrote:
>
> On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
> wrote:
> >
> >
> >
> > On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> > >    Currently, for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
> > > the optimizer won't simplify if op0 has a different mode from M, but that's
> > > too restrictive and prevents the optimization below; the condition can be
> > > relaxed so that op0 only needs to have the same inner mode as M.
> > >
> > > (set (reg:V2DF 87 [ xx ])
> > >     (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> > >             (parallel [
> > >                     (const_int 2 [0x2])
> > >                 ]))
> > >         (vec_select:DF (reg:V4DF 92)
> > >             (parallel [
> > >                     (const_int 3 [0x3])
> > >                 ]
> > >
> > >    Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >    Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >       * simplify-rtx.c
> > >       (simplify_context::simplify_binary_operation_1): Relax
> > >       condition of simplifying (vec_concat:M (vec_select op0
> > >       index0)(vec_select op1 index1)) to allow different modes
> > >       between op0 and M, but have same inner mode.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >       * gcc.target/i386/vect-rebuild.c:
> > >       * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> > Funny, I was looking at something rather similar recently, but never
> > pushed on it because we were going to need too many entries in the
> > parallel selector.
> >
> > I'm not convinced that we need the inner mode to match anything.  As
> > long as the vec_concat's mode is twice the size of the vec_select modes
> > and the vec_select mode is <= the mode of its operands ISTM this is
> > fine.  We might want the modes of the vec_select to match, but I don't
> > think that's strictly necessary either, they just need to be the same
> > size.  ie, we could have something like
> If they're different sizes, i.e, something like below should also be legal?
> (vec_concat:V8SF (vec_select:V2SF (reg:V16SF)) (vec_select:V6SF (reg:V16SF)))
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
> >
> > I'm not sure if that level of generality is useful though.  If we want
> > the modes of the vec_selects to match I think we could still support
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
> >
> > Thoughts?
> >
> > jeff
> >
> > Jeff
> >
> >
> > > ---
> > >  gcc/simplify-rtx.c                            |  3 ++-
> > >  .../gcc.target/i386/avx512f-vect-rebuild.c    | 21 +++
> > >  gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> > >  3 files changed, 24 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > >
> > > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > > index ebad5cb5a79..16286befd79 100644
> > > --- a/gcc/simplify-rtx.c
> > > +++ b/gcc/simplify-rtx.c
> > > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
> > >       if (GET_CODE (trueop0) == VEC_SELECT
> > >           && GET_CODE (trueop1) == VEC_SELECT
> > >           && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > > -         && GET_MODE (XEXP (trueop0, 0)) == mode)
> > > +         && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > > +            == GET_MODE_INNER(mode))
> > >         {
> > >           rtx par0 = XEXP (trueop0, 1);
> > >           rtx par1 = XEXP (trueop1, 1);
> > > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > new file mode 100644
> > > index 000..aef6855aa46
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > > +
> > > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > > +
> > > +v2df h (v4df x)
> > > +{
> > > +  v2df xx = { x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +v4df f2 (v4df x)
> > > +{
> > > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not "unpck" } } */
> > > +/* { dg-final { scan-assembler-not "valign" } } */
> > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > index 570967f6b5c..8e85b98bf1d 100644
> > > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > @@ -30,4 +30,4 @@ v2df h (v4df x)
> > >
> > >  /* { dg-final { scan-assembler-not "unpck" } } */
> > >  /* { dg-fin
*PING* [PATCH] c++: fix cases of core1001/1322 by not dropping cv-qualifier of function parameter of type of typename or decltype[PR101402,PR102033,PR102034,PR102039,PR102044]
These bugs are considered duplicate cases of PR51851 which has been suspended since 2012, an issue known as "core1001/1322". Considering this background, it deserves a long comment to explain. Many people believed the root cause of this family of bugs is related with the nature of how and when the array type is converted to pointer type during function signature is calculated. This is true, but we may need to go into details to understand the exact reason. There is a pattern for these bugs(PR101402,PR102033,PR102034,PR102039). In the template function declaration, the function parameter is consisted of a "const" followed by a typename-type which is actually an array type. According to standard, function signature is calculated by dropping so-called "top-level-cv-qualifier". As a result, the templater specialization complains no matching to declaration can be found because specialization has const and template function declaration doesn't have const which is dropped as mentioned. Obviously the template function declaration should NOT drop the const. But why? Let's review the procedure of standard first. (https://timsong-cpp.github.io/cppwp/dcl.fct#5.sentence-3) "After determining the type of each parameter, any parameter of type “array of T” or of function type T is adjusted to be “pointer to T”. After producing the list of parameter types, any top-level cv-qualifiers modifying a parameter type are deleted when forming the function type." Please note the action of deleting top-level cv-qualifiers happens at last stage after array type is converted to pointer type. More importantly, there are two conditions: a) Each type must be able to be determined. b) The cv-qualifier must be top-level. Let's analysis if these two conditions can be met one by one. 1) Keyword "typename" indicates inside template it involves dependent name (https://timsong-cpp.github.io/cppwp/n4659/temp.res#2) for which the name lookup can be postponed until template instantiation. 
Clearly the type of a dependent name cannot be determined without name lookup. So we can NOT proceed to the next step until the concrete template argument type is determined during specialization.

2) After “array of T” is converted to “pointer to T”, the cv-qualifiers are no longer top-level! Unfortunately the standard has no definition of "top-level". Mr. Dan Saks's articles (https://www.dansaks.com/articles.shtml) are a tremendous help! In particular, this wonderful paper (https://www.dansaks.com/articles/2000-02%20Top-Level%20cv-Qualifiers%20in%20Function%20Parameters.pdf) discusses this topic in detail. In one short sentence, the "const" before an array type is NOT a top-level cv-qualifier and should NOT be dropped.

So, understanding the root cause makes the fix very clear: let's NOT drop the cv-qualifier for a typename-type inside a template. Leave this task to template substitution later, when template specialization fixes the template argument types. Similarly, inside a template, "decltype" may also involve a dependent name, and the best strategy for the parser is to preserve the original declaration and postpone the task until template substitution.

Here is an interesting observation to share. Originally my fix tried to use the function "resolve_typename_type" to see whether the "typename-type" is indeed an array type, so as to decide whether the const should be dropped. It works for the cases of PR101402 and PR102033 (with a small fix of the function), but cannot succeed on the cases of PR102034 and PR102039. PR102039 in particular is impossible because it depends on the template argument. This helped me realize that the parser should not do any work if it cannot be 100% successful. All can wait.

Finally, I want to acknowledge other efforts to tackle this core 1001/1322 from PR92010, which is an irreplaceable, different approach from this fix: it rebuilds the template function signature during the template substitution stage.
After all, this fix can only deal with dependent types introduced by "typename" or "decltype", which is not the case for PR92010.

gcc/cp/ChangeLog:

2021-08-30  qingzhe huang

	* decl.c (grokparms):

gcc/testsuite/ChangeLog:

2021-08-30  qingzhe huang

	* g++.dg/parse/pr101402.C: New test.
	* g++.dg/parse/pr102033.C: New test.
	* g++.dg/parse/pr102034.C: New test.
	* g++.dg/parse/pr102039.C: New test.
	* g++.dg/parse/pr102044.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index e0c603aaab6..940c43ce707 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14384,7 +14384,16 @@ grokparms (tree parmlist, tree *parms)
 	  /* Top-level qualifiers on the parameters are
 	     ignored for function types.  */
-	  type = cp_build_qualified_type (type, 0);
+
+	  int type_quals = 0;
+	  /* Inside template declaration, typename and decltype indicating
+	     dependent name and cv-qualifier are preserved until
+	     template instantiation.
+	     PR101402/PR102033/PR
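As a rough illustration of the two standard rules quoted in the message above, here is a minimal sketch outside of any template. The alias `Arr` and the function `signature_rules_hold` are names invented for this illustration and do not come from the patch or its tests. It only checks how a conforming compiler forms function types when the parameter type is already known; it does not reproduce the dependent-name cases from the PRs.

```cpp
#include <type_traits>

// Hypothetical alias, for illustration only (not from the patch).
using Arr = int[3];

// A top-level const on a non-array parameter is deleted when the
// function type is formed, so both spellings name the same type.
static_assert(std::is_same<void(const int), void(int)>::value,
              "top-level const is dropped");

// A const applied to an array type qualifies the elements. After the
// array-to-pointer adjustment the parameter is 'const int*', where the
// const is not top-level and therefore survives.
static_assert(std::is_same<void(const Arr), void(const int*)>::value,
              "const on array elements is kept");
static_assert(!std::is_same<void(const Arr), void(int*)>::value,
              "dropping the const would change the signature");

// Same checks as a value, so they can also be queried at run time.
constexpr bool signature_rules_hold()
{
  return std::is_same<void(const int), void(int)>::value
         && std::is_same<void(const Arr), void(const int*)>::value
         && !std::is_same<void(const Arr), void(int*)>::value;
}
```

The third assertion is the interesting one for this family of bugs: if the const were dropped from a parameter whose type later turns out to be an array, the declaration and the specialization would no longer have the same signature.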
*PING* Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
A reference with cv-qualifiers should be ignored instead of causing an error, because the standard accepts cv-qualified references introduced by a typedef and ignores the qualifiers. Therefore, the fix prevents GCC from reporting an error by not setting the variable "bad_quals" when the reference is introduced by a typedef. The cv-qualifier is still silently ignored.

Here I quote the spec (https://timsong-cpp.github.io/cppwp/dcl.ref#1):

"Cv-qualified references are ill-formed except when the cv-qualifiers are introduced through the use of a typedef-name ([dcl.typedef], [temp.param]) or decltype-specifier ([dcl.type.decltype]), in which case the cv-qualifiers are ignored."

	PR c++/101783

gcc/cp/ChangeLog:

2021-08-27  qingzhe huang

	* tree.c (cp_build_qualified_type_real):

gcc/testsuite/ChangeLog:

2021-08-27  qingzhe huang

	* g++.dg/parse/pr101783.C: New test.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 8840932dba2..7aa4318a574 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1356,12 +1356,22 @@ cp_build_qualified_type_real (tree type,
   /* A reference or method type shall not be cv-qualified.
      [dcl.ref], [dcl.fct].  This used to be an error, but as of DR 295
      (in CD1) we always ignore extra cv-quals on functions.  */
+
+  /* PR 101783
+     Cv-qualified references are ill-formed except when the cv-qualifiers
+     are introduced through the use of a typedef-name ([dcl.typedef],
+     [temp.param]) or decltype-specifier ([dcl.type.decltype]),
+     in which case the cv-qualifiers are ignored.
+   */
   if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE)
       && (TYPE_REF_P (type)
 	  || FUNC_OR_METHOD_TYPE_P (type)))
     {
-      if (TYPE_REF_P (type))
+      // do NOT set bad_quals when non-method reference is introduced by typedef.
+      if (TYPE_REF_P (type)
+	  && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type)))
 	bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
+      // non-method reference introduced by typedef is also dropped silently
       type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
     }

diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C b/gcc/testsuite/g++.dg/parse/pr101783.C
new file mode 100644
index 000..4e0a435dd0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/pr101783.C
@@ -0,0 +1,5 @@
+template struct A{
+  typedef T& Type;
+};
+template void f(const typename A::Type){}
+template <> void f(const typename A::Type){}
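The [dcl.ref] rule quoted in this message can be observed directly with a type trait. The sketch below is only an illustration: `RefHolder` and `param_is_plain_reference` are hypothetical names, and the code deliberately avoids the explicit-specialization form that the PR is about, so it shows the language rule and is not a reproducer for the bug.

```cpp
#include <type_traits>

// Hypothetical class template whose member typedef names a reference type.
template <class T>
struct RefHolder {
  typedef T& Type;
};

// The const is introduced through a typedef-name, so it is ignored:
// 'const RefHolder<int>::Type' is simply 'int&'.
static_assert(std::is_same<const RefHolder<int>::Type, int&>::value,
              "cv-qualifier on a typedef'd reference is ignored");

// The same holds when the typedef is reached through a dependent name.
template <class T>
constexpr bool param_is_plain_reference()
{
  using Param = const typename RefHolder<T>::Type;  // const is ignored
  return std::is_same<Param, T&>::value;
}
```

Writing `const int&` with the const after the `&` (a cv-qualified reference spelled directly) would be ill-formed; only the typedef-name and decltype routes get the "ignored" treatment.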
[PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.
Hi: Related discussion in [1] and PR. Bootstrapped and regtest on x86_64-linux-gnu{-m32,}. Ok for trunk? [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html gcc/ChangeLog: PR target/102464 * config/i386/i386.c (ix86_optab_supported_p): Return true for HFmode. * match.pd: Simplify (_Float16) ceil ((double) x) to __builtin_ceilf16 (a) when a is _Float16 type and direct_internal_fn_supported_p. gcc/testsuite/ChangeLog: * gcc.target/i386/pr102464.c: New test. --- gcc/config/i386/i386.c | 20 +++- gcc/match.pd | 28 + gcc/testsuite/gcc.target/i386/pr102464.c | 39 3 files changed, 79 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ba89e111d28..3767fe9806d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, machine_mode, return opt_type == OPTIMIZE_FOR_SPEED; case rint_optab: - if (SSE_FLOAT_MODE_P (mode1) - && TARGET_SSE_MATH - && !flag_trapping_math - && !TARGET_SSE4_1) + if (mode1 == HFmode) + return true; + else if (SSE_FLOAT_MODE_P (mode1) + && TARGET_SSE_MATH + && !flag_trapping_math + && !TARGET_SSE4_1) return opt_type == OPTIMIZE_FOR_SPEED; return true; case floor_optab: case ceil_optab: case btrunc_optab: - if (SSE_FLOAT_MODE_P (mode1) - && TARGET_SSE_MATH - && !flag_trapping_math - && TARGET_SSE4_1) + if (mode1 == HFmode) + return true; + else if (SSE_FLOAT_MODE_P (mode1) + && TARGET_SSE_MATH + && !flag_trapping_math + && TARGET_SSE4_1) return true; return opt_type == OPTIMIZE_FOR_SPEED; diff --git a/gcc/match.pd b/gcc/match.pd index a9791ceb74a..9ccec8b6ce3 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -6191,6 +6191,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (froms (convert float_value_p@0)) (convert (tos @0) +#if GIMPLE +(match float16_value_p + @0 + (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node))) +(for froms (BUILT_IN_TRUNCL 
BUILT_IN_TRUNC BUILT_IN_TRUNCF + BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF + BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF + BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF + BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF + BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF + BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF) + tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC + IFN_FLOOR IFN_FLOOR IFN_FLOOR + IFN_CEIL IFN_CEIL IFN_CEIL + IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN + IFN_ROUND IFN_ROUND IFN_ROUND + IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT + IFN_RINT IFN_RINT IFN_RINT) + /* (_Float16) round ((doube) x) -> __built_in_roundf16 (x), etc., +if x is a _Float16. */ + (simplify + (convert (froms (convert float16_value_p@0))) + (if (types_match (type, TREE_TYPE (@0)) + && direct_internal_fn_supported_p (as_internal_fn (tos), +type, OPTIMIZE_FOR_BOTH)) + (tos @0 +#endif + (for froms (XFLOORL XCEILL XROUNDL XRINTL) tos (XFLOOR XCEIL XROUND XRINT) /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double. */ diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c b/gcc/testsuite/gcc.target/i386/pr102464.c new file mode 100644 index 000..e3e060ee80b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr102464.c @@ -0,0 +1,39 @@ +/* PR target/102464. 
*/ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512fp16" } */ + +#define FOO(FUNC,SUFFIX) \ + _Float16 \ + foo_##FUNC##_##SUFFIX (_Float16 a) \ + {\ +return __builtin_##FUNC##SUFFIX (a); \ + } + +FOO (roundeven, f16); +FOO (roundeven, f); +FOO (roundeven, ); +FOO (roundeven, l); +FOO (trunc, f16); +FOO (trunc, f); +FOO (trunc, ); +FOO (trunc, l); +FOO (ceil, f16); +FOO (ceil, f); +FOO (ceil, ); +FOO (ceil, l); +FOO (floor, f16); +FOO (floor, f); +FOO (floor, ); +FOO (floor, l); +FOO (nearbyint, f16); +FOO (nearbyint, f); +FOO (nearbyint, ); +FOO (nearbyint, l); +FOO (rint, f16); +FOO (rint, f); +FOO (rint, ); +FOO (rint, l); + +/* { dg-final { scan-assembler-not "vcvtsh2s\[sd\]" } } */ +/* { dg-final { scan-assembler-not "extendhfxf" } } */ +/* { dg-final { scan-assembler-times "vrndscalesh\[^\n\r\]*xmm\[0-9\]" 24 } } */ -- 2.27.0
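The reason such a simplification preserves values can be sketched with `float` and `double`, which are available on every host, instead of `_Float16`, whose availability depends on the compiler and target. This substitution is an assumption made purely for illustration, and the helper names below are hypothetical. Rounding a narrow value to an integer after widening it yields a result that is exactly representable in the narrow type, so the widen/round/narrow sequence and the direct narrow rounding agree.

```cpp
#include <cmath>

// Widen to double, round up, narrow again: the shape the match.pd
// pattern looks for (shown here with float standing in for _Float16).
float ceil_via_double(float x)
{
  return static_cast<float>(std::ceil(static_cast<double>(x)));
}

// Round directly in the narrow type: the shape the pattern produces.
float ceil_direct(float x)
{
  return std::ceil(x);
}

// True when both forms give the same value for a non-NaN input.
bool same_result(float x)
{
  return ceil_via_double(x) == ceil_direct(x);
}
```

NaN inputs are left out of `same_result` only because NaN never compares equal to itself; that is a limitation of this sketch, not a statement about the patch.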
Re: [PATCH] Avoid invalid loop transformations in jump threading registry.
On 9/23/21 6:10 PM, Jeff Law wrote: On 9/23/2021 5:15 AM, Aldy Hernandez wrote:

My upcoming improvements to the forward jump threader make it thread more aggressively. In investigating some "regressions", I noticed that it has always allowed threading through empty latches and across loop boundaries. As we have discussed recently, this should be avoided until after loop optimizations have run their course. Note that this wasn't much of a problem before because DOM/VRP couldn't find these opportunities, but with a smarter solver, we trip over them more easily.

We used to be much more aggressive in this space -- but we removed the equivalency tracking on backedges in the main part of DOM which had the side effect of reducing the number of threads related to back edges in loops.

I thought we couldn't thread through back edges at all in the old threader, or are we talking about the same thing? We have a hard fail on backedge thread attempts for anything but the backward threader and its custom copier.

Of course that was generally a positive thing given the issues we've been discussing.

Yeah. These tweaks have reduced the number of jump threads in my bootstrap .ii by 6%, so a considerable amount. But they were problematic threading paths to begin with. For example, it removed the regression introduced by the backward threader rewrite in gcc.dg/vect/bb-slp-16.c. Interestingly, for all the checks we do in the backward threader, some threading through loop boundaries seeps through. In particular the check for loop crossing excludes the taken edge, which IMO is a mistake. If the entire path is in a loop, but the taken edge points to another loop, that by definition is a loop crossing. Note, we have an exception for the first block in a path being in another loop, but that's something else ;-). Anywhoo... we're catching it now. We really should clean this up and merge the differing implementations. But I'm way over my time budget for this ;-).

Because the forward threader doesn't have an independent localized cost model like the new threader (profitable_path_p), it is difficult to catch these things at discovery. However, we can catch them at registration time, with the added benefit that all the threaders (forward and backward) can share the handcuffs.

In an ideal world profitability and correctness would be separated -- but they're still intertwined and I don't think this makes that situation particularly worse. And I do like having a single choke point.

Huh, I hadn't thought about it that way, but you're right. The profitable_path_p code is catching both correctness as well as profitability issues. It seems all the profitability stuff is keyed by param*fsm* compile options, though. It should be easy enough to separate.

Obviously you're cleaning this up, so I think a significant degree of freedom should be given here

Much appreciated.
Aldy
Re: [PATCH] top-level: merge Makefile.def patches from binutils-gdb repository
On Fri, Sep 24, 2021 at 12:49 PM Andrew Burgess wrote: > > This commit back-ports two patches to Makefile.def from the > binutils-gdb repository, these patches were committed over there > without first being merged in to the gcc repository. > > These commits all relate to dependencies for binutils-gdb modules, so > should have no impact on gcc, I tested a gcc build/install on x86-64 > GNU/Linux, and everything looked OK. > > The two patches being backported are binutils-gdb commits: > > commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues) > Date: Mon Oct 12 16:04:32 2020 +0100 > > gdb/gdbserver: add dependencies for distclean-gnulib > > And > > commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c > Date: Thu Mar 18 12:37:52 2021 + > > Add install dependencies for ld -> bfd and libctf -> bfd > > OK to merge? OK. > 2021-09-07 Andrew Burgess > > Merge from binutils-gdb: > 2021-09-08 Nick Alcock > > PR libctf/27482 > * Makefile.def: Add install-bfd dependencies for install-libctf and > install-ld, and install-strip-bfd dependencies for > install-strip-libctf and install-strip-ld; move the install-ld > dependency on install-libctf to join it. > * Makefile.in: Regenerated. > > And: > 2020-10-14 Andrew Burgess > > * Makefile.in: Rebuild. > * Makefile.def: Make distclean-gnulib depend on distclean-gdb and > distclean-gdbserver. > --- > ChangeLog| 19 +++ > Makefile.def | 14 ++ > Makefile.in | 8 > 3 files changed, 41 insertions(+) > > diff --git a/Makefile.def b/Makefile.def > index de3e0052106..143a6b469b2 100644 > --- a/Makefile.def > +++ b/Makefile.def > @@ -471,6 +471,14 @@ dependencies = { module=all-ld; on=all-libctf; }; > dependencies = { module=install-binutils; on=install-opcodes; }; > dependencies = { module=install-strip-binutils; on=install-strip-opcodes; }; > > +// Likewise for ld, libctf, and bfd. 
> +dependencies = { module=install-libctf; on=install-bfd; }; > +dependencies = { module=install-ld; on=install-bfd; }; > +dependencies = { module=install-ld; on=install-libctf; }; > +dependencies = { module=install-strip-libctf; on=install-strip-bfd; }; > +dependencies = { module=install-strip-ld; on=install-strip-bfd; }; > +dependencies = { module=install-strip-ld; on=install-strip-libctf; }; > + > // libopcodes depends on libbfd > dependencies = { module=install-opcodes; on=install-bfd; }; > dependencies = { module=install-strip-opcodes; on=install-strip-bfd; }; > @@ -564,6 +572,12 @@ dependencies = { module=configure-libctf; on=all-zlib; }; > dependencies = { module=configure-libctf; on=all-libiconv; }; > dependencies = { module=check-libctf; on=all-ld; }; > > +// The Makefiles in gdb and gdbserver pull in a file that configure > +// generates in the gnulib directory, so distclean gnulib only after > +// gdb and gdbserver. > +dependencies = { module=distclean-gnulib; on=distclean-gdb; }; > +dependencies = { module=distclean-gnulib; on=distclean-gdbserver; }; > + > // Warning, these are not well tested. 
> dependencies = { module=all-bison; on=all-intl; }; > dependencies = { module=all-bison; on=all-build-texinfo; }; > diff --git a/Makefile.in b/Makefile.in > index 61af99dc75a..7613da5a378 100644 > --- a/Makefile.in > +++ b/Makefile.in > @@ -60763,6 +60763,12 @@ all-stageautoprofile-ld: > maybe-all-stageautoprofile-libctf > all-stageautofeedback-ld: maybe-all-stageautofeedback-libctf > install-binutils: maybe-install-opcodes > install-strip-binutils: maybe-install-strip-opcodes > +install-libctf: maybe-install-bfd > +install-ld: maybe-install-bfd > +install-ld: maybe-install-libctf > +install-strip-libctf: maybe-install-strip-bfd > +install-strip-ld: maybe-install-strip-bfd > +install-strip-ld: maybe-install-strip-libctf > install-opcodes: maybe-install-bfd > install-strip-opcodes: maybe-install-strip-bfd > configure-gas: maybe-configure-intl > @@ -61131,6 +61137,8 @@ check-stagetrain-libctf: maybe-all-stagetrain-ld > check-stagefeedback-libctf: maybe-all-stagefeedback-ld > check-stageautoprofile-libctf: maybe-all-stageautoprofile-ld > check-stageautofeedback-libctf: maybe-all-stageautofeedback-ld > +distclean-gnulib: maybe-distclean-gdb > +distclean-gnulib: maybe-distclean-gdbserver > all-bison: maybe-all-build-texinfo > all-flex: maybe-all-build-bison > all-flex: maybe-all-m4 > -- > 2.25.4 >
[PATCH] aarch64: Fix type qualifiers for qtbl1 and qtbx1 Neon builtins
Hi, This patch fixes type qualifiers for the qtbl1 and qtbx1 Neon builtins and removes the casts from the Neon intrinsic function bodies that use these builtins. Regression tested and bootstrapped on aarch64-none-linux-gnu - no issues. Ok for master? Thanks, Jonathan --- gcc/ChangeLog: 23-09-2021 Jonathan Wright * config/aarch64/aarch64-builtins.c (TYPES_BINOP_PPU): Define new type qualifier enum. (TYPES_TERNOP_SSSU): Likewise. (TYPES_TERNOP_PPPU): Likewise. * config/aarch64/aarch64-simd-builtins.def: Define PPU, SSU, PPPU and SSSU builtin generator macros for qtbl1 and qtbx1 Neon builtins. * config/aarch64/arm_neon.h (vqtbl1_p8): Use type-qualified builtin and remove casts. (vqtbl1_s8): Likewise. (vqtbl1q_p8): Likewise. (vqtbl1q_s8): Likewise. (vqtbx1_s8): Likewise. (vqtbx1_p8): Likewise. (vqtbx1q_s8): Likewise. (vqtbx1q_p8): Likewise. (vtbl1_p8): Likewise. (vtbl2_p8): Likewise. (vtbx2_p8): Likewise. rb14884.patch Description: rb14884.patch
FW: [PING] Re: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64
Hi, Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html The patch is attached as text for ease of use. Is there anything that needs to change? Ok for master? If OK, can it be committed for me, I have no commit rights. Jirui Wu -Original Message- From: Jirui Wu Sent: Friday, September 10, 2021 10:14 AM To: Richard Biener Cc: Richard Biener ; Andrew Pinski ; Richard Sandiford ; i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers Subject: [PING] Re: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64 Hi, Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html Ok for master? If OK, can it be committed for me, I have no commit rights. Jirui Wu -Original Message- From: Jirui Wu Sent: Friday, September 3, 2021 12:39 PM To: 'Richard Biener' Cc: Richard Biener ; Andrew Pinski ; Richard Sandiford ; i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64 Ping -Original Message- From: Jirui Wu Sent: Friday, August 20, 2021 4:28 PM To: Richard Biener Cc: Richard Biener ; Andrew Pinski ; Richard Sandiford ; i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64 > -Original Message- > From: Richard Biener > Sent: Friday, August 20, 2021 8:15 AM > To: Jirui Wu > Cc: Richard Biener ; Andrew Pinski > ; Richard Sandiford ; > i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers > > Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for > (double)(int) under -ffast-math on aarch64 > > On Thu, 19 Aug 2021, Jirui Wu wrote: > > > Hi all, > > > > This patch generates FRINTZ instruction to optimize type casts. > > > > The changes in this patch covers: > > * Generate FRINTZ for (double)(int) casts. > > * Add new test cases. > > > > The intermediate type is not checked according to the C99 spec. 
> > Overflow of the integral part when casting floats to integers causes > undefined behavior. > > As a result, optimization to trunc() is not invalid. > > I've confirmed that Boolean type does not match the matching condition. > > > > Regtested on aarch64-none-linux-gnu and no issues. > > > > Ok for master? If OK can it be committed for me, I have no commit rights. > > +/* Detected a fix_trunc cast inside a float type cast, > + use IFN_TRUNC to optimize. */ > +#if GIMPLE > +(simplify > + (float (fix_trunc @0)) > + (if (direct_internal_fn_supported_p (IFN_TRUNC, type, > + OPTIMIZE_FOR_BOTH) > + && flag_unsafe_math_optimizations > + && type == TREE_TYPE (@0)) > > types_match (type, TREE_TYPE (@0)) > > please. Please perform cheap tests first (the flag test). > > + (IFN_TRUNC @0))) > +#endif > > why only for GIMPLE? I'm not sure flag_unsafe_math_optimizations is a > good test here. If you say we can use undefined behavior of any > overflow of the fix_trunc operation what do we guard here? > If it's Inf/NaN input then flag_finite_math_only would be more > appropriate, if it's behavior for -0. (I suppose trunc (-0.0) == -0.0 > and thus "wrong") then a && !HONOR_SIGNED_ZEROS (type) is missing > instead. If it's setting of FENV state and possibly trapping on > overflow (but it's undefined?!) then flag_trapping_math covers the > latter but we don't have any flag for eliding FENV state affecting > transforms, so there the kitchen-sink flag_unsafe_math_optimizations might > apply. > > So - which is it? > This change is only for GIMPLE because we can't test for the optab support without being in GIMPLE. direct_internal_fn_supported_p is defined only for GIMPLE. IFN_TRUNC's documentation mentions nothing for zero, NaNs/inf inputs. So I think the correct guard is just flag_fp_int_builtin_inexact. !flag_trapping_math because the operation can only still raise inexacts. The new pattern is moved next to the place you mentioned. Ok for master? 
If OK can it be committed for me, I have no commit rights. Thanks, Jirui > Note there's also the pattern > > /* Handle cases of two conversions in a row. */ (for ocvt (convert > float > fix_trunc) (for icvt (convert float) > (simplify >(ocvt (icvt@1 @0)) >(with > { > ... > > which is related so please put the new pattern next to that (the set > of conversions handled there does not include (float (fix_trunc @0))) > > Thanks, > Richard. > > > Thanks, > > Jirui > > > > gcc/ChangeLog: > > > > * match.pd: Generate IFN_TRUNC. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/aarch64/merge_trunc1.c: New test. > > > > > -Original Message- > > > From: Richard Biener > > > Sent: Tuesday, August 17, 2021 9:13 AM > > > To: Andrew Pinski > > > Cc: Jirui Wu ; Richard Sandiford > > > ; i...@airs.com; > > > gcc-patches@gcc.gnu.org; rguent.
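The signed-zero point raised in the review can be seen with a small experiment. This is only a sketch under default (non-fast-math) floating-point settings; the helper names are hypothetical and nothing here is taken from the patch. For inputs whose integral part fits in `int`, the cast round trip and `trunc` agree in value, but for inputs in (-1, 0) they differ in the sign of the zero they return.

```cpp
#include <cmath>

// The source-level pattern under discussion: double -> int -> double.
double cast_round_trip(double x)
{
  return static_cast<double>(static_cast<int>(x));
}

// Value comparison; note that +0.0 == -0.0 compares equal.
bool same_value(double x)
{
  return cast_round_trip(x) == std::trunc(x);
}

// Sign-bit comparison; this is where the two forms can disagree.
bool same_sign_bit(double x)
{
  return std::signbit(cast_round_trip(x)) == std::signbit(std::trunc(x));
}
```

For example, with `x = -0.5` the cast round trip gives `+0.0` while `trunc` gives `-0.0`, which is why a guard on signed zeros (or a flag that waives them) comes up in the thread.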
Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.
On Mon, Sep 13, 2021 at 04:24:13PM +0200, Richard Biener wrote:
> On Mon, Sep 13, 2021 at 4:10 PM Jeff Law via Gcc-patches wrote:
> > I'm not convinced that we need the inner mode to match anything. As
> > long as the vec_concat's mode is twice the size of the vec_select modes
> > and the vec_select mode is <= the mode of its operands ISTM this is
> > fine. We might want the modes of the vec_select to match, but I don't
> > think that's strictly necessary either, they just need to be the same
> > size. ie, we could have something like
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
> >
> > I'm not sure if that level of generality is useful though. If we want
> > the modes of the vec_selects to match I think we could still support
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
> >
> > Thoughts?
>
> I think the component or scalar modes of the elements to concat need to match
> the component mode of the result. I don't think your example involving
> a cat of DF and DI is too useful - but you could use a subreg around the DI
> value ;)

I agree. If you want to concatenate components of different modes, you should change mode first, using subregs for example.

("Inner mode" is something of subregs btw, "component mode" is what this concept of modes is called, the name GET_MODE_INNER is a bit confusing though :-) )

Btw, the documentation for "concat" says

@findex concat
@item (concat@var{m} @var{rtx} @var{rtx})
This RTX represents the concatenation of two other RTXs. This is used for complex values. It should only appear in the RTL attached to declarations and during RTL generation. It should not appear in the ordinary insn chain.

which needs some updating (in many ways).

Segher
Re: [PATCH] Avoid invalid loop transformations in jump threading registry.
On 23/09/2021 13:15, Aldy Hernandez via Gcc-patches wrote: My upcoming improvements to the forward jump threader make it thread more aggressively. In investigating some "regressions", I noticed that it has always allowed threading through empty latches and across loop boundaries. As we have discussed recently, this should be avoided until after loop optimizations have run their course. Note that this wasn't much of a problem before because DOM/VRP couldn't find these opportunities, but with a smarter solver, we trip over them more easily. Because the forward threader doesn't have an independent localized cost model like the new threader (profitable_path_p), it is difficult to catch these things at discovery. However, we can catch them at registration time, with the added benefit that all the threaders (forward and backward) can share the handcuffs. This patch is an adaptation of what we do in the backward threader, but it is not meant to catch everything we do there, as some of the restrictions there are due to limitations of the different block copiers (for example, the generic copier does not re-use existing threading paths). We could ideally remove the now redundant bits in profitable_path_p, but I would prefer not to for two reasons. First, the backward threader uses profitable_path_p as it discovers paths to avoid discovering paths in unprofitable directions. Second, I would like to merge all the forward cost restrictions into the profitability class in the backward threader, not the other way around. Alas, that reshuffling will have to wait for the next release. As usual, there are quite a few tests that needed adjustments. It seems we were quite happily threading improper scenarios. With most of them, as can be seen in pr77445-2.c, we're merely shifting the threading to after loop optimizations. Tested on x86-64 Linux. OK for trunk? p.s. "Sure, sounds like fun... how hard can improving the threaders be?" 
gcc/ChangeLog: * tree-ssa-threadupdate.c (jt_path_registry::cancel_invalid_paths): New. (jt_path_registry::register_jump_thread): Call cancel_invalid_paths. * tree-ssa-threadupdate.h (class jt_path_registry): Add cancel_invalid_paths. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/20030714-2.c: Adjust. * gcc.dg/tree-ssa/pr66752-3.c: Adjust. * gcc.dg/tree-ssa/pr77445-2.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust. * gcc.dg/vect/bb-slp-16.c: Adjust. After your commit r12-3876, I've noticed that some of the updated tests fail on some arm targets: FAIL: gcc:gcc.dg/tree-ssa/tree-ssa.exp=gcc.dg/tree-ssa/pr66752-3.c scan-tree-dump-not thread3 "if .flag" FAIL: gcc:gcc.dg/tree-ssa/tree-ssa.exp=gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump thread1 "Jumps threaded: 9" when cpu is: * cortex-a5 (fpu = vfpv3-d16-fp16) * cortex-m0, m3, m4, m7 and m55 (with assorted -march/-mfloat-abi) See https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/r12-3876-g4a960d548b7d7d942f316c5295f6d849b74214f5/report-build-info.html for more details (you can ignore the regressions in libstdc++, they are related to random timeouts) Thanks, Christophe --- gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c| 7 +- gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c | 19 -- gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c | 4 +- .../gcc.dg/tree-ssa/ssa-dom-thread-18.c | 4 +- .../gcc.dg/tree-ssa/ssa-dom-thread-7.c| 4 +- gcc/testsuite/gcc.dg/vect/bb-slp-16.c | 7 -- gcc/tree-ssa-threadupdate.c | 67 +++ gcc/tree-ssa-threadupdate.h | 1 + 8 files changed, 78 insertions(+), 35 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c index eb663f2ff5b..9585ff11307 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c @@ -32,7 +32,8 @@ get_alias_set (t) } } -/* There should be exactly three IF conditionals if we thread jumps - properly. 
*/ -/* { dg-final { scan-tree-dump-times "if " 3 "dom2"} } */ +/* There should be exactly 4 IF conditionals if we thread jumps + properly. There used to be 3, but one thread was crossing + loops. */ +/* { dg-final { scan-tree-dump-times "if " 4 "dom2"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c index e1464e21170..922a331b217 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-dce2" } */ +/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread3" } */ extern int status, pt; extern int count; @@ -32,10 +32,15 @@ foo (int N, int c, int b, int *a) pt--; }
Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.
On 9/23/21 9:32 PM, Hongtao Liu wrote: On Thu, Sep 23, 2021 at 11:18 PM Martin Sebor wrote: On 9/23/21 12:30 AM, Richard Biener wrote: On Thu, 23 Sep 2021, Hongtao Liu wrote: On Thu, Sep 23, 2021 at 9:48 AM Hongtao Liu wrote: On Wed, Sep 22, 2021 at 10:21 PM Martin Sebor wrote: On 9/21/21 7:38 PM, Hongtao Liu wrote: On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor wrote: ... diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c index 1d79930cd58..9351f7e7a1a 100644 --- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c +++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c @@ -1,7 +1,7 @@ /* PR middle-end/91458 - inconsistent warning for writing past the end of an array member { dg-do compile } - { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */ + { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf -fno-tree-vectorize" } */ The testcase is large - what part requires this change? Given the testcase was added for inconsistent warnings do they now become inconsistent again as we enable vectorization at -O2? That said, the testcase adjustments need some explaining - I suppose you didn't just slap -fno-tree-vectorize to all of those changing behavior? void ga1_ (void) { a1_.a[0] = 0; a1_.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" } a1_.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" } struct A1 a; a.a[0] = 0; a.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" } a.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" } sink (&a); } It's supposed to be 2 warning for a.a[1] = 1 and a.a[2] = 1 since there are 2 accesses, but after enabling vectorization, there's only one access, so one warning is missing which causes the failure. With the stores vectorized, is the warning on the correct line or does it point to the first store, the one that's in bounds, as it does with -O3? The latter would be a regression at -O2. 
For the upper case, It points to the second store which is out of bounds, the third store warning is missing. I would find it preferable to change the test code over disabling optimizations that are on by default. My concern is that the test would no longer exercise the default behavior. (The same goes for the -fno-ipa-icf option.) Hmm, it's a middle-end test, for some backend, it may not do vectorization(it depends on TARGET_VECTOR_MODE_SUPPORTED_P and relative cost model). Yes, there are quite a few warning tests like that. Their main purpose is to verify that in common GCC invocations (i.e., without any special options) warnings are a) issued when expected and b) not issued when not expected. Otherwise, middle end warnings are known to have both false positives and false negatives in some invocations, depending on what optimizations are in effect. Indiscriminately disabling common optimizations for these large tests and invoking them under artificial conditions would compromise this goal and hide the problems. If enabling vectorization at -O2 causes regressions in the quality of diagnostics (as the test failure above indicates seems to be happening) we should investigate these and open bugs for them so they can be fixed. We can then tweak the specific failing test cases to avoid the failures until they are fixed. There are indeed cases of false positives and false negatives .i.e. // Verify warning for access to a definition with an initializer that // initializes the one-element array member. struct A1 a1i_1 = { 0, { 1 } }; void ga1i_1 (void) { a1i_1.a[0] = 0; a1i_1.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" } a1i_1.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" } struct A1 a = { 0, { 1 } }; --- false positive here. a.a[0] = 1; a.a[1] = 2; // { dg-warning "\\\[-Wstringop-overflow" } false negative here. a.a[2] = 3; // { dg-warning "\\\[-Wstringop-overflow" } false negative here. sink (&a); } Similar for * gcc.dg/Warray-bounds-51.c. 
* gcc.dg/Warray-parameter-3.c
* gcc.dg/Wstringop-overflow-14.c
* gcc.dg/Wstringop-overflow-21.c

So there are 3 situations:
1. All accesses are out of bounds, and after vectorization some warnings are missing.
2. Some accesses are in bounds and some are out of bounds, and after vectorization the warning moves from the out-of-bounds line to the in-bounds line.
3. All accesses are out of bounds, and after vectorization all warnings are missing and one appears on a false-positive line.

My mistake, there's no case 3, just case 1 and case 2.

So I'm going to install the patch, ok?

Please don't add the -fno- option to the warning tests. As I said, I would prefer to either suppress the vectorization for the failing cases by tweaking the test code or xfail them. That way future regressions won't be masked by
PING [PATCH] warn for more impossible null pointer tests [PR102103]
Ping: Jeff, with the C++ part approved, can you please confirm your approval of the C parts of the patch?

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html

On 9/21/21 6:34 PM, Martin Sebor wrote:
On 9/21/21 3:40 PM, Jason Merrill wrote:
The C++ changes are OK.

Jeff, should I take your previous "Generally OK" as an approval for the rest of the patch as well? (It has not changed in v2.)

I have just submitted a Glibc patch to suppress the new instances there.

Martin
[committed] libstdc++: Remove redundant 'inline' specifiers
These functions are constexpr, which means they are implicitly inline.

Signed-off-by: Jonathan Wakely

libstdc++-v3/ChangeLog:

    * include/bits/range_access.h (cbegin, cend): Remove redundant
    'inline' specifier.

Tested x86_64-linux. Committed to trunk.

commit 9b11107ed72ca543af41dbb3226e16b61d31b098
Author: Jonathan Wakely
Date:   Fri Sep 24 11:30:59 2021

    libstdc++: Remove redundant 'inline' specifiers

    These functions are constexpr, which means they are implicitly inline.

    Signed-off-by: Jonathan Wakely

    libstdc++-v3/ChangeLog:

        * include/bits/range_access.h (cbegin, cend): Remove redundant
        'inline' specifier.

diff --git a/libstdc++-v3/include/bits/range_access.h b/libstdc++-v3/include/bits/range_access.h
index ab2d4f8652c..3dec687dd94 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -122,7 +122,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    */
   template<typename _Container>
     [[__nodiscard__]]
-    inline constexpr auto
+    constexpr auto
     cbegin(const _Container& __cont) noexcept(noexcept(std::begin(__cont)))
       -> decltype(std::begin(__cont))
     { return std::begin(__cont); }
@@ -134,7 +134,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    */
   template<typename _Container>
     [[__nodiscard__]]
-    inline constexpr auto
+    constexpr auto
     cend(const _Container& __cont) noexcept(noexcept(std::end(__cont)))
       -> decltype(std::end(__cont))
     { return std::end(__cont); }
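To illustrate why the specifier is redundant, here is a minimal sketch (not the libstdc++ code; `demo::cbegin_like` and `values` are hypothetical names chosen for this example). A constexpr function is implicitly inline, so it can be defined in a header without writing `inline`. The sketch assumes C++17 or later, where `std::begin` and `std::array::begin` are usable in constant expressions.

```cpp
#include <array>
#include <iterator>

namespace demo {

// Hypothetical stand-in for std::cbegin; this is not the libstdc++ definition.
// No 'inline' keyword: a constexpr function is implicitly inline.
template<typename Container>
[[nodiscard]] constexpr auto
cbegin_like(const Container& cont) noexcept(noexcept(std::begin(cont)))
  -> decltype(std::begin(cont))
{ return std::begin(cont); }

} // namespace demo

// The function is usable in a constant expression.
constexpr std::array<int, 3> values{{1, 2, 3}};
static_assert(*demo::cbegin_like(values) == 1, "first element expected");
```

Because the function is implicitly inline, the same definition may appear in several translation units without violating the one-definition rule, which is the property the removed keyword used to spell out.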
[COMMITTED] path solver: Avoid further lookups when range is defined in block.
If an SSA is defined in the current block, there is no need to query range_on_path_entry for additional information.

Tested on x86-64 Linux.

gcc/ChangeLog:

    * gimple-range-path.cc (path_range_query::path_range_query): Move
    debugging header...
    (path_range_query::precompute_ranges): ...here.
    (path_range_query::internal_range_of_expr): Do not call
    range_on_path_entry if NAME is defined in the current block.
---
 gcc/gimple-range-path.cc | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index d9704c8f86b..0738a5ca159 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -39,9 +39,6 @@ along with GCC; see the file COPYING3.  If not see
 path_range_query::path_range_query (gimple_ranger &ranger, bool resolve)
   : m_ranger (ranger)
 {
-  if (DEBUG_SOLVER)
-    fprintf (dump_file, "\n*** path_range_query **\n");
-
   m_cache = new ssa_global_cache;
   m_has_cache_entry = BITMAP_ALLOC (NULL);
   m_path = NULL;
@@ -173,9 +170,6 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   if (TREE_CODE (name) == SSA_NAME)
     r.intersect (gimple_range_global (name));
 
-  if (m_resolve && r.varying_p ())
-    range_on_path_entry (r, name);
-
   set_cache (r, name);
   return true;
 }
@@ -467,6 +461,9 @@ void
 path_range_query::precompute_ranges (const vec &path,
                                      const bitmap_head *imports)
 {
+  if (DEBUG_SOLVER)
+    fprintf (dump_file, "\n*** path_range_query **\n");
+
   set_path (path);
   bitmap_copy (m_imports, imports);
   m_undefined_path = false;
--
2.31.1
[PATCH] rs6000: Fix vec_cpsgn parameter order (PR101985)
Hi!

This fixes a bug reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101985.

The vec_cpsgn built-in function API differs in argument order from the copysign<mode>3 convention. Currently that pattern is incorrectly used to implement vec_cpsgn. Fix that while leaving the existing pattern in place to implement copysignf for vector modes.

Part of the fix when using the new built-in support requires an adjustment to a pending patch that replaces much of altivec.h with an automatically generated file. So that adjustment will be coming later...

Also fix a bug in the new built-in overload infrastructure where we were using the VSX form of the VEC_COPYSIGN built-in when we should default to the VMX form.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Is this okay for trunk?

Thanks!
Bill

2021-09-24  Bill Schmidt

gcc/
    PR target/101985
    * config/rs6000/altivec.h (vec_cpsgn): Adjust.
    * config/rs6000/rs6000-overload.def (VEC_COPYSIGN): Use SKIP to
    avoid generating an automatic #define of vec_cpsgn.  Use the
    correct built-in for V4SFmode that doesn't depend on VSX.

gcc/testsuite/
    PR target/101985
    * gcc.target/powerpc/pr101985.c: New.
---
 gcc/config/rs6000/altivec.h                   |  2 +-
 gcc/config/rs6000/rs6000-overload.def         |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr101985-1.c | 18 ++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr101985-2.c | 18 ++++++++++++++++++
 4 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr101985-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr101985-2.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 5b631c7ebaf..ea72c9c1789 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -129,7 +129,7 @@
 #define vec_vcfux __builtin_vec_vcfux
 #define vec_cts __builtin_vec_cts
 #define vec_ctu __builtin_vec_ctu
-#define vec_cpsgn __builtin_vec_copysign
+#define vec_cpsgn(x,y) __builtin_vec_copysign(y,x)
 #define vec_double __builtin_vec_double
 #define vec_doublee __builtin_vec_doublee
 #define vec_doubleo __builtin_vec_doubleo
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 141f831e2c0..4f583312f36 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -1154,9 +1154,9 @@
   vus __builtin_vec_convert_4f32_8f16 (vf, vf);
     CONVERT_4F32_8F16
 
-[VEC_COPYSIGN, vec_cpsgn, __builtin_vec_copysign]
+[VEC_COPYSIGN, SKIP, __builtin_vec_copysign]
   vf __builtin_vec_copysign (vf, vf);
-    CPSGNSP
+    COPYSIGN_V4SF
   vd __builtin_vec_copysign (vd, vd);
     CPSGNDP
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101985-1.c b/gcc/testsuite/gcc.target/powerpc/pr101985-1.c
new file mode 100644
index 000..a1ec2d68d53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101985-1.c
@@ -0,0 +1,18 @@
+/* PR target/101985 */
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2" } */
+
+#include <altivec.h>
+
+int
+main (void)
+{
+  vector float a = {  1,  2, - 3, - 4};
+  vector float b = {-10, 20, -30,  40};
+  vector float c = { 10, 20, -30, -40};
+  a = vec_cpsgn (a, b);
+  if (! vec_all_eq (a, c))
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101985-2.c b/gcc/testsuite/gcc.target/powerpc/pr101985-2.c
new file mode 100644
index 000..71cc254c170
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101985-2.c
@@ -0,0 +1,18 @@
+/* PR target/101985 */
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2" } */
+
+#include <altivec.h>
+
+int
+main (void)
+{
+  vector double a = {  1,  -4};
+  vector double b = { -10, 40};
+  vector double c = {  10, -40};
+  a = vec_cpsgn (a, b);
+  if (! vec_all_eq (a, c))
+    __builtin_abort ();
+  return 0;
+}
--
2.27.0
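The argument order that the new tests expect can be modelled in portable code. The following is only a sketch: `cpsgn_model` is a hypothetical helper operating on `std::array`, not the AltiVec built-in, and it assumes the semantics implied by the tests above (the sign comes from the first argument, the magnitude from the second), which is the reverse of the operand order of `std::copysign (magnitude, sign)`.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical element-wise model of vec_cpsgn; NOT the real built-in.
// Each result element takes its sign from sign_src (first argument) and
// its magnitude from mag_src (second argument), so the operands are
// swapped relative to std::copysign (magnitude, sign).
template<typename T, std::size_t N>
std::array<T, N>
cpsgn_model(const std::array<T, N>& sign_src, const std::array<T, N>& mag_src)
{
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = std::copysign(mag_src[i], sign_src[i]);
  return result;
}
```

Feeding this model the `a` and `b` vectors from pr101985-1.c yields the expected `c` vector, which mirrors the swap performed by the `vec_cpsgn(x,y)` macro in the altivec.h hunk.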
[pushed] IRA: Make profitability calculation of RA conflict presentations independent of host compiler type sizes [PR102147]
The following patch solves

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102147

The patch was successfully bootstrapped and tested on x86-64.

commit ec4c30b64942e615b4bb4b9761cd3b2635158608 (HEAD -> master)
Author: Vladimir N. Makarov
Date:   Fri Sep 24 10:06:45 2021 -0400

    Make profitability calculation of RA conflict presentations independent of host compiler type sizes. [PR102147]

    gcc/ChangeLog:

    2021-09-24  Vladimir Makarov

        PR rtl-optimization/102147
        * ira-build.c (ira_conflict_vector_profitable_p): Make
        profitability calculation independent of host compiler pointer and
        IRA_INT_BITS sizes.

diff --git a/gcc/ira-build.c b/gcc/ira-build.c
index 42120656366..2a30efc4f2f 100644
--- a/gcc/ira-build.c
+++ b/gcc/ira-build.c
@@ -629,7 +629,7 @@ ior_hard_reg_conflicts (ira_allocno_t a, const_hard_reg_set set)
 bool
 ira_conflict_vector_profitable_p (ira_object_t obj, int num)
 {
-  int nw;
+  int nbytes;
   int max = OBJECT_MAX (obj);
   int min = OBJECT_MIN (obj);
 
@@ -638,9 +638,14 @@ ira_conflict_vector_profitable_p (ira_object_t obj, int num)
        in allocation.  */
     return false;
 
-  nw = (max - min + IRA_INT_BITS) / IRA_INT_BITS;
-  return (2 * sizeof (ira_object_t) * (num + 1)
-          < 3 * nw * sizeof (IRA_INT_TYPE));
+  nbytes = (max - min) / 8 + 1;
+  STATIC_ASSERT (sizeof (ira_object_t) <= 8);
+  /* Don't use sizeof (ira_object_t), use constant 8.  Size of ira_object_t (a
+     pointer) is different on 32-bit and 64-bit targets.  Usage sizeof
+     (ira_object_t) can result in different code generation by GCC built as 32-
+     and 64-bit program.  In any case the profitability is just an estimation
+     and border cases are rare.  */
+  return (2 * 8 /* sizeof (ira_object_t) */ * (num + 1) < 3 * nbytes);
 }
 
 /* Allocates and initialize the conflict vector of OBJ for NUM
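The effect of the new formula can be tried out with a small standalone model. This is a sketch under assumptions, not the IRA code: `conflict_vector_profitable_model` is a hypothetical name, the range is passed as plain `min`/`max` integers instead of being read from an `ira_object_t`, and the early return for an empty range is assumed from the partially visible patch context.

```cpp
// Simplified model of the patched profitability check.
// Assumption: min/max are passed directly instead of via an ira_object_t.
bool conflict_vector_profitable_model(int min, int max, int num)
{
  // Assumed from the patch context: an empty range prefers the bit vector.
  if (max < min)
    return false;

  // Bytes a bit vector covering [min, max] would need.
  int nbytes = (max - min) / 8 + 1;

  // 8 is a fixed stand-in for sizeof (ira_object_t), so the answer no
  // longer depends on whether the host compiler is 32-bit or 64-bit.
  return 2 * 8 * (num + 1) < 3 * nbytes;
}
```

For a range of 800 object IDs (`min = 0`, `max = 799`) the bit vector needs 100 bytes, so the model prefers a conflict vector for up to 17 conflicts (16 * 18 = 288 < 300) and a bit vector from 18 conflicts on (16 * 19 = 304), regardless of the host pointer size.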
[PATCH] Replace VRP threader with a hybrid forward threader.
This patch implements the new hybrid forward threader and replaces the embedded VRP threader with it.

With all the pieces that have gone in, the implementation of the hybrid threader is straightforward: convert the current state into SSA imports that the solver will understand, and let the path solver precompute ranges and relations for the path. After this setup is done, we can use the range_query API to solve gimple statements in the threader. The forward threader is now engine agnostic so there are no changes to the threader per se.

I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP, because they will also be used in the evrp removal of the DOM/threader, which is my next task.

Most of the patch is actually test changes. I have gone through every single one and verified that we're correct. Most were trivial dump file name changes, but others required going through the IL and certifying that the different IL was expected.

For example, in pr59597.c, we have one less thread because the ASSERT_EXPR was getting in the way, and making it seem like things were not crossing loops. The hybrid threader sees the correct representation of the IL, and avoids threading this one case.

The final numbers are a 12.16% improvement in jump threads immediately after VRP, and a 0.82% improvement in overall jump threads. The performance drop is 0.6% (plus the 1.43% hit from moving the embedded threader into its own pass). As I've said, I'd prefer to keep the threader in its own pass, but if this is an issue, we can address this with a shared ranger when VRP is replaced with an evrp instance (upcoming).

Note that these numbers are slightly different from what I originally posted. A few correctness tweaks, plus restricting loop threads, made the difference. That being said, I was aiming for par. A 12% gain is just gravy ;-). When we merge the threaders, we should see even better numbers-- and we'll have the benefit of an entire release stress testing the solver.
As I mentioned in my introductory note, paths ending in MEM_REF conditional are missing. In reality, this didn't make a difference, as it was so rare. However, as a follow-up, I will distill a test and add a suitable PR to keep us honest.

There is a one-line change to libgomp/team.c silencing a new used uninitialized warning. As my previous work with the threaders has shown, warnings flare up after each improvement to jump threading. I expect this to be no different. I've promised Jakub to investigate fully, so I will analyze and add the appropriate PR for the warning experts.

Oh yeah, the new pass dump is called vrp-threader[12] to match each VRP[12] pass. However, there's no reason for it to either be named vrp-threader, or for it to live in tree-vrp.c.

Tested on x86-64 Linux.

OK?

p.s. "Did I say 5 weeks? My bad, I meant 5 months."

gcc/ChangeLog:

    * passes.def (pass_vrp_threader): New.
    * tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader.
    * tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New.
    (hybrid_jt_simplifier::hybrid_jt_simplifier): New.
    (hybrid_jt_simplifier::simplify): New.
    (hybrid_jt_simplifier::compute_ranges_from_state): New.
    * tree-ssa-threadedge.h (class hybrid_jt_state): New.
    (class hybrid_jt_simplifier): New.
    * tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump threader.
    (class hybrid_threader): New.
    (hybrid_threader::hybrid_threader): New.
    (hybrid_threader::~hybrid_threader): New.
    (hybrid_threader::before_dom_children): New.
    (hybrid_threader::after_dom_children): New.
    (execute_vrp_threader): New.
    (class pass_vrp_threader): New.
    (make_pass_vrp_threader): New.

libgomp/ChangeLog:

    * team.c: Initialize start_data.
    * testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
    * testsuite/libgomp.graphite/force-parallel-8.c: Adjust.

gcc/testsuite/ChangeLog:

    * gcc.dg/torture/pr55107.c: Adjust.
    * gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust.
    * gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust.
    * gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust.
    * gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
    * gcc.dg/tree-ssa/pr21559.c: Adjust.
    * gcc.dg/tree-ssa/pr59597.c: Adjust.
    * gcc.dg/tree-ssa/pr61839_1.c: Adjust.
    * gcc.dg/tree-ssa/pr61839_3.c: Adjust.
    * gcc.dg/tree-ssa/pr71437.c: Adjust.
    * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust.
    * gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust.
    * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
    * gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust.
    * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust.
    * gcc.dg/tree-ssa/ssa-thread-14.c: Adjust.
    * gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust.
    * gcc.dg/tree-ssa/vrp106.c: Adjust.
    * gcc.dg/tree-ssa/vrp55.c: Adjust.
---
 gcc/passes.def
Re: [PATCH] Avoid invalid loop transformations in jump threading registry.
On 9/24/2021 5:34 AM, Aldy Hernandez wrote:
On 9/23/21 6:10 PM, Jeff Law wrote:
On 9/23/2021 5:15 AM, Aldy Hernandez wrote:

My upcoming improvements to the forward jump threader make it thread more aggressively. In investigating some "regressions", I noticed that it has always allowed threading through empty latches and across loop boundaries. As we have discussed recently, this should be avoided until after loop optimizations have run their course.

Note that this wasn't much of a problem before because DOM/VRP couldn't find these opportunities, but with a smarter solver, we trip over them more easily.

We used to be much more aggressive in this space -- but we removed the equivalency tracking on backedges in the main part of DOM, which had the side effect of reducing the number of threads related to back edges in loops.

I thought we couldn't thread through back edges at all in the old threader, or are we talking about the same thing? We have a hard fail on backedge thread attempts for anything but the backward threader and its custom copier.

We used to have it in the distant past IIRC.

Jeff
Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.
On Fri, Sep 24, 2021 at 1:26 PM liuhongt wrote: > > Hi: > Related discussion in [1] and PR. > > Bootstrapped and regtest on x86_64-linux-gnu{-m32,}. > Ok for trunk? > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html > > gcc/ChangeLog: > > PR target/102464 > * config/i386/i386.c (ix86_optab_supported_p): > Return true for HFmode. > * match.pd: Simplify (_Float16) ceil ((double) x) to > __builtin_ceilf16 (a) when a is _Float16 type and > direct_internal_fn_supported_p. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr102464.c: New test. OK for x86 part. Thanks, Uros. > --- > gcc/config/i386/i386.c | 20 +++- > gcc/match.pd | 28 + > gcc/testsuite/gcc.target/i386/pr102464.c | 39 > 3 files changed, 79 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index ba89e111d28..3767fe9806d 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, > machine_mode, >return opt_type == OPTIMIZE_FOR_SPEED; > > case rint_optab: > - if (SSE_FLOAT_MODE_P (mode1) > - && TARGET_SSE_MATH > - && !flag_trapping_math > - && !TARGET_SSE4_1) > + if (mode1 == HFmode) > + return true; > + else if (SSE_FLOAT_MODE_P (mode1) > + && TARGET_SSE_MATH > + && !flag_trapping_math > + && !TARGET_SSE4_1) > return opt_type == OPTIMIZE_FOR_SPEED; >return true; > > case floor_optab: > case ceil_optab: > case btrunc_optab: > - if (SSE_FLOAT_MODE_P (mode1) > - && TARGET_SSE_MATH > - && !flag_trapping_math > - && TARGET_SSE4_1) > + if (mode1 == HFmode) > + return true; > + else if (SSE_FLOAT_MODE_P (mode1) > + && TARGET_SSE_MATH > + && !flag_trapping_math > + && TARGET_SSE4_1) > return true; >return opt_type == OPTIMIZE_FOR_SPEED; > > diff --git a/gcc/match.pd b/gcc/match.pd > index a9791ceb74a..9ccec8b6ce3 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -6191,6 +6191,34 @@ 
DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (froms (convert float_value_p@0)) > (convert (tos @0) > > +#if GIMPLE > +(match float16_value_p > + @0 > + (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node))) > +(for froms (BUILT_IN_TRUNCL BUILT_IN_TRUNC BUILT_IN_TRUNCF > + BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF > + BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF > + BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF > + BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF > + BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF > + BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF) > + tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC > + IFN_FLOOR IFN_FLOOR IFN_FLOOR > + IFN_CEIL IFN_CEIL IFN_CEIL > + IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN > + IFN_ROUND IFN_ROUND IFN_ROUND > + IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT > + IFN_RINT IFN_RINT IFN_RINT) > + /* (_Float16) round ((doube) x) -> __built_in_roundf16 (x), etc., > +if x is a _Float16. */ > + (simplify > + (convert (froms (convert float16_value_p@0))) > + (if (types_match (type, TREE_TYPE (@0)) > + && direct_internal_fn_supported_p (as_internal_fn (tos), > +type, OPTIMIZE_FOR_BOTH)) > + (tos @0 > +#endif > + > (for froms (XFLOORL XCEILL XROUNDL XRINTL) > tos (XFLOOR XCEIL XROUND XRINT) > /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double. */ > diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c > b/gcc/testsuite/gcc.target/i386/pr102464.c > new file mode 100644 > index 000..e3e060ee80b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr102464.c > @@ -0,0 +1,39 @@ > +/* PR target/102464. 
*/ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mavx512fp16" } */ > + > +#define FOO(FUNC,SUFFIX) \ > + _Float16 \ > + foo_##FUNC##_##SUFFIX (_Float16 a) \ > + {\ > +return __builtin_##FUNC##SUFFIX (a); \ > + } > + > +FOO (roundeven, f16); > +FOO (roundeven, f); > +FOO (roundeven, ); > +FOO (roundeven, l); > +FOO (trunc, f16); > +FOO (trunc, f); > +FOO (trunc, ); > +FOO (trunc, l); > +FOO (ceil, f16); > +FOO (ceil, f); > +FOO (ceil, ); > +FOO (ceil, l); > +FOO (floor, f16); > +FOO (floor, f); > +FOO (floor, ); > +FOO (floor, l); > +FOO (nearbyint, f16); > +FOO (nearbyint, f); > +FOO (nearbyint, ); > +FOO (nearbyint, l); > +FOO (rint, f16); > +FOO (rint, f); > +FOO (rint, ); > +
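The idea behind the match.pd pattern can be illustrated with a portable analogue. The sketch below uses `float` and `double` in place of `_Float16` and `double`, since `_Float16` is not available on every compiler and target; the function names are hypothetical. It shows the widened form the pattern matches and the direct form it would produce, which agree because rounding an exactly representable value to an integer gives the same result in either precision.

```cpp
#include <cmath>

// The widened form the pattern matches: convert up, round in the wider
// type, convert back.  float stands in for _Float16 here (an assumption
// made purely for portability of the example).
float ceil_via_double(float x)
{
  return static_cast<float>(std::ceil(static_cast<double>(x)));
}

// The direct form the simplification would produce: round in the
// narrow type without the two conversions.
float ceil_direct(float x)
{
  return std::ceil(x);  // selects the float overload
}
```

In the patch itself the rewrite is additionally guarded by `direct_internal_fn_supported_p`, so it only fires when the target can perform the rounding natively in the narrow mode.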
[PING][PATCH] libgcc, emutls: Allow building weak definitions of the emutls functions.
Hi, as noted below the non-Darwin parts of this are trivial (and a no-OP). I’d like to apply this to start work towards solving Darwin’s libgcc issues, OTOH, the two raised questions remain… thanks Iain > On 20 Sep 2021, at 09:25, Iain Sandoe wrote: > > Hi, > > The non-Darwin part of this patch is trivial but raises a couple of questions > > A/ > We define builtins to support emulated TLS. > These are defined with void * pointers > The implementation (in libgcc) uses the correct type (struct __emutls_object > *) > in both a forward declaration of the functions and in thier eventual > implementation. > > This leads to a (long-standing, nothing new) complaint at build-time about > the mismatch in the builtin/implementation decls. > > AFAICT, there’s no way to fix that unless we introduce struct __emutls_object > * > as a built-in type? > > B/ > It seems that a consequence of the mismatch in decls means that if I apply > attributes to the decl (in the implementation file), they are ignored and I > have > to apply them to the definition in order for this to work. > > This (B) is what the patch below does. > > tested on powerpc,i686,x86_64-darwin, x86_64-linux > OK for master? > thanks, > Iain > > If the current situation is that A or B indicates “there’s a bug”, please > could that > be considered as distinct from the current patch (which doesn’t alter this in > any > way) so that we can make progress on fixing Darwin libgcc issues. > > = commit log > > In order to better support use of the emulated TLS between objects with > DSO dependencies and static-linked libgcc, allow a target to make weak > definitions. > > Signed-off-by: Iain Sandoe > > libgcc/ChangeLog: > > * config.host: Add weak-defined emutls crt. > * config/t-darwin: Build weak-defined emutls objects. > * emutls.c (__emutls_get_address): Add optional attributes. > (__emutls_register_common): Likewise. > (EMUTLS_ATTR): New. 
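The attribute hook described in the commit log follows a common pattern: a macro that expands to nothing by default and that a target can override from the build line. The sketch below shows that pattern in isolation; `DEMO_ATTR` and `demo_get_value` are hypothetical names for illustration, not part of libgcc, and the override shown in the comment assumes a compiler that accepts GNU attribute syntax.

```cpp
// DEMO_ATTR mirrors the EMUTLS_ATTR idea: empty unless the build line
// supplies a definition, for example
//   -DDEMO_ATTR='__attribute__((__weak__,__visibility__("default")))'
#ifndef DEMO_ATTR
# define DEMO_ATTR
#endif

// Hypothetical function standing in for __emutls_get_address.
DEMO_ATTR int demo_get_value (int key);

// The attribute is repeated on the definition as well, matching the
// approach the patch takes for the emutls entry points.
DEMO_ATTR int
demo_get_value (int key)
{
  return key + 1;
}
```

With the default empty definition the code builds exactly as before, which is why the non-Darwin part of the change is a no-op.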
> --- > libgcc/config.host | 2 +- > libgcc/config/t-darwin | 13 + > libgcc/emutls.c| 17 +++-- > 3 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/libgcc/config.host b/libgcc/config.host > index 6c34b13d611..a447ac7ae30 100644 > --- a/libgcc/config.host > +++ b/libgcc/config.host > @@ -215,7 +215,7 @@ case ${host} in > *-*-darwin*) > asm_hidden_op=.private_extern > tmake_file="$tmake_file t-darwin ${cpu_type}/t-darwin t-libgcc-pic > t-slibgcc-darwin" > - extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o" > + extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o libemutls_w.a" > ;; > *-*-dragonfly*) > tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic t-eh-dw2-dip" > diff --git a/libgcc/config/t-darwin b/libgcc/config/t-darwin > index 14ae6b35a4e..d6f688d66d5 100644 > --- a/libgcc/config/t-darwin > +++ b/libgcc/config/t-darwin > @@ -15,6 +15,19 @@ crttme.o: $(srcdir)/config/darwin-crt-tm.c > LIB2ADDEH = $(srcdir)/unwind-dw2.c $(srcdir)/config/unwind-dw2-fde-darwin.c \ > $(srcdir)/unwind-sjlj.c $(srcdir)/unwind-c.c > > +# Make emutls weak so that we can deal with -static-libgcc, override the > +# hidden visibility when this is present in libgcc_eh. > +emutls.o: HOST_LIBGCC2_CFLAGS += \ > + -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))' > +emutls_s.o: HOST_LIBGCC2_CFLAGS += \ > + -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))' > + > +# Make the emutls crt as a convenience lib so that it can be linked > +# optionally, use the shared version so that we can link with DSO. > +libemutls_w.a: emutls_s.o > + $(AR_CREATE_FOR_TARGET) $@ $< > + $(RANLIB_FOR_TARGET) $@ > + > # Patch to __Unwind_Find_Enclosing_Function for Darwin10. 
> d10-uwfef.o: $(srcdir)/config/darwin10-unwind-find-enc-func.c > $(crt_compile) -mmacosx-version-min=10.6 -c $< > diff --git a/libgcc/emutls.c b/libgcc/emutls.c > index ed2658170f5..d553a74728f 100644 > --- a/libgcc/emutls.c > +++ b/libgcc/emutls.c > @@ -50,7 +50,16 @@ struct __emutls_array > void **data[]; > }; > > +/* EMUTLS_ATTR is provided to allow targets to build the emulated tls > + routines as weak definitions, for example. > + If there is no definition, fall back to the default. */ > +#ifndef EMUTLS_ATTR > +# define EMUTLS_ATTR > +#endif > + > +EMUTLS_ATTR > void *__emutls_get_address (struct __emutls_object *); > +EMUTLS_ATTR > void __emutls_register_common (struct __emutls_object *, word, word, void *); > > #ifdef __GTHREADS > @@ -123,7 +132,11 @@ emutls_alloc (struct __emutls_object *obj) > return ret; > } > > -void * > +/* Despite applying the attribute to the declaration, in this case the mis- > + match between the builtin's declaration [void * (*)(void *)] and the > + implementation here, causes the decl. attributes to be discarded. */ > + > +EMUTLS_ATTR void * > __emutls_get_address (struct __emutls_object *obj) > { > if (! __gthread_active_p ()) > @@ -187,7 +200,7 @@ __emutls_get_address (stru
Re: [PATCH] top-level: merge Makefile.def patches from binutils-gdb repository
* Richard Biener [2021-09-24 13:58:20 +0200]: > On Fri, Sep 24, 2021 at 12:49 PM Andrew Burgess > wrote: > > > > This commit back-ports two patches to Makefile.def from the > > binutils-gdb repository, these patches were committed over there > > without first being merged in to the gcc repository. > > > > These commits all relate to dependencies for binutils-gdb modules, so > > should have no impact on gcc, I tested a gcc build/install on x86-64 > > GNU/Linux, and everything looked OK. > > > > The two patches being backported are binutils-gdb commits: > > > > commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues) > > Date: Mon Oct 12 16:04:32 2020 +0100 > > > > gdb/gdbserver: add dependencies for distclean-gnulib > > > > And > > > > commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c > > Date: Thu Mar 18 12:37:52 2021 + > > > > Add install dependencies for ld -> bfd and libctf -> bfd > > > > OK to merge? > > OK. Thanks, I pushed this patch. Andrew
Re: [PATCH] Allow different vector types for stmt groups
Richard Biener writes: > This allows vectorization (in practice non-loop vectorization) to > have a stmt participate in different vector type vectorizations. > It allows us to remove vect_update_shared_vectype and replace it > by pushing/popping STMT_VINFO_VECTYPE from SLP_TREE_VECTYPE around > vect_analyze_stmt and vect_transform_stmt. > > For data-ref the situation is a bit more complicated since we > analyze alignment info with a specific vector type in mind which > doesn't play well when that changes. > > So the bulk of the change is passing down the actual vector type > used for a vectorized access to the various accessors of alignment > info, first and foremost dr_misalignment but also aligned_access_p, > known_alignment_for_access_p, vect_known_alignment_in_bytes and > vect_supportable_dr_alignment. I took the liberty to replace > ALL_CAPS macro accessors with the lower-case function invocations. > > The actual changes to the behavior are in dr_misalignment which now > is the place factoring in the negative step adjustment as well as > handling alignment queries for a vector type with bigger alignment > requirements than what we can (or have) analyze(d). > > vect_slp_analyze_node_alignment makes use of this and upon receiving > a vector type with a bigger alingment desire re-analyzes the DR > with respect to it but keeps an older more precise result if possible. > In this context it might be possible to do the analysis just once > but instead of analyzing with respect to a specific desired alignment > look for the biggest alignment we can compute a not unknown alignment. > > The ChangeLog includes the functional changes but not the bulk due > to the alignment accessor API changes - I hope that's something good. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, testing on SPEC > CPU 2017 in progress (for stats and correctness). > > Any comments? Sorry for the super-slow response, some comments below. 
> […] > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c > index a57700f2c1b..c42fc2fb272 100644 > --- a/gcc/tree-vect-data-refs.c > +++ b/gcc/tree-vect-data-refs.c > @@ -887,37 +887,53 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, > slp_instance instance) >return res; > } > > -/* Return the misalignment of DR_INFO. */ > +/* Return the misalignment of DR_INFO accessed in VECTYPE. */ > > int > -dr_misalignment (dr_vec_info *dr_info) > +dr_misalignment (dr_vec_info *dr_info, tree vectype) > { > + HOST_WIDE_INT diff = 0; > + /* Alignment is only analyzed for the first element of a DR group, > + use that but adjust misalignment by the offset of the access. */ >if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt)) > { >dr_vec_info *first_dr > = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt)); > - int misalign = first_dr->misalignment; > - gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED); > - if (misalign == DR_MISALIGNMENT_UNKNOWN) > - return misalign; >/* vect_analyze_data_ref_accesses guarantees that DR_INIT are >INTEGER_CSTs and the first element in the group has the lowest >address. Likewise vect_compute_data_ref_alignment will >have ensured that target_alignment is constant and otherwise >set misalign to DR_MISALIGNMENT_UNKNOWN. */ Can you move the second sentence down so that it stays with the to_constant? 
> - HOST_WIDE_INT diff = (TREE_INT_CST_LOW (DR_INIT (dr_info->dr)) > - - TREE_INT_CST_LOW (DR_INIT (first_dr->dr))); > + diff = (TREE_INT_CST_LOW (DR_INIT (dr_info->dr)) > + - TREE_INT_CST_LOW (DR_INIT (first_dr->dr))); >gcc_assert (diff >= 0); > - unsigned HOST_WIDE_INT target_alignment_c > - = first_dr->target_alignment.to_constant (); > - return (misalign + diff) % target_alignment_c; > + dr_info = first_dr; > } > - else > + > + int misalign = dr_info->misalignment; > + gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED); > + if (misalign == DR_MISALIGNMENT_UNKNOWN) > +return misalign; > + > + /* If the access is only aligned for a vector type with smaller alignment > + requirement the access has unknown misalignment. */ > + if (maybe_lt (dr_info->target_alignment * BITS_PER_UNIT, > + targetm.vectorize.preferred_vector_alignment (vectype))) > +return DR_MISALIGNMENT_UNKNOWN; > + > + /* If this is a backward running DR then first access in the larger > + vectype actually is N-1 elements before the address in the DR. > + Adjust misalign accordingly. */ > + if (tree_int_cst_sgn (DR_STEP (dr_info->dr)) < 0) > { > - int misalign = dr_info->misalignment; > - gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED); > - return misalign; > + if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ()) > + return DR_MISALIGNMENT_UNKNOWN; > + diff += ((TYPE_VECTOR_SUBPARTS (vectype).to_constant () - 1) > +
Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC
On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng wrote:
>
> On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu wrote:
> >
> > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak wrote:
> > >
> > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches wrote:
> > > >
> > > > PING^5 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > >
> > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng wrote:
> > > > >
> > > > > PING^4 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >
> > > > > One major design goal of PIE was to avoid copy relocations.
> > > > > The original patch for GCC 5 caused problems for many years.
> > > > >
> > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng wrote:
> > > > >>
> > > > >> PING^3 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >>
> > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng wrote:
> > > > >> >
> > > > >> > PING^2 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >> >
> > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng wrote:
> > > > >> > >
> > > > >> > > Ping https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >> > >
> > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song wrote:
> > > > >> > > >
> > > > >> > > > This was introduced in 2014-12 to use local binding for external symbols
> > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years which mostly
> > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, HAVE_LD_PIE_COPYRELOC
> > > > >> > > > should retire now.
> > > > >> > > >
> > > > >> > > > One design goal of -fPIE was to avoid copy relocations.
> > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal. With this change, the
> > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and other targets.
> > > > >> > > >
> > > > >> > > > ---
> > > > >> > > >
> > > > >> > > > See https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html for a list
> > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with protected
> > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) issues.
> > > > >> > > >
> > > > >> > > > If you prefer a longer write-up, see
> > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > >> > > > ---
> > > > >> > > >  gcc/config.in                              |  6 ---
> > > > >> > > >  gcc/config/i386/i386.c                     | 11 +---
> > > > >> > > >  gcc/configure                              | 52 ---
> > > > >> > > >  gcc/configure.ac                           | 48 -
> > > > >> > > >  gcc/doc/sourcebuild.texi                   |  3 --
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c     | 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c     | 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c     | 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c     | 17 --
> > > > >> > > >  gcc/testsuite/lib/target-supports.exp      | 47 -
> > > > >> > > >  10 files changed, 2 insertions(+), 224 deletions(-)
> > > > >> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> > > > >> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> > > > >> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> > > > >> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> > >
> > > From x86 maintainer's PoV, the implementation is trivially correct,
> > > but I have no idea about functionality. HJ, can you please review the
> > > functionality and post your opinion on the patch to move it forward?
> > >
> > > Thanks,
> > > Uros.
> >
> > I prefer to leave it alone and apply this:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html
> >
> > instead.  I am working to add a nodirect_extern_access attribute based
> > on feedback at LPC 2021.
>
> I think -fpie should be fixed as soon as possible.
>
> "Add -f[no-]direct-extern-access" says "-fdirect-extern-access is the default."
> IMHO this is not a good choice for -fpie.
> As the description of this patch says, one of the design goals of
> -fpie is to avoid copy relocations.
>
> > In executable and shared library, bind symbols with the STV_PROTECTED
> > visibility locally
>
> As I have repeated many times (also Clang's behavior), STV_PROTECTED
> visibility symbol should be bound locally regardless of
> -fno-direct-extern-access.
>
> I think it is fair to say all of Michael Matz, Alan Modra, and I think
> adding so many beh
Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC
On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng wrote: > > On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng wrote: > > > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu wrote: > > > > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak wrote: > > > > > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches > > > > wrote: > > > > > > > > > > PING^5 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng > > > > > wrote: > > > > > > > > > > > > PING^4 > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > > > One major design goal of PIE was to avoid copy relocations. > > > > > > The original patch for GCC 5 caused problems for many years. > > > > > > > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng > > > > > > wrote: > > > > > >> > > > > > >> PING^3 > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > >> > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng > > > > > >> wrote: > > > > > >> > > > > > > >> > PING^2 > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > >> > > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng > > > > > >> > wrote: > > > > > >> > > > > > > > >> > > Ping > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > >> > > > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song > > > > > >> > > wrote: > > > > > >> > > > > > > > > >> > > > This was introduced in 2014-12 to use local binding for > > > > > >> > > > external symbols > > > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years > > > > > >> > > > which mostly > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, > > > > > >> > > > HAVE_LD_PIE_COPYRELOC > > > > > >> > > > should retire now. > > > > > >> > > > > > > > > >> > > > One design goal of -fPIE was to avoid copy relocations. 
> > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal. With this > > > > > >> > > > change, the > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and other > > > > > >> > > > targets. > > > > > >> > > > > > > > > >> > > > --- > > > > > >> > > > > > > > > >> > > > See https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html > > > > > >> > > > for a list > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with > > > > > >> > > > protected > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) > > > > > >> > > > issues. > > > > > >> > > > > > > > > >> > > > If you prefer a longer write-up, see > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected > > > > > >> > > > --- > > > > > >> > > > gcc/config.in | 6 --- > > > > > >> > > > gcc/config/i386/i386.c| 11 +--- > > > > > >> > > > gcc/configure | 52 > > > > > >> > > > --- > > > > > >> > > > gcc/configure.ac | 48 > > > > > >> > > > - > > > > > >> > > > gcc/doc/sourcebuild.texi | 3 -- > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-1.c| 14 - > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-2.c| 14 - > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-3.c| 14 - > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-4.c| 17 -- > > > > > >> > > > gcc/testsuite/lib/target-supports.exp | 47 > > > > > >> > > > - > > > > > >> > > > 10 files changed, 2 insertions(+), 224 deletions(-) > > > > > >> > > > delete mode 100644 > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c > > > > > >> > > > delete mode 100644 > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c > > > > > >> > > > delete mode 100644 > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c > > > > > >> > > > delete mode 100644 > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c > > > > > > > > From x86 maintainer's PoV, the implementation is trivially correct, > > > > but I have 
no idea about functionality. HJ, can you please review the > > > > functionality and post your opinion on the patch to move it forward? > > > > > > > > Thanks, > > > > Uros. > > > > > > I prefer to leave it alone and apply this: > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html > > > > > > instead. I am working to add a nodirect_extern_access attribute based > > > on feedback at LPC 2021. > > > > I think -fpie should be fixed as soon as possible. > > > > "Add -f[no-]direct-extern-access" says "-fdirect-extern-access is the > > default." > > IMHO this is not a good choice for -fpie. > > As the description of this patch says, one of the design goals of > > -fpie is to avoid copy relocations. > > > > > In exec
[PATCH] Add a simulate_record_decl lang hook
This patch adds a lang hook for defining a struct/RECORD_TYPE
“as if” it had appeared directly in the source code.  It follows
the similar existing hook for enums.

It's the caller's responsibility to create the fields
(as FIELD_DECLs) but the hook's responsibility to create
and declare the associated RECORD_TYPE.

For now the hook is hard-coded to do the equivalent of:

  typedef struct NAME { FIELDS } NAME;

but this could be controlled by an extra parameter if some callers
want a different behaviour in future.

The motivating use case is to allow the long list of struct
definitions in arm_neon.h to be provided by the compiler,
which in turn unblocks various arm_neon.h optimisations.

Tested on aarch64-linux-gnu, individually and with a follow-on
patch from Jonathan that makes use of the hook.  OK to install?

Richard


gcc/
        * langhooks.h (lang_hooks_for_types::simulate_record_decl): New hook.
        * langhooks-def.h (lhd_simulate_record_decl): Declare.
        (LANG_HOOKS_SIMULATE_RECORD_DECL): Define.
        (LANG_HOOKS_FOR_TYPES_INITIALIZER): Include it.
        * langhooks.c (lhd_simulate_record_decl): New function.

gcc/c/
        * c-tree.h (c_simulate_record_decl): Declare.
        * c-objc-common.h (LANG_HOOKS_SIMULATE_RECORD_DECL): Override.
        * c-decl.c (c_simulate_record_decl): New function.

gcc/cp/
        * decl.c: Include langhooks-def.h.
        (cxx_simulate_record_decl): New function.
        * cp-objcp-common.h (cxx_simulate_record_decl): Declare.
        (LANG_HOOKS_SIMULATE_RECORD_DECL): Override.
---
 gcc/c/c-decl.c           | 31 +++
 gcc/c/c-objc-common.h    |  2 ++
 gcc/c/c-tree.h           |  2 ++
 gcc/cp/cp-objcp-common.h |  4
 gcc/cp/decl.c            | 38 ++
 gcc/langhooks-def.h      |  4
 gcc/langhooks.c          | 21 +
 gcc/langhooks.h          | 10 ++
 8 files changed, 112 insertions(+)

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 771efa3eadf..8d1324b118c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -9436,6 +9436,37 @@ c_simulate_enum_decl (location_t loc, const char *name,
   input_location = saved_loc;
   return enumtype;
 }
+
+/* Implement LANG_HOOKS_SIMULATE_RECORD_DECL.  */
+
+tree
+c_simulate_record_decl (location_t loc, const char *name,
+                        array_slice fields)
+{
+  location_t saved_loc = input_location;
+  input_location = loc;
+
+  class c_struct_parse_info *struct_info;
+  tree ident = get_identifier (name);
+  tree type = start_struct (loc, RECORD_TYPE, ident, &struct_info);
+
+  for (unsigned int i = 0; i < fields.size (); ++i)
+    {
+      DECL_FIELD_CONTEXT (fields[i]) = type;
+      if (i > 0)
+        DECL_CHAIN (fields[i - 1]) = fields[i];
+    }
+
+  finish_struct (loc, type, fields[0], NULL_TREE, struct_info);
+
+  tree decl = build_decl (loc, TYPE_DECL, ident, type);
+  TYPE_NAME (type) = decl;
+  TYPE_STUB_DECL (type) = decl;
+  lang_hooks.decls.pushdecl (decl);
+
+  input_location = saved_loc;
+  return type;
+}
 
 /* Create the FUNCTION_DECL for a function definition.
    DECLSPECS, DECLARATOR and ATTRIBUTES are the parts of
diff --git a/gcc/c/c-objc-common.h b/gcc/c/c-objc-common.h
index 7d35a0621e4..f4e8271f06c 100644
--- a/gcc/c/c-objc-common.h
+++ b/gcc/c/c-objc-common.h
@@ -81,6 +81,8 @@ along with GCC; see the file COPYING3.  If not see
 #undef LANG_HOOKS_SIMULATE_ENUM_DECL
 #define LANG_HOOKS_SIMULATE_ENUM_DECL c_simulate_enum_decl
+#undef LANG_HOOKS_SIMULATE_RECORD_DECL
+#define LANG_HOOKS_SIMULATE_RECORD_DECL c_simulate_record_decl
 #undef LANG_HOOKS_TYPE_FOR_MODE
 #define LANG_HOOKS_TYPE_FOR_MODE c_common_type_for_mode
 #undef LANG_HOOKS_TYPE_FOR_SIZE
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index d50d0cb7f2d..8578d2d1e77 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -598,6 +598,8 @@ extern tree finish_struct (location_t, tree, tree, tree,
                            class c_struct_parse_info *);
 extern tree c_simulate_enum_decl (location_t, const char *,
                                   vec *);
+extern tree c_simulate_record_decl (location_t, const char *,
+                                    array_slice);
 extern struct c_arg_info *build_arg_info (void);
 extern struct c_arg_info *get_parm_info (bool, tree);
 extern tree grokfield (location_t, struct c_declarator *,
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index f1704aad557..d5859406e8f 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -39,6 +39,8 @@ extern bool cp_handle_option (size_t, const char *, HOST_WIDE_INT, int,
 extern tree cxx_make_type_hook (tree_code);
 extern tree cxx_simulate_enum_decl (location_t, const char *,
                                     vec *);
+extern tree cxx_simulate_record_decl (location_t, const char *,
+                                      array_slice);
 
 /* Lang hooks that are shared between C++ and ObjC++ are defined here.  Hooks
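
For readers unfamiliar with GCC's tree representation, the field-linking loop in c_simulate_record_decl can be sketched in isolation. The sketch below is a toy model only: Field, Record, simulate_record and field_names are hypothetical names invented for illustration and are not GCC types or functions. The `context` and `chain` members merely play the roles that DECL_FIELD_CONTEXT and DECL_CHAIN play in the patch. Unlike the hook in the patch, which passes fields[0] unconditionally, the sketch assumes an empty field list should yield an empty record.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for GCC tree nodes; none of these are real GCC types.
struct Record;

struct Field {
  std::string name;
  Record *context = nullptr;  // plays the role of DECL_FIELD_CONTEXT
  Field *chain = nullptr;     // plays the role of DECL_CHAIN
};

struct Record {
  std::string name;
  Field *first_field = nullptr;  // head of the chained field list
};

// Attach caller-created fields to RECORD in the order given.
// Assumption of this sketch: an empty list simply leaves the record empty.
inline void simulate_record(Record &record, const std::vector<Field *> &fields) {
  for (std::size_t i = 0; i < fields.size(); ++i) {
    fields[i]->context = &record;
    if (i > 0)
      fields[i - 1]->chain = fields[i];
  }
  record.first_field = fields.empty() ? nullptr : fields[0];
}

// Walk the chain and collect the field names, to show the resulting order.
inline std::vector<std::string> field_names(const Record &record) {
  std::vector<std::string> names;
  for (const Field *f = record.first_field; f != nullptr; f = f->chain)
    names.push_back(f->name);
  return names;
}
```

The division of labour mirrors the description in the patch: the caller creates the fields, and the helper sets their owner and links them together before the record is considered complete.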
Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC
On Fri, Sep 24, 2021 at 10:41 AM H.J. Lu wrote: > > On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng wrote: > > > > On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng wrote: > > > > > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu wrote: > > > > > > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak wrote: > > > > > > > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches > > > > > wrote: > > > > > > > > > > > > PING^5 > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng > > > > > > wrote: > > > > > > > > > > > > > > PING^4 > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > > > > > One major design goal of PIE was to avoid copy relocations. > > > > > > > The original patch for GCC 5 caused problems for many years. > > > > > > > > > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng > > > > > > > wrote: > > > > > > >> > > > > > > >> PING^3 > > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > >> > > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng > > > > > > >> wrote: > > > > > > >> > > > > > > > >> > PING^2 > > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > >> > > > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng > > > > > > >> > wrote: > > > > > > >> > > > > > > > > >> > > Ping > > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > >> > > > > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song > > > > > > >> > > wrote: > > > > > > >> > > > > > > > > > >> > > > This was introduced in 2014-12 to use local binding for > > > > > > >> > > > external symbols > > > > > > >> > > > for -fPIE. Now that we have H.J. 
Lu's GOTPCRELX for years > > > > > > >> > > > which mostly > > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC > > > > > > >> > > > should retire now. > > > > > > >> > > > > > > > > > >> > > > One design goal of -fPIE was to avoid copy relocations. > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal. With > > > > > > >> > > > this change, the > > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and > > > > > > >> > > > other targets. > > > > > > >> > > > > > > > > > >> > > > --- > > > > > > >> > > > > > > > > > >> > > > See > > > > > > >> > > > https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html > > > > > > >> > > > for a list > > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with > > > > > > >> > > > protected > > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) > > > > > > >> > > > issues. > > > > > > >> > > > > > > > > > >> > > > If you prefer a longer write-up, see > > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected > > > > > > >> > > > --- > > > > > > >> > > > gcc/config.in | 6 --- > > > > > > >> > > > gcc/config/i386/i386.c| 11 +--- > > > > > > >> > > > gcc/configure | 52 > > > > > > >> > > > --- > > > > > > >> > > > gcc/configure.ac | 48 > > > > > > >> > > > - > > > > > > >> > > > gcc/doc/sourcebuild.texi | 3 -- > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-1.c| 14 - > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-2.c| 14 - > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-3.c| 14 - > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-4.c| 17 -- > > > > > > >> > > > gcc/testsuite/lib/target-supports.exp | 47 > > > > > > >> > > > - > > > > > > >> > > > 10 files changed, 2 insertions(+), 224 deletions(-) > > > > > > >> > > > delete mode 100644 > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c > > > > > > >> > > > delete 
mode 100644 > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c > > > > > > >> > > > delete mode 100644 > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c > > > > > > >> > > > delete mode 100644 > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c > > > > > > > > > > From x86 maintainer's PoV, the implementation is trivially correct, > > > > > but I have no idea about functionality. HJ, can you please review the > > > > > functionality and post your opinion on the patch to move it forward? > > > > > > > > > > Thanks, > > > > > Uros. > > > > > > > > I prefer to leave it alone and apply this: > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html > > > > > > > > instead. I am working to add a nodirect_extern_access attribute based > > > > on feedback at LPC 2021. > > > > > > I think -fpie sho
Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC
On Fri, Sep 24, 2021 at 11:14 AM Fāng-ruì Sòng wrote: > > On Fri, Sep 24, 2021 at 10:41 AM H.J. Lu wrote: > > > > On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng wrote: > > > > > > On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng wrote: > > > > > > > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu wrote: > > > > > > > > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak wrote: > > > > > > > > > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches > > > > > > wrote: > > > > > > > > > > > > > > PING^5 > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng > > > > > > > wrote: > > > > > > > > > > > > > > > > PING^4 > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > > > > > > > > > > One major design goal of PIE was to avoid copy relocations. > > > > > > > > The original patch for GCC 5 caused problems for many years. > > > > > > > > > > > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng > > > > > > > > wrote: > > > > > > > >> > > > > > > > >> PING^3 > > > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > >> > > > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng > > > > > > > >> wrote: > > > > > > > >> > > > > > > > > >> > PING^2 > > > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > >> > > > > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng > > > > > > > >> > wrote: > > > > > > > >> > > > > > > > > > >> > > Ping > > > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html > > > > > > > >> > > > > > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song > > > > > > > >> > > wrote: > > > > > > > >> > > > > > > > > > > >> > > > This was introduced in 2014-12 to use local binding for > > > > > > > >> > > > external symbols > > > > > > > >> > > > for -fPIE. Now that we have H.J. 
Lu's GOTPCRELX for > > > > > > > >> > > > years which mostly > > > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, > > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC > > > > > > > >> > > > should retire now. > > > > > > > >> > > > > > > > > > > >> > > > One design goal of -fPIE was to avoid copy relocations. > > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal. With > > > > > > > >> > > > this change, the > > > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and > > > > > > > >> > > > other targets. > > > > > > > >> > > > > > > > > > > >> > > > --- > > > > > > > >> > > > > > > > > > > >> > > > See > > > > > > > >> > > > https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html > > > > > > > >> > > > for a list > > > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with > > > > > > > >> > > > protected > > > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) > > > > > > > >> > > > issues. > > > > > > > >> > > > > > > > > > > >> > > > If you prefer a longer write-up, see > > > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected > > > > > > > >> > > > --- > > > > > > > >> > > > gcc/config.in | 6 --- > > > > > > > >> > > > gcc/config/i386/i386.c| 11 +--- > > > > > > > >> > > > gcc/configure | 52 > > > > > > > >> > > > --- > > > > > > > >> > > > gcc/configure.ac | 48 > > > > > > > >> > > > - > > > > > > > >> > > > gcc/doc/sourcebuild.texi | 3 -- > > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-1.c| 14 - > > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-2.c| 14 - > > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-3.c| 14 - > > > > > > > >> > > > .../gcc.target/i386/pie-copyrelocs-4.c| 17 > > > > > > > >> > > > -- > > > > > > > >> > > > gcc/testsuite/lib/target-supports.exp | 47 > > > > > > > >> > > > - > > > > > > > >> > > > 10 files changed, 2 insertions(+), 224 deletions(-) > > > > > > > >> > > > delete mode 
100644 > > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c > > > > > > > >> > > > delete mode 100644 > > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c > > > > > > > >> > > > delete mode 100644 > > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c > > > > > > > >> > > > delete mode 100644 > > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c > > > > > > > > > > > > From x86 maintainer's PoV, the implementation is trivially correct, > > > > > > but I have no idea about functionality. HJ, can you please review > > > > > > the > > > > > > functionality and post your opinion on the patch to move it forward? > > > > > > > > > > > > Thanks, > > > > >
Re: *PING* [PATCH] c++: fix cases of core1001/1322 by not dropping cv-qualifier of function parameter of type of typename or decltype[PR101402,PR102033,PR102034,PR102039,PR102044]
I already responded to this patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579527.html
Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
On 8/28/21 07:54, nick huang via Gcc-patches wrote:
> Reference with cv-qualifiers should be ignored instead of causing an error
> because standard accepts cv-qualified references introduced by typedef which
> is ignored.
> Therefore, the fix prevents GCC from reporting error by not setting variable
> "bad_quals" in case the reference is introduced by typedef. Still the
> cv-qualifier is silently ignored.
> Here I quote spec (https://timsong-cpp.github.io/cppwp/dcl.ref#1):
> "Cv-qualified references are ill-formed except when the cv-qualifiers
> are introduced through the use of a typedef-name ([dcl.typedef], [temp.param])
> or decltype-specifier ([dcl.type.decltype]), in which case the cv-qualifiers
> are ignored."
>
> PR c++/101783
>
> gcc/cp/ChangeLog:
>
> 2021-08-27  qingzhe huang
>
>         * tree.c (cp_build_qualified_type_real):

The git commit verifier rejects this commit message with

Checking 1fa0fbcdd15adf936ab4fae584f841beb35da1bb: FAILED
ERR: missing description of a change: " * tree.c (cp_build_qualified_type_real):"

(your initial patch had a description here, you just need to copy it over)

ERR: PR 101783 in subject but not in changelog: "c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]"

(the PR number needs to have a Tab before it)

In Jonathan's earlier reply he asked how you tested the patch; this
message still doesn't say anything about that.

https://gcc.gnu.org/contribute.html#testing

What is the legal status of your contributions?

https://gcc.gnu.org/contribute.html#legal

Existing code tries to handle this with the tf_ignore_bad_quals, but the
unnecessary use of typename gets past the code that tries to set the
flag.  But your approach is nice and straightforward, so let's go ahead
with it.

> gcc/testsuite/ChangeLog:
>
> 2021-08-27  qingzhe huang
>
>         * g++.dg/parse/pr101783.C: New test.
>
> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
> index 8840932dba2..7aa4318a574 100644
> --- a/gcc/cp/tree.c
> +++ b/gcc/cp/tree.c
> @@ -1356,12 +1356,22 @@ cp_build_qualified_type_real (tree type,
>    /* A reference or method type shall not be cv-qualified.
>       [dcl.ref], [dcl.fct].  This used to be an error, but as of DR 295
>       (in CD1) we always ignore extra cv-quals on functions.  */
> +
> +  /* PR 101783

Let's cite where this comes from in the standard ([dcl.ref]/1), and not
the PR number.

> +     Cv-qualified references are ill-formed except when the cv-qualifiers
> +     are introduced through the use of a typedef-name ([dcl.typedef],
> +     [temp.param]) or decltype-specifier ([dcl.type.decltype]),
> +     in which case the cv-qualifiers are ignored.
> +   */
>    if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE)
>        && (TYPE_REF_P (type)
>            || FUNC_OR_METHOD_TYPE_P (type)))
>      {
> -      if (TYPE_REF_P (type))
> +      // do NOT set bad_quals when non-method reference is introduced by typedef.
> +      if (TYPE_REF_P (type)
> +          && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type)))
>          bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
> +      // non-method reference introduced by typedef is also dropped silently

These two // comments seem redundant with the quote from the standard
above, let's drop them.

>        type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
>      }
>
> diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C b/gcc/testsuite/g++.dg/parse/pr101783.C
> new file mode 100644
> index 000..4e0a435dd0b
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/pr101783.C
> @@ -0,0 +1,5 @@
> +template struct A{
> +typedef T& Type;
> +};
> +template void f(const typename A::Type){}
> +template <> void f(const typename A::Type){}
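
As an illustration of the [dcl.ref]/1 rule quoted above, here is a minimal sketch (not part of the patch or its testcase) showing that a const applied through a typedef-name to a reference type is simply ignored. The names IntRef, Wrap, bump and bump_and_return are made up for this example; only standard <type_traits> facilities are used.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical names, for illustration only.
typedef int &IntRef;

template <class T> struct Wrap {
  typedef T &Type;
};

// const added through a typedef-name to a reference type is ignored,
// so both of these name plain 'int &'.
static_assert(std::is_same<const IntRef, int &>::value,
              "cv-qualifier on a typedef'd reference is ignored");
static_assert(std::is_same<const Wrap<int>::Type, int &>::value,
              "the same holds for a member typedef of a class template");

// The parameter type is 'int &', not 'const int &', so the callee may
// modify the caller's object.
inline void bump(const IntRef r) { ++r; }

inline int bump_and_return(int start) {
  int value = start;
  bump(value);
  return value;
}
```

The testcase in the patch exercises the dependent spelling of the same thing (a const-qualified `typename ...::Type` parameter in a function template), which is the form that reached the error path in cp_build_qualified_type_real.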
Re: [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070]
Hi Tobias,

> OK for mainline?

As promised on IRC, here's the review.

Maybe you can add a test case which shows that the call to the size
intrinsic really does not happen.

OK with that.

Thanks for the patch!

Best regards

        Thomas
[PATCH] [i386] Remove storage only description for _Float16 w/o avx512fp16.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580207.html

gcc/ChangeLog:

        * doc/extend.texi (Half-Precision): Remove storage only
        description for _Float16 w/o avx512fp16.
---
 gcc/doc/extend.texi | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 9501a60f20e..79fa1bd4bf8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1156,12 +1156,11 @@ It is recommended that portable code use the @code{_Float16} type defined
 by ISO/IEC TS 18661-3:2015.  @xref{Floating Types}.
 
 On x86 targets with SSE2 enabled, without @option{-mavx512fp16},
-@code{_Float16} type is storage only, all operations will be emulated by
-software emulation and the @code{float} instructions. The default behavior
-for @code{FLT_EVAL_METHOD} is to keep the intermediate result of the operation
-as 32-bit precision. This may lead to inconsistent behavior between software
-emulation and AVX512-FP16 instructions. Using @option{-fexcess-precision=16}
-will force round back after each operation.
+all operations will be emulated by software emulation and the @code{float}
+instructions. The default behavior for @code{FLT_EVAL_METHOD} is to keep the
+intermediate result of the operation as 32-bit precision. This may lead to
+inconsistent behavior between software emulation and AVX512-FP16 instructions.
+Using @option{-fexcess-precision=16} will force round back after each operation.
 
 Using @option{-mavx512fp16} will generate AVX512-FP16 instructions instead of
 software emulation. The default behavior of @code{FLT_EVAL_METHOD} is to round
-- 
2.27.0
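
The inconsistency the documentation text warns about (wider intermediates versus rounding after every operation) can be shown by analogy. The sketch below deliberately does not use _Float16, since its availability depends on the compiler version and target. Instead it uses float as the narrow type and double as the wider evaluation type, which is an assumption made purely for illustration. The function names are hypothetical, and the code assumes IEEE 754 round-to-nearest-even and no -ffast-math.

```cpp
#include <cassert>

// Round after every operation: each intermediate sum is stored back into the
// narrow type.  This is the analogue of -fexcess-precision=16 for _Float16.
inline float sum_round_each_step(float a, float b, float c) {
  volatile float t = a + b;  // volatile forces a store, hence a rounding
  volatile float r = t + c;
  return r;
}

// Keep intermediates in the wider type and round once at the end.  This is
// the analogue of keeping _Float16 intermediates in 32-bit precision.
inline float sum_round_at_end(float a, float b, float c) {
  double wide = static_cast<double>(a) + static_cast<double>(b)
                + static_cast<double>(c);
  return static_cast<float>(wide);
}
```

With a = 16777216.0f (2^24) and b = c = 1.0f, the first function yields 16777216.0f, because each addition of 1 is lost to rounding in float, while the second yields 16777218.0f, because the wide sum is exact and representable. The same kind of divergence is what the text describes between software emulation with excess precision and native AVX512-FP16 arithmetic.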