Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule
Hi Andrew, On 1/27/20 9:57 PM, Andrew Benson wrote: The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function name (plus the additional "_"'s that get prepended), but insufficient if a submodule name is included. The name then gets truncated and can lead to two different functions having the same (truncated) symbol name. The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for the submodule name plus the "." added between module and submodule names. I've attached a patch for this which includes a new test case for this PR. The patch regression tests cleanly. OK to commit? Can you also update the comment before the #define? It currently has: /* Mangled symbols take the form __module__name. */ -#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*2+4) +#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*3+5) Otherweise, it LGTM. Tobias PS: I wonder whether there are relevant systems which will fail because they do not handle that long symbol names...
Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names
On 1/28/20 12:41 AM, Andrew Benson wrote: The problem occurs in set_syms_host_assoc() where the "parent1" and "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This is insufficient when the parent names are a module+submodule name concatenated with a ".". The patch above fixes this by increasing their length to 2*GFC_MAX_SYMBOL_LEN+2. A patch to fix this is attached. The patch regression tests cleanly - ok to commit? The patch is okay, but can you add a comment – similar to the other patch – which makes clear why one needs twice the normal GFC_MAX_SYMBOL_LEN (+2)? You currently have: const char dot[2] = "."; - char parent1[GFC_MAX_SYMBOL_LEN + 1]; - char parent2[GFC_MAX_SYMBOL_LEN + 1]; + char parent1[2 * GFC_MAX_SYMBOL_LEN + 2]; + char parent2[2 * GFC_MAX_SYMBOL_LEN + 2]; Tobias
Re: [PATCH] forwprop: Tweak choice of VEC_PERM_EXPR filler [PR92822]
On Mon, 27 Jan 2020, Richard Sandiford wrote: > For the 2s failures in the PR, we have a V4SF VEC_PERM_EXPR in > which the first two elements are duplicates of one element and > the other two are don't-care: > > v4sf_b = VEC_PERM_EXPR ; > > The heuristic was to extend this with a blend: > > v4sf_b = VEC_PERM_EXPR ; > > but it seems better to extend a partial duplicate to a full duplicate: > > v4sf_b = VEC_PERM_EXPR ; > > Obviously this is still just a heuristic though. > > I wondered whether to restrict this to two elements or more > but couldn't find any examples in which it made a difference. > Either way should be fine for the purposes of fixing this PR. > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. Richard mentioned > in the PR that he had a different fix in mind, but since I'd tested > this overnight, I thought I might as well post it anyway as a possible > belt-and-braces fix. OK to install? OK. It's a sound heuristic - IIRC I played with some form of this (not sure if I tried to restrict it to all-same elts) and for some cases it was worse on x86. On x86 there's a splat but obviously not from a random vector lane so if one wants to use that it's an extract (move to lane zero) and then splat. Anyway, if we have particular cases that regress with this we want testsuite coverage. [not sure how difficult it is to cover all cases up to N elements with some scripting... looking over the assembly diff would be not fun in any case...] Thanks, Richard. > Richard > > > 2020-01-27 Richard Sandiford > > gcc/ > PR tree-optimization/92822 > * tree-ssa-forwprop.c (simplify_vector_constructor): When filling > out the don't-care elements of a vector whose significant elements > are duplicates, make the don't-care elements duplicates too. > --- > gcc/tree-ssa-forwprop.c | 22 -- > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c > index d63e87c8a5b..5203891950a 100644 > --- a/gcc/tree-ssa-forwprop.c > +++ b/gcc/tree-ssa-forwprop.c > @@ -2455,16 +2455,26 @@ simplify_vector_constructor (gimple_stmt_iterator > *gsi) >it and its source indexes to make the permutation supported. >For now it mimics a blend. */ >vec_perm_builder sel (refnelts, refnelts, 1); > + bool all_same_p = true; >for (i = 0; i < elts.length (); ++i) > - sel.quick_push (elts[i].second + elts[i].first * refnelts); > + { > + sel.quick_push (elts[i].second + elts[i].first * refnelts); > + all_same_p &= known_eq (sel[i], sel[0]); > + } >/* And fill the tail with "something". It's really don't care, > and ideally we'd allow VEC_PERM to have a smaller destination > - vector. As heuristic try to preserve a uniform orig[0] which > - facilitates later pattern-matching VEC_PERM_EXPR to a > - BIT_INSERT_EXPR. */ > + vector. As a heuristic: > + > + (a) if what we have so far duplicates a single element, make the > + tail do the same > + > + (b) otherwise preserve a uniform orig[0]. This facilitates > + later pattern-matching of VEC_PERM_EXPR to a BIT_INSERT_EXPR. */ >for (; i < refnelts; ++i) > - sel.quick_push ((elts[0].second == 0 && elts[0].first == 0 > - ? 0 : refnelts) + i); > + sel.quick_push (all_same_p > + ? sel[0] > + : (elts[0].second == 0 && elts[0].first == 0 > +? 0 : refnelts) + i); >vec_perm_indices indices (sel, orig[1] ? 2 : 1, refnelts); >if (!can_vec_perm_const_p (TYPE_MODE (perm_type), indices)) > return false;
Re: [PATCH 3/3] Check array contiguity for OpenACC/Fortran
(CC: fortran@ as it relates to Fortran.) Hi all, On 1/7/20 12:16 PM, Tobias Burnus wrote: in terms of the check, it looks fine to me – but I am not sure about the spec. * [OpenACC] Actually, I simply missed the bit (here: OpenACC 3; OpenACC 2.6 is same): “Any array or subarray in a data clause, including Fortran array pointers, must be a contiguous block of memory, except for dynamic multidimensional C arrays.” (2.7.1 under Restrictions). * OpenMP (quoting TR8): “If a list item is an array section, it must specify contiguous storage.” (2.22.7.1 map Clause under Restrictions). However, that one seems to miss the case: “non_cont_ptr => A(::2); !$omp ... map(non_cont_ptr)” as non_cont_ptr is noncontiguous and an array but not an array section. In any case, those are restrictions imposed on the user – which the compiler may or may not report. (A good one will, I presume. Although, one can also regard it as implementation defined/vendor extension as GCC will properly handle those – by mapping also the gaps.) Cheers, Tobias At least the following test case seems to work fine: integer :: A(10,10), out(12) A = reshape([(i, i=0,100)], shape(A)) !$omp target map(A(3:6,3:5), out) !$acc parallel copy(A(3:6,3:5), out) out = reshape(A(3:6,3:5), shape(out)) A(3:6,3:5) = 4 !$acc end parallel !$omp end target print '(4i3)', out print '(10i3)', A end The reason that it works is that the stride is included in the length calculation: #pragma omp target num_teams(1) thread_limit(0) map(tofrom:MEM[(c_char *)_6] [len: _5]) Has for the section (with A(3:6,3:5) -> parm(1:4,1:3)): parm.1.dim[0].lbound = 1; parm.1.dim[0].ubound = 4; parm.1.dim[0].stride = 1; parm.1.dim[1].lbound = 1; parm.1.dim[1].ubound = 3; parm.1.dim[1].stride = 10; And instead of doing '(4-1+1) * (3-1+1)' = 12 (i.e. multiplying the extends), the code does: 'stride * (3-1+1)' = 30. Cheers, Tobias PS: It also works if one puts the stride on the ptr, i.e.: integer,target :: A(10,10), out(12) integer, pointer :: ptr(:,:) A = reshape([(i, i=0,100)], shape(A)) ptr => A(3:6,3:5) !$omp target map(ptr, out) !$acc parallel copy(ptr, out) out = reshape(ptr, shape(out)) ptr = 4 !$acc end parallel !$omp end target print '(4i3)', out print '(10i3)', A end On 1/4/20 3:25 AM, Julian Brown wrote: This patch tightens up error checking for array references used in OpenACC clauses such that they must now be contiguous. I believe this matches up to the spec (as of 2.6). I've tried to make it so an error only triggers if the compiler is sure that a given array expression must be non-contiguous at compile time, although this errs on the side of potentially allowing the user to shoot themselves in the foot at runtime: at least one existing test in the testsuite appears to expect that behaviour. […]
* PING * Re: [OpenACC] bump version for 2.6 plus libgomp.texi update — document acc_attach/acc_detach, acc_*_async, acc_*_finalize(_async) functions
*PING* Those two patches bump the OpenACC version number from 2.0 (alias 201306) to OpenACC 2.6 (alias 201711). I believe except for bugs and known omissions (e.g. PR93225+93226), the OpenACC 2.6 support is complete. Additionally, it updates the documentation accordingly, no longer marks OpenACC as experimental and documents some missing run-time functions. * https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00600.html – main patch * https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01173.html – remove more 'experimental' status. OK? — Sandra was so kind and looke through the documentation, which look fine to here. Tobias On 1/20/20 10:39 PM, Sandra Loosemore wrote: On 1/20/20 3:08 AM, Tobias Burnus wrote: Hi Sandra, On 1/20/20 5:39 AM, Sandra Loosemore wrote: I happen to have noticed a couple weeks ago that this language about OpenACC support being experimental appears in multiple places in the gfortran manual, […] The same disclaimer for that option in the main GCC manual was removed years ago, so unless the Fortran support is much more broken than the C/C++ support, I think it ought to be removed from the Fortran manual as well. […] I concur. That would be the attached patch (on top of the previous patch* in this thread). This is good, thank you. -Sandra
Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221
> Ping! Eric, do you have any objections to reverting? See my comment posted in the audit trail of the TN on 01/20... -- Eric Botcazou
[PATCH] Fix 2 typos in documentation of libstdc++.
Hi. It's a simple documentation fix. I'm going to install the patch. Martin libstdc++-v3/ChangeLog: 2020-01-28 Martin Liska PR libstdc++/93478 * include/std/atomic: Fix typo. * include/std/optional: Likewise. --- libstdc++-v3/include/std/atomic | 2 +- libstdc++-v3/include/std/optional | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic index 66c62381e6b..40f23bdfc96 100644 --- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -174,7 +174,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /** * @brief Generic atomic type, primary class template. * - * @tparam _Tp Type to be made atomic, must be trivally copyable. + * @tparam _Tp Type to be made atomic, must be trivially copyable. */ template struct atomic diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional index 09e7a7efca7..b920a1453ba 100644 --- a/libstdc++-v3/include/std/optional +++ b/libstdc++-v3/include/std/optional @@ -453,7 +453,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * Such a separate base class template is necessary in order to * conditionally make copy/move constructors trivial. * -* When the contained value is trivally copy/move constructible, +* When the contained value is trivially copy/move constructible, * the copy/move constructors of _Optional_base will invoke the * trivial copy/move constructor of _Optional_payload. Otherwise, * they will invoke _Optional_payload(bool, const _Optional_payload&)
Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221
On 28/01/2020 09:07, Eric Botcazou wrote: >> Ping! Eric, do you have any objections to reverting? > > See my comment posted in the audit trail of the TN on 01/20... > Probably missing live range splitting or somesuch, as envisioned by > Vladimir in its approval message. Feel free to eventually revert it. Great. Vladimir, Ok for trunk? Changelog: 2020-01-21 Joel Hutton PR target/93221 * ira.c (ira): Revert use of simplified LRA algorithm. gcc/testsuite/ChangeLog: 2020-01-21 Joel Hutton PR target/93221 * gcc.target/aarch64/pr93221.c: New test. From 1a2980ef6eeb76dbf0556f806a85a4f49ad3ebdd Mon Sep 17 00:00:00 2001 From: Joel Hutton Date: Tue, 21 Jan 2020 09:37:48 + Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb 11b8091fb introduced a simplified LRA algorithm for -O0 that turned off hard register splitting, this causes a problem for parameters passed in multiple registers on aarch64. This fixes bug 93221. --- gcc/ira.c | 38 +- gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++ 2 files changed, 25 insertions(+), 23 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c diff --git a/gcc/ira.c b/gcc/ira.c index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -5205,35 +5205,27 @@ ira (FILE *f) /* Perform target specific PIC register initialization. */ targetm.init_pic_reg (); - if (optimize) -{ - ira_conflicts_p = true; - - /* Determine the number of pseudos actually requiring coloring. */ - unsigned int num_used_regs = 0; - for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) - if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) - num_used_regs++; - - /* If there are too many pseudos and/or basic blocks (e.g. 10K - pseudos and 10K blocks or 100K pseudos and 1K blocks), we will - use simplified and faster algorithms in LRA. */ - lra_simple_p - = ira_use_lra_p - && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); -} - else -{ - ira_conflicts_p = false; - lra_simple_p = ira_use_lra_p; -} + ira_conflicts_p = optimize > 0; + + /* Determine the number of pseudos actually requiring coloring. */ + unsigned int num_used_regs = 0; + for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++) +if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i)) + num_used_regs++; + + /* If there are too many pseudos and/or basic blocks (e.g. 10K + pseudos and 10K blocks or 100K pseudos and 1K blocks), we will + use simplified and faster algorithms in LRA. */ + lra_simple_p += ira_use_lra_p + && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun); if (lra_simple_p) { /* It permits to skip live range splitting in LRA. */ flag_caller_saves = false; /* There is no sense to do regional allocation when we use - simplified LRA. */ + simplified LRA. */ flag_ira_region = IRA_REGION_ONE; ira_conflicts_p = false; } diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c new file mode 100644 index ..4dc2c3d0149423dd3d666f7428277ffa9eb765c4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c @@ -0,0 +1,10 @@ +/* PR target/93221 */ +/* { dg-do compile } */ +/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */ + +struct S { __Int32x4_t b[2]; }; + +void +foo (struct S x) +{ +} -- 2.17.1
[PATCH] predcom: Fix invalid store-store commoning [PR93434]
predcom has the following code to stop one rogue load from interfering with other store-load opportunities: /* If A is read and B write or vice versa and there is unsuitable dependence, instead of merging both components into a component that will certainly not pass suitable_component_p, just put the read into bad component, perhaps at least the write together with all the other data refs in it's component will be optimizable. */ But when store-store commoning was added later, this had the effect of ignoring loads that occur between two candidate stores. There is code further up to handle loads and stores with unknown dependences: /* Don't do store elimination if there is any unknown dependence for any store data reference. */ if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know || DDR_NUM_DIST_VECTS (ddr) == 0)) eliminate_store_p = false; But the store-load code above skips loads for *known* dependences if (a) the load has already been marked "bad" or (b) the data-ref machinery knows the dependence distance, but determine_offsets can't handle the combination. (a) happens to be the problem in the testcase, but a different sequence could have given (b) instead. We have writes to individual fields of a structure and reads from the whole structure. Since determine_offsets requires the types to be the same, it returns false for each such read/write combination. This patch records which components have had loads removed and prevents store-store commoning for them. It's a bit too pessimistic, since there shouldn't be a problem if a "bad" load dominates all stores in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p here and (b) the handling for that case would probably need to be removed again if we handled more exotic cases in future. 2020-01-28 Richard Sandiford gcc/ PR tree-optimization/93434 * tree-predcom.c (split_data_refs_to_components): Record which components have had aliasing loads removed. Prevent store-store commoning for all such components. gcc/testsuite/ PR tree-optimization/93434 * gcc.c-torture/execute/pr93434.c: New test. --- gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++ gcc/tree-predcom.c| 24 +++-- 2 files changed, 58 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c index 047c7fa9583..d2dcfe7f42d 100644 --- a/gcc/tree-predcom.c +++ b/gcc/tree-predcom.c @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop, /* Don't do store elimination if loop has multiple exit edges. */ bool eliminate_store_p = single_exit (loop) != NULL; basic_block last_always_executed = last_always_executed_block (loop); + auto_bitmap no_store_store_comps; FOR_EACH_VEC_ELT (datarefs, i, dr) { @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop, else if (DR_IS_READ (dra) && ib != bad) { if (ia == bad) - continue; + { + bitmap_set_bit (no_store_store_comps, ib); + continue; + } else if (!determine_offset (dra, drb, &dummy_off)) { + bitmap_set_bit (no_store_store_comps, ib); merge_comps (comp_father, comp_size, bad, ia); continue; } @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop, else if (DR_IS_READ (drb) && ia != bad) { if (ib == bad) - continue; + { + bitmap_set_bit (no_store_store_comps, ia); + continue; + } else if (!determine_offset (dra, drb, &dummy_off)) { + bitmap_set_bit (no_store_store_comps, ia); merge_comps (comp_father, comp_size, bad, ib); continue; } @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop, comp->refs.quick_push (dataref); } + if (eliminate_store_p) +{ + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi) + { + ca = component_of (comp_father, ia); + if (ca != bad) + comps[ca]->eliminate_store_p = false; + } +} + for (i = 0; i < n; i++) { comp = comps[i]; diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c b/gcc/testsuite/gcc.c-torture/execute/pr93434.c new file mode 100644 index 000..e786252794b --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr93434.c @@ -0,0 +1,36 @@ +typedef struct creal_T { + double re; + double im; +} creal_T; + +#define N 16 +int main() { + int k; + int i; + int j; + creal_T t2[N]; + double inval; + + inval = 1.0; + for (j = 0; j < N; ++j) { +t2[j].re = 0; +
Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)
On Mon, Jan 27, 2020 at 10:47 PM Segher Boessenkool wrote: > > Hi! > > On Tue, Jan 21, 2020 at 02:10:21PM +, Wilco Dijkstra wrote: > > While code hoisting generally improves codesize, it can affect performance > > negatively. Benchmarking shows it doesn't help SPEC and negatively affects > > embedded benchmarks. Since the impact is relatively small with -O2 and > > mainly > > affects -O3, the simplest option is to disable code hoisting for -O3 and > > higher. > > Should this be a generic thing, not target-specific? The change doesn't make sense - I'm sure if you'd look at the actual cases it's not code hoisting in itself but "optimizations" enabled by it that cause the issues. It's probably the same thing as PRE causing issues downstream for some benchmarks but do you disable PRE then by default just because of that? Richard. > > Segher
Re: [PATCH] predcom: Fix invalid store-store commoning [PR93434]
Richard Sandiford writes: > predcom has the following code to stop one rogue load from > interfering with other store-load opportunities: > > /* If A is read and B write or vice versa and there is unsuitable >dependence, instead of merging both components into a component >that will certainly not pass suitable_component_p, just put the >read into bad component, perhaps at least the write together with >all the other data refs in it's component will be optimizable. */ > > But when store-store commoning was added later, this had the effect > of ignoring loads that occur between two candidate stores. > > There is code further up to handle loads and stores with unknown > dependences: > > /* Don't do store elimination if there is any unknown dependence for >any store data reference. */ > if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) > && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know > || DDR_NUM_DIST_VECTS (ddr) == 0)) > eliminate_store_p = false; > > But the store-load code above skips loads for *known* dependences > if (a) the load has already been marked "bad" or (b) the data-ref > machinery knows the dependence distance, but determine_offsets > can't handle the combination. > > (a) happens to be the problem in the testcase, but a different > sequence could have given (b) instead. We have writes to individual > fields of a structure and reads from the whole structure. Since > determine_offsets requires the types to be the same, it returns false > for each such read/write combination. > > This patch records which components have had loads removed and > prevents store-store commoning for them. It's a bit too pessimistic, > since there shouldn't be a problem if a "bad" load dominates all stores > in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p > here and (b) the handling for that case would probably need to be > removed again if we handled more exotic cases in future. Forgot to add: Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for master and eventually for backports? Thanks, Richard > > 2020-01-28 Richard Sandiford > > gcc/ > PR tree-optimization/93434 > * tree-predcom.c (split_data_refs_to_components): Record which > components have had aliasing loads removed. Prevent store-store > commoning for all such components. > > gcc/testsuite/ > PR tree-optimization/93434 > * gcc.c-torture/execute/pr93434.c: New test. > --- > gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++ > gcc/tree-predcom.c| 24 +++-- > 2 files changed, 58 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c > > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c > index 047c7fa9583..d2dcfe7f42d 100644 > --- a/gcc/tree-predcom.c > +++ b/gcc/tree-predcom.c > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop, >/* Don't do store elimination if loop has multiple exit edges. */ >bool eliminate_store_p = single_exit (loop) != NULL; >basic_block last_always_executed = last_always_executed_block (loop); > + auto_bitmap no_store_store_comps; > >FOR_EACH_VEC_ELT (datarefs, i, dr) > { > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop, >else if (DR_IS_READ (dra) && ib != bad) > { > if (ia == bad) > - continue; > + { > + bitmap_set_bit (no_store_store_comps, ib); > + continue; > + } > else if (!determine_offset (dra, drb, &dummy_off)) > { > + bitmap_set_bit (no_store_store_comps, ib); > merge_comps (comp_father, comp_size, bad, ia); > continue; > } > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop, >else if (DR_IS_READ (drb) && ia != bad) > { > if (ib == bad) > - continue; > + { > + bitmap_set_bit (no_store_store_comps, ia); > + continue; > + } > else if (!determine_offset (dra, drb, &dummy_off)) > { > + bitmap_set_bit (no_store_store_comps, ia); > merge_comps (comp_father, comp_size, bad, ib); > continue; > } > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop, >comp->refs.quick_push (dataref); > } > > + if (eliminate_store_p) > +{ > + bitmap_iterator bi; > + EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi) > + { > + ca = component_of (comp_father, ia); > + if (ca != bad) > + comps[ca]->eliminate_store_p = false; > + } > +} > + >for (i = 0; i < n; i++) > { >comp = comps[i]; > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c > b/gcc/testsuite/gcc.c-torture/execute/pr93434.c > new file mode 100644 > index 000..e786252794b > --- /dev/null >
[Patch][Fortran] gfortran.texi - minor style cleanup
Most items in the @menu do not end with a period; all of them do not end with one when used as @section. Hence, it make sense to unify the style to w/o tailing '.'. Committed. Tobias commit 4593f60558474983c79cbbdedc31b4c19e63b671 Author: Tobias Burnus Date: Tue Jan 28 10:58:00 2020 +0100 [Fortran] gfortran.texi - minor style cleanup * gfortran.texi (Runtime): Remove tailing '.' in @menu. diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index bfc3b224ecb..4040ff284b3 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,7 @@ +2020-01-28 Tobias Burnus + + * gfortran.texi (Runtime): Remove tailing '.' in @menu. + 2020-01-27 Tobias Burnus PR fortran/85781 diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi index a50634ab9d2..0b52c1b6ab8 100644 --- a/gcc/fortran/gfortran.texi +++ b/gcc/fortran/gfortran.texi @@ -604,15 +604,15 @@ Malformed environment variables are silently ignored. * GFORTRAN_STDIN_UNIT:: Unit number for standard input * GFORTRAN_STDOUT_UNIT:: Unit number for standard output * GFORTRAN_STDERR_UNIT:: Unit number for standard error -* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units. +* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units * GFORTRAN_UNBUFFERED_PRECONNECTED:: Do not buffer I/O for preconnected units. * GFORTRAN_SHOW_LOCUS:: Show location for runtime errors * GFORTRAN_OPTIONAL_PLUS:: Print leading + where permitted * GFORTRAN_LIST_SEPARATOR:: Separator for list output * GFORTRAN_CONVERT_UNIT:: Set endianness for unformatted I/O * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors -* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files. -* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files. +* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files +* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files @end menu @node TMPDIR
Re: [PATCH] doc: clarify the situation with pointer arithmetic
On Tue, Jan 28, 2020 at 8:20 AM Alexander Monakov wrote: > > On Tue, 28 Jan 2020, Uecker, Martin wrote: > > > > (*) this also shows the level of "obfuscation" needed to fool compilers > > > to lose provenance knowledge is hard to predict. > > > > Well, this is exactly the problem we want to address by defining > > a clear way to do this. Casting to an integer would be the way > > to state: "consider the pointer as escaped and forget the > > provenance" and casting an integer to a pointer would > > mean "this pointer may point to all objects whose pointer has > > escaped". This would give the programmer explicit control about > > this aspect and make most existing code using pointer-to-integer > > casts well-defined. At the same time, this should be simple > > to add to existing points-to analysis. > > Can you explain why you make it required for the compiler to treat the > points-to set unnecessarily broader than it could prove? In the Matlab > example, there's a simple chain of computations that the compiler is > following to prove that the pointer resulting from the final cast is > derived from exactly one other pointer (no other pointers have > participated in the computations). > > Or, in other words: > > is there an example where a programmer can distinguish between the > requirement you explain above vs. the fine-grained interpretation > that GIMPLE aims to implement (at least as I understand it), which is: > > when the program creates a pointer by means of non-pointer computations > (casts, representation access, etc), the resulting pointer may point to: > > * any object which address could have participated in the computation > (which is at worst the entire set of "exposed" objects up to that >program point, but can be much narrower if the compiler can see >the entire chain of computations) > > * any objects which is not "exposed" but could have known address, e.g. > because it is placed at a specific address during linking Note for the current PTA implementation there's almost no cases we can handle conservatively enough. Consider the simple int a[4]; int *p = &a[1]; uintptr_t pi = (uintptr_t)p; pi += 4; int *q = (int *)pi; our PTA knows that p points to a (but not the exact offset), same for pi (the cast doesn't change the value). Now you add 4 - this could lead you outside of 'a' so the points-to set becomes 'a and anything'. I'm also not sure what PVNI does to int a[4]; int *p = &a[1]; p += 10; uintptr_t pi = (uintptr_t)p; p = (int *)pi; we assume that p points to a even after p += 10 (but it of course points outside of the object - obvious here, but not necessarily in more obfuscated cases). Now, can we assume pi points to a? The cast isn't value-changing. Do we have to assume (int *)pi points to anything? So, is p = (int *)(uintptr_t)p; something like "laundering" a pointer? We don't preserve such simple round-trip casts since they are value-preserving. Are they provenance preserving? Richard. > Thanks. > Alexander
Re: [PATCH] predcom: Fix invalid store-store commoning [PR93434]
On Tue, Jan 28, 2020 at 10:49 AM Richard Sandiford wrote: > > Richard Sandiford writes: > > predcom has the following code to stop one rogue load from > > interfering with other store-load opportunities: > > > > /* If A is read and B write or vice versa and there is unsuitable > >dependence, instead of merging both components into a component > >that will certainly not pass suitable_component_p, just put the > >read into bad component, perhaps at least the write together with > >all the other data refs in it's component will be optimizable. */ > > > > But when store-store commoning was added later, this had the effect > > of ignoring loads that occur between two candidate stores. > > > > There is code further up to handle loads and stores with unknown > > dependences: > > > > /* Don't do store elimination if there is any unknown dependence for > >any store data reference. */ > > if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) > > && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know > > || DDR_NUM_DIST_VECTS (ddr) == 0)) > > eliminate_store_p = false; > > > > But the store-load code above skips loads for *known* dependences > > if (a) the load has already been marked "bad" or (b) the data-ref > > machinery knows the dependence distance, but determine_offsets > > can't handle the combination. > > > > (a) happens to be the problem in the testcase, but a different > > sequence could have given (b) instead. We have writes to individual > > fields of a structure and reads from the whole structure. Since > > determine_offsets requires the types to be the same, it returns false > > for each such read/write combination. > > > > This patch records which components have had loads removed and > > prevents store-store commoning for them. It's a bit too pessimistic, > > since there shouldn't be a problem if a "bad" load dominates all stores > > in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p > > here and (b) the handling for that case would probably need to be > > removed again if we handled more exotic cases in future. > > Forgot to add: > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for master > and eventually for backports? OK for master and backports. Richard. > Thanks, > Richard > > > > > 2020-01-28 Richard Sandiford > > > > gcc/ > > PR tree-optimization/93434 > > * tree-predcom.c (split_data_refs_to_components): Record which > > components have had aliasing loads removed. Prevent store-store > > commoning for all such components. > > > > gcc/testsuite/ > > PR tree-optimization/93434 > > * gcc.c-torture/execute/pr93434.c: New test. > > --- > > gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++ > > gcc/tree-predcom.c| 24 +++-- > > 2 files changed, 58 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c > > > > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c > > index 047c7fa9583..d2dcfe7f42d 100644 > > --- a/gcc/tree-predcom.c > > +++ b/gcc/tree-predcom.c > > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop, > >/* Don't do store elimination if loop has multiple exit edges. */ > >bool eliminate_store_p = single_exit (loop) != NULL; > >basic_block last_always_executed = last_always_executed_block (loop); > > + auto_bitmap no_store_store_comps; > > > >FOR_EACH_VEC_ELT (datarefs, i, dr) > > { > > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop, > >else if (DR_IS_READ (dra) && ib != bad) > > { > > if (ia == bad) > > - continue; > > + { > > + bitmap_set_bit (no_store_store_comps, ib); > > + continue; > > + } > > else if (!determine_offset (dra, drb, &dummy_off)) > > { > > + bitmap_set_bit (no_store_store_comps, ib); > > merge_comps (comp_father, comp_size, bad, ia); > > continue; > > } > > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop, > >else if (DR_IS_READ (drb) && ia != bad) > > { > > if (ib == bad) > > - continue; > > + { > > + bitmap_set_bit (no_store_store_comps, ia); > > + continue; > > + } > > else if (!determine_offset (dra, drb, &dummy_off)) > > { > > + bitmap_set_bit (no_store_store_comps, ia); > > merge_comps (comp_father, comp_size, bad, ib); > > continue; > > } > > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop, > >comp->refs.quick_push (dataref); > > } > > > > + if (eliminate_store_p) > > +{ > > + bitmap_iterator bi; > > + EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi) > > + { > > + ca = component_of (comp_father, ia); > > + if (ca
Re: [PATCH] analyzer: fix build with gcc 4.4 (PR 93276)
On Fri, Jan 24, 2020 at 07:53:28PM -0500, David Malcolm wrote: > This patch fixes various build failures seen with gcc 4.4 > > gcc prior to 4.6 complains about: > > error: #pragma GCC diagnostic not allowed inside functions > > for various uses of PUSH_IGNORE_WFORMAT and POP_IGNORE_WFORMAT. > This patch makes them a no-op with such compilers. I think this is wrong. All that is really needed is make sure you #include "diagnostic-core.h" before including pretty-print.h. By including diagnostic-core.h first, you do: #ifndef GCC_DIAG_STYLE #define GCC_DIAG_STYLE __gcc_tdiag__ #endif and then pretty-print.h will do: #ifdef GCC_DIAG_STYLE #define GCC_PPDIAG_STYLE GCC_DIAG_STYLE #else #define GCC_PPDIAG_STYLE __gcc_diag__ #endif If instead pretty-print.h is included first, then it will use __gcc_diag__ instead of __gcc_tdiag__ and thus will assume %E/%D etc. can't be handled. I've so far just tested that in stage3 with this patch analyzer builds without any -Wformat/-Wformat-extra-args warnings. Ok for trunk if it passes bootstrap/regtest? 2020-01-28 Jakub Jelinek * analyzer.h (PUSH_IGNORE_WFORMAT, POP_IGNORE_WFORMAT): Remove. * constraint-manager.cc: Include diagnostic-core.h before graphviz.h. (range::dump, equiv_class::print): Don't use PUSH_IGNORE_WFORMAT or POP_IGNORE_WFORMAT. * state-purge.cc: Include diagnostic-core.h before gimple-pretty-print.h. (state_purge_annotator::add_node_annotations, print_vec_of_names): Don't use PUSH_IGNORE_WFORMAT or POP_IGNORE_WFORMAT. * region-model.cc: Move diagnostic-core.h include before graphviz.h. (path_var::dump, svalue::print, constant_svalue::print_details, region::dump_to_pp, region::dump_child_label, region::print_fields, map_region::print_fields, map_region::dump_dot_to_pp, map_region::dump_child_label, array_region::print_fields, array_region::dump_dot_to_pp): Don't use PUSH_IGNORE_WFORMAT or POP_IGNORE_WFORMAT. --- gcc/analyzer/analyzer.h.jj 2020-01-27 22:40:57.012420793 +0100 +++ gcc/analyzer/analyzer.h 2020-01-28 10:57:28.078715244 +0100 @@ -100,22 +100,6 @@ public: ~auto_cfun () { pop_cfun (); } }; -/* Macros for temporarily suppressing -Wformat and -Wformat-extra-args, - for those versions of GCC that support pragmas within a function - (4.6 onwards). */ - -#if GCC_VERSION >= 4006 -# define PUSH_IGNORE_WFORMAT \ - _Pragma("GCC diagnostic push") \ - _Pragma("GCC diagnostic ignored \"-Wformat\"") \ - _Pragma("GCC diagnostic ignored \"-Wformat-extra-args\"") -# define POP_IGNORE_WFORMAT \ - _Pragma("GCC diagnostic pop") -#else -# define PUSH_IGNORE_WFORMAT -# define POP_IGNORE_WFORMAT -#endif - /* A template for creating hash traits for a POD type. */ template --- gcc/analyzer/constraint-manager.cc.jj 2020-01-22 22:37:04.478533500 +0100 +++ gcc/analyzer/constraint-manager.cc 2020-01-28 10:54:41.403226986 +0100 @@ -28,6 +28,7 @@ along with GCC; see the file COPYING3. #include "gimple-iterator.h" #include "fold-const.h" #include "selftest.h" +#include "diagnostic-core.h" #include "graphviz.h" #include "function.h" #include "analyzer/analyzer.h" @@ -120,13 +121,11 @@ bound::get_relation_as_str () const void range::dump (pretty_printer *pp) const { -PUSH_IGNORE_WFORMAT pp_printf (pp, "%qE %s x %s %qE", m_lower_bound.m_constant, m_lower_bound.get_relation_as_str (), m_upper_bound.get_relation_as_str (), m_upper_bound.m_constant); -POP_IGNORE_WFORMAT } /* Determine if there is only one possible value for this range. @@ -200,9 +199,7 @@ equiv_class::print (pretty_printer *pp) { if (i > 0) pp_string (pp, " == "); -PUSH_IGNORE_WFORMAT pp_printf (pp, "%qE", m_constant); -POP_IGNORE_WFORMAT } pp_character (pp, '}'); } --- gcc/analyzer/state-purge.cc.jj 2020-01-14 22:57:20.802253484 +0100 +++ gcc/analyzer/state-purge.cc 2020-01-28 10:57:17.642872508 +0100 @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. #include "tree-phinodes.h" #include "options.h" #include "ssa-iterators.h" +#include "diagnostic-core.h" #include "gimple-pretty-print.h" #include "function.h" #include "analyzer/analyzer.h" @@ -444,12 +445,10 @@ state_purge_annotator::add_node_annotati state_purge_per_ssa_name *per_name_data = (*iter).second; if (per_name_data->get_function () == n.m_fun) { -PUSH_IGNORE_WFORMAT if (per_name_data->needed_at_point_p (before_supernode)) pp_printf (pp, "%qE needed here", name); else pp_printf (pp, "%qE not needed here", name); -POP_IGNORE_WFORMAT } pp_newline (pp); } @@ -476,9 +475,7 @@ print_vec_of_names (graphviz_out *gv, co { if (i > 0) pp_string (pp, ", "); -PUSH_IGNORE_WFORMAT pp_printf (pp, "%qE", name); -POP_IGNORE_WFORMAT } pp_printf (pp, "}"); pp_writ
Re: [committed] Add diagnostic_metadata and CWE support
On Wed, Dec 18, 2019 at 07:08:25PM -0500, David Malcolm wrote: > This patch adds support for associating a diagnostic message with an > optional diagnostic_metadata object, so that plugins can add extra data > to their diagnostics (e.g. mapping a diagnostic to a taxonomy or coding > standard such as from CERT or MISRA). > > Currently this only supports associating a CWE identifier with a > diagnostic (which is what I'm using for the warnings in the analyzer > patch kit), but adding a diagnostic_metadata class allows for future > growth in this area without an explosion of further "warning_at" > overloads for all of the different kinds of custom data that a plugin > might want to add. > > This version of the patch renames the overly-general > -fdiagnostics-show-metadata to -fdiagnostics-show-cwe and adds test > coverage for it via a plugin. > > It also adds a note to the documentation that no GCC diagnostics > currently use this; it's a feature for plugins (and, at some point, > I hope, the analyzer). > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > Committed to trunk as r279556. Unfortunately, this patch broke the i18n. $ make gcc.pot /bin/sh ../../gcc/../mkinstalldirs po make srcextra make[1]: Entering directory '/usr/src/gcc/obj/gcc' cp -p gengtype-lex.c ../../gcc cp -p gengtype-lex.c ../../gcc make[1]: Leaving directory '/usr/src/gcc/obj/gcc' AWK=gawk /bin/sh ../../gcc/po/exgettext \ /usr/bin/xgettext gcc ../../gcc scanning for keywords, %e and %n strings... emit_diagnostic_valist used incompatibly as both --keyword=emit_diagnostic_valist:4 --flag=emit_diagnostic_valist:4:gcc-internal-format and --keyword=emit_diagnostic_valist:5 --flag=emit_diagnostic_valist:5:gcc-internal-format make: *** [Makefile:4338: po/gcc.pot] Error 1 While C++ can have overloads, xgettext can't deal with overloads that have different argument positions. emit_diagnostic_valist with the new args (i.e. the :5 one) isn't used right now, so one way around at least for now is to remove it again, another is to rename it. Jakub
Re: [PATCH] gcc: Add new configure options to allow static libraries to be selected
Hi Andrew, Andrew Burgess wrote: * Jeff Law [2020-01-22 13:52:27 -0700]: On Wed, 2020-01-22 at 15:39 +, Andrew Burgess wrote: The motivation behind this change is to make it easier for a user to link against static libraries on a target where dynamic libraries are the default library type (for example GNU/Linux). Further, my motivation is really for linking libraries into GDB, however, the binutils-gdb/config/ directory is a copy of gcc/config/ so changes for GDB need to be approved by the GCC project first. After making this change in the gcc/config/ directory I've run autoreconf on all of the configure scripts in the GCC tree and a couple have been updated, so I'll use one of these to describe what my change does. Consider libcpp, this library links against libiconv. Currently if the user builds on a system with both static and dynamic libiconv installed then autotools will pick up the dynamic libiconv by default. This is almost certainly the right thing to do. However, if the user wants to link against static libiconv then things are a little harder, they could remove the dynamic libiconv from their system, but this is probably a bad idea (other things might depend on that library), or the user can build their own version of libiconv, install it into a unique prefix, and then configure gcc using the --with-libiconv-prefix=DIR flag. This works fine, but is somewhat annoying, the static library available, I just can't get autotools to use it. My change then adds a new flag --with-libiconv-type=TYPE, where type is either auto, static, or shared. The default auto, ensures we keep the existing behaviour unchanged. If the user configures with --with-libiconv-type=static then the configure script will ignore any dynamic libiconv it finds, and will only look for a static libiconv, if no static libiconv is found then the configure will continue as though there is no libiconv at all available. Similarly a user can specify --with-libiconv-type=shared and force the use of shared libiconv, any static libiconv will be ignored. As I've implemented this change within the AC_LIB_LINKFLAGS_BODY macro then only libraries configured using the AC_LIB_LINKFLAGS or AC_LIB_HAVE_LINKFLAGS macros will gain the new configure flag. If this is accepted into GCC then there will be follow on patches for binutils and GDB to regenerate some configure scripts in those projects. For GCC only two configure scripts needed updated after this commit, libcpp and libstdc++-v3, both of which link against libiconv. This kinda surprises me, gcc/ also configures for iconv config/ChangeLog: * lib-link.m4 (AC_LIB_LINKFLAGS_BODY): Add new --with-libXXX-type=... option. Use this to guide the selection of either a shared library or a static library. libcpp/ChangeLog: * configure: Regnerate. libstdc++-v3/ChangeLog: * configure: Regnerate. s/Regnerate/Regenerate/ This isn't strictly a regression bugfix. But given the nature of these files I think we probably need to be a bit more lax and allow safe changes so that downstream uses can move forward independent of the gcc development and release schedule. So, OK. Thanks for the flexibility. Now pushed. this (r10-6269, https://gcc.gnu.org/g:e7c26e04b2dd6266d62d5a5825ff7eb44d1cf14e) causes or exposes a problem which breaks bootstrap on all Darwin platforms I tried. Bootstrap fails stage1 self-check with: cc1: internal compiler error: in on_diagnostic, at input.c:2182 * AFAICT, this is caused by self-test attempting to do something that libcpp was not configured to support. * All viable Darwin platforms have libiconv installed (but Darwin’s /lib is /usr/lib; this might well apply to other BSD derivatives too). * Before the patch, libcpp and gcc configury finds this and they agree on the availability of ICONV (#define HAVE_ICONV 1). * After the patch libcpp no longer thinks iconv is available, but gcc continues to find it - and that, I think, leads to it attempting tests for which libcpp has not been configured. * I can work around this by adding —with-iconv-prefix=/usr to the configure line which forces both libcpp and gcc to find the library explicitly - but that’s only a short-term solution. Can you clarify why there’s no need to match the configury changes in libcpp / gcc / libstdc++ ? thanks Iain
Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
Hi Stam, On 1/8/20 3:18 PM, Stam Markianos-Wright wrote: On 12/10/19 5:03 PM, Kyrill Tkachov wrote: Hi Stam, On 11/15/19 5:26 PM, Stam Markianos-Wright wrote: Pinging with more correct maintainers this time :) Also would need to backport to gcc7,8,9, but need to get this approved first! Sorry for the delay. Same here now! Sorry totally forget about this in the lead up to Xmas! Done the changes marked below and also removed the unnecessary extra #defines from the test. This is ok with a nit on the testcase... diff --git a/gcc/testsuite/gcc.target/arm/pr91816.c b/gcc/testsuite/gcc.target/arm/pr91816.c new file mode 100644 index ..757c897e9c0db32709227b3fdf1b4a8033428232 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr91816.c @@ -0,0 +1,61 @@ +/* { dg-do compile } */ +/* { dg-options "-march=armv7-a -mthumb -mfpu=vfpv3-d16" } */ +int printf(const char *, ...); + I think this needs a couple of effective target checks like arm_hard_vfp_ok and arm_thumb2_ok. See other tests in gcc.target/arm that add -mthumb to the options. Thanks, Kyrill
[COMMITTED] testsuite: Add -Wpsabi to gcc.dg/torture/pr93170.c [PR93460]
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. Pushed as obvious. Richard 2020-01-28 Richard Sandiford gcc/testsuite/ PR testsuite/93460 * gcc.dg/torture/pr93170.c: Add -Wpsabi. --- gcc/testsuite/gcc.dg/torture/pr93170.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/gcc.dg/torture/pr93170.c b/gcc/testsuite/gcc.dg/torture/pr93170.c index 25a93a3743e..80910bfb3d5 100644 --- a/gcc/testsuite/gcc.dg/torture/pr93170.c +++ b/gcc/testsuite/gcc.dg/torture/pr93170.c @@ -1,5 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target int128 } */ +/* { dg-options "-Wno-psabi" } */ /* { dg-additional-options "-frename-registers -fno-tree-forwprop -fno-tree-fre -fira-algorithm=priority -mstringop-strategy=loop --param=hot-bb-frequency-fraction=0 -Wno-psabi" { target { x86_64-*-* i?86-*-* } } } */ typedef unsigned char v64u8 __attribute__ ((vector_size (64)));
[PATCH] testsuite: XFAIL gcc.dg/torture/pr93133.c for powerpc*-*-* [PR93393]
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. OK to install? Richard 2020-01-28 Richard Sandiford gcc/testsuite/ PR testsuite/93393 * gcc.dg/torture/pr93133.c: XFAIL for powerpc*-*-*. --- gcc/testsuite/gcc.dg/torture/pr93133.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/torture/pr93133.c b/gcc/testsuite/gcc.dg/torture/pr93133.c index 21eae1eb3dd..19af6b8ee6d 100644 --- a/gcc/testsuite/gcc.dg/torture/pr93133.c +++ b/gcc/testsuite/gcc.dg/torture/pr93133.c @@ -1,4 +1,4 @@ -/* { dg-do run } */ +/* { dg-do run { xfail powerpc*-*-* } } */ /* { dg-add-options ieee } */ /* { dg-require-effective-target fenv_exceptions } */ /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */
[Patch, committed][Fortran] avoid ICE in gfc_omp_check_optional_argument (PR93464)
(This was a GCC 10 regression, affecting both OpenMP and OpenACC.) In gfc_omp_check_optional_argument: DECL_LANG_SPECIFIC is always available (check at the top). However, if decl is not a PARM_DECL, it can have two reasons: Either it is no parameter (C sense; Fortran: dummy argument) at all or – as gfortran does for assumed-shape arrays – it is a locally defined decl which belongs to the array parameter. In the latter case, GFC_DECL_SAVED_DESCRIPTOR contains the actual PARAM_DECL. Well, as this test case shows, the GFC_DESCRIPTOR_TYPE_P condition is fulfilled, but as it is a locally defined variable, GFC_DECL_SAVED_DESCRIPTOR is NULL. The reason that it didn't show up before is that most of the local arrays do not have DECL_LANG_SPECIFIC – this one does because it is a character string (used for the string length). Applied as obvious after building and regtesting. Cheers, Tobias commit 627d59b6b3062de921fbdd80b2b48de18f599d03 Author: Tobias Burnus Date: Tue Jan 28 11:54:57 2020 +0100 [Fortran] avoid ICE in gfc_omp_check_optional_argument (PR93464) PR fortran/93464 * openmp.c (gfc_omp_check_optional_argument): Avoid ICE when DECL_LANG_SPECIFIC and GFC_DESCRIPTOR_TYPE_P but not GFC_DECL_SAVED_DESCRIPTOR as for local allocatable character vars. PR fortran/93464 * gfortran.dg/goacc/pr93464.f90: New. diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index 4040ff284b3..eb8842b0ab8 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,9 +1,16 @@ +2020-01-28 Tobias Burnus + + PR fortran/93464 + * openmp.c (gfc_omp_check_optional_argument): Avoid ICE when + DECL_LANG_SPECIFIC and GFC_DESCRIPTOR_TYPE_P but not + GFC_DECL_SAVED_DESCRIPTOR as for local allocatable character vars. + 2020-01-28 Tobias Burnus * gfortran.texi (Runtime): Remove tailing '.' in @menu. 2020-01-27 Tobias Burnus PR fortran/85781 * trans-expr.c (gfc_conv_substring): Handle non-ARRAY_TYPE strings of Bind(C) procedures. diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index fd60bbbed5d..9550499 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -95,19 +95,20 @@ gfc_omp_check_optional_argument (tree decl, bool for_present_check) /* For assumed-shape arrays, a local decl with arg->data is used. */ if (TREE_CODE (decl) != PARM_DECL && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)) || GFC_ARRAY_TYPE_P (TREE_TYPE (decl { is_array_type = true; decl = GFC_DECL_SAVED_DESCRIPTOR (decl); } - if (TREE_CODE (decl) != PARM_DECL + if (decl == NULL_TREE + || TREE_CODE (decl) != PARM_DECL || !DECL_LANG_SPECIFIC (decl) || !GFC_DECL_OPTIONAL_ARGUMENT (decl)) return NULL_TREE; /* Scalars with VALUE attribute which are passed by value use a hidden argument to denote the present status. They are passed as nonpointer type with one exception: 'type(c_ptr), value' as 'void*'. */ /* Cf. trans-expr.c's gfc_conv_expr_present. */ if (TREE_CODE (TREE_TYPE (decl)) != POINTER_TYPE diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index eac18206b12..d9441cb0a2e 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,9 +1,14 @@ +2020-01-28 Tobias Burnus + + PR fortran/93464 + * gfortran.dg/goacc/pr93464.f90: New. + 2020-01-28 Richard Sandiford PR tree-optimization/93434 * gcc.c-torture/execute/pr93434.c: New test. 2020-01-28 Richard Sandiford PR testsuite/93460 * gcc.dg/torture/pr93170.c: Add -Wpsabi. diff --git a/gcc/testsuite/gfortran.dg/goacc/pr93464.f90 b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90 new file mode 100644 index 000..922106540f9 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90 @@ -0,0 +1,16 @@ +! { dg-do compile } +! +! PR fortran/93464 +! +! Contributed by G. Steinmetz +! +program p + character :: c(2) = 'a' + character, allocatable :: z(:) + !$acc parallel + !$omp target + z = c + !$acc end parallel + !$omp end target + print *, z +end
Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]
Jeff Law writes: > On Mon, 2020-01-27 at 16:41 +, Richard Sandiford wrote: >> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: >> >> Failed to match this instruction: >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> If we perform the natural simplification to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (reg:SI 97) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> then the pattern matches. And it turns out that we do have a >> simplification like that already, but it would only kick in for >> extractions from a reg, not a subreg. E.g.: > Yea. I ran into similar problems with the extract/extend bits in > combine. And we know it's a fairly general problem that we don't > handle SUBREGs anywhere near as consistently as REGs. > > >> >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> would simplify to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> IMO the subreg case is even more obviously a simplification >> than the bare reg case, since the net effect is to remove >> either one or two subregs, rather than simply change the >> position of a subreg/truncation. >> >> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c >> for -m32 on x86_64-linux-gnu, because we could then simplify >> a :HI zero_extract to a :QI one. The associated *testqi_ext_3 >> pattern did already seem to want to handle QImode extractions: >> >> "ix86_match_ccmode (insn, CCNOmode) >>&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) >>|| GET_MODE (operands[2]) == SImode >>|| GET_MODE (operands[2]) == HImode >>|| GET_MODE (operands[2]) == QImode) >> >> but I'm not sure how often the QI case would trigger in practice, >> since the zero_extract mode was restricted to HI and above. I checked >> the other x86 patterns and couldn't see any other instances of this. >> >> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, >> OK to install? >> >> Richard >> >> >> 2020-01-27 Richard Sandiford >> >> gcc/ >> PR rtl-optimization/87763 >> * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> simplification to handle subregs as well as bare regs. >> * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. > Do you need to check for and reject paradoxicals here? If not, OK as- > is. If you need to check, then that's pre-approved as well. Thanks, pushed. On the paradoxical thing: the outer subreg is always non-paradoxical for simplify_truncation, so we don't need to check that specifically. For the inner subreg we need to handle the paradoxical case for the PR. I wondered at one point about punting for non-paradoxical inner subregs, but there didn't seem to be a good reason to keep an expression like (truncate (op (truncate...))) over (op (truncate ...)). If it turns out there is a good reason though, changing SUBREG_P to paradoxical_subreg_p would be fine in terms of this PR. Richard
Re: [RFC] [c-family] PR92867 - Add returns_arg attribute
On Mon, 27 Jan 2020 at 17:36, Richard Biener wrote: > > On Fri, Jan 24, 2020 at 11:53 PM Joseph Myers wrote: > > > > On Fri, 24 Jan 2020, Prathamesh Kulkarni wrote: > > > > > The middle-end representation issue of ERF_RETURNS_ARG still remains, > > > which restricts the attribute till first four args. The patch simply > > > emits sorry(), for arguments beyond first four.. > > > > I think this should be fixed (e.g. make the middle-end check for the > > attribute, or something like that). > > Since it's pure optimization you can also simply chose to ignore this without > notice. > > Note ERF_RETURN_ARG_MASK is limited to the number of available > bits in an int and practically the only current setter was via "fn spec" > which uses a single digit [1-9] to denote the argument (so limiting to > three is indeed an odd choice but matches builtins using this at the moment). > > Feel free to up ERF_RETURN_ARG_MASK (but then you need to adjust > the ERF_ flag defines). Hi, Thanks for the suggestions. In the attached patch I bumped up value of ERF_RETURNS_ARG_MASK to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and ERF_RETURNS_ARG. And use fn spec "Z" to store the argument number in fn-spec format. Does that look OK ? In gimple_call_return_flags, I didn't remove the existing fn spec "0-3" in this patch, since RET1 (and possibly others?) depend on it. I will remove that and adjust other cases to use fn-spec "Z" if that's OK, in follow-up patch. Thanks, Prathamesh > > > The language semantics of the > > attribute should not be driven by such internal implementation details; > > rather, implementation details should be determined by the language > > semantics to be implemented. > > > > The sorry () has coding style issues. Diagnostics should not end with '.' > > or '\n', should use full words (attribute not attr, arguments not args) > > and programming language text in them should be surrounded by %<%> (so > > %). > > > > -- > > Joseph S. Myers > > jos...@codesourcery.com diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index dc9579c5c60..2ed41ed136d 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -150,6 +150,7 @@ static tree handle_designated_init_attribute (tree *, tree, tree, int, bool *); static tree handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *); static tree handle_copy_attribute (tree *, tree, tree, int, bool *); +static tree handle_returns_arg_attribute (tree *, tree, tree, int, bool *); /* Helper to define attribute exclusions. */ #define ATTR_EXCL(name, function, type, variable) \ @@ -484,6 +485,8 @@ const struct attribute_spec c_common_attribute_table[] = handle_noinit_attribute, attr_noinit_exclusions }, { "access", 1, 3, false, true, true, false, handle_access_attribute, NULL }, + { "returns_arg", 1, 1, true, false, false, false, + handle_returns_arg_attribute, NULL }, { NULL, 0, 0, false, false, false, false, NULL, NULL } }; @@ -4603,6 +4606,53 @@ handle_patchable_function_entry_attribute (tree *, tree name, tree args, return NULL_TREE; } +/* Handle a "returns_arg" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_returns_arg_attribute (tree *node, tree name, tree args, + int flags, bool *no_add_attrs) +{ + tree decl = *node; + tree rettype = TREE_TYPE (decl); + + if (TREE_CODE (rettype) == METHOD_TYPE + || TREE_CODE (rettype) == FUNCTION_TYPE) +rettype = TREE_TYPE (rettype); + + if (VOID_TYPE_P (rettype)) +{ + warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes, + "%qE attribute ignored on a function returning %qT.", + name, rettype); + *no_add_attrs = true; + return NULL_TREE; +} + + if (!positional_argument (TREE_TYPE (decl), name, TREE_VALUE (args), + TREE_CODE (rettype))) +{ + *no_add_attrs = true; + return NULL_TREE; +} + + *no_add_attrs = false; + gcc_assert (args); + tree val = TREE_VALUE (args); + unsigned HOST_WIDE_INT argnum = tree_to_uhwi (val); + char *s = (char *) xmalloc (sizeof (char) * BITS_PER_WORD); + s[0] = 'Z'; + sprintf (s + 1, "%lu", argnum); + + tree attr = tree_cons (get_identifier ("fn spec"), + build_tree_list (NULL_TREE, + build_string (strlen (s), s)), + NULL_TREE); + decl_attributes (node, attr, flags); + free (s); + return NULL_TREE; +} + /* Attempt to partially validate a single attribute ATTR as if it were to be applied to an entity OPER. */ diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index ec99c38a607..3531e0c8292 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -3566,6 +3566,19 @@ diagnosed. Because a pure function cannot have any observable side effects it does not make sense for such a function to return @code{void}. Declaring such a function is diagnosed. +@item returns_arg (@var{position}) +@cindex @code{returns_
Re: [RFC] [c-family] PR92867 - Add returns_arg attribute
On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote: > Thanks for the suggestions. In the attached patch I bumped up value of > ERF_RETURNS_ARG_MASK > to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and > ERF_RETURNS_ARG. > And use fn spec "Z" to store the argument number in fn-spec format. > Does that look OK ? No. +#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2) /* Nonzero if the return value is equal to the argument number flags & ERF_RETURN_ARG_MASK. */ -#define ERF_RETURNS_ARG(1 << 2) +#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2)) How is size of host int related to BITS_PER_WORD? Not to mention that if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB. Jakub
Re: [RFC] [c-family] PR92867 - Add returns_arg attribute
On Tue, 28 Jan 2020 at 17:00, Jakub Jelinek wrote: > > On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote: > > Thanks for the suggestions. In the attached patch I bumped up value of > > ERF_RETURNS_ARG_MASK > > to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and > > ERF_RETURNS_ARG. > > And use fn spec "Z" to store the argument number in fn-spec format. > > Does that look OK ? > > No. > > +#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2) > > /* Nonzero if the return value is equal to the argument number > flags & ERF_RETURN_ARG_MASK. */ > -#define ERF_RETURNS_ARG(1 << 2) > +#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2)) > > How is size of host int related to BITS_PER_WORD? Not to mention that > if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB. Oops sorry. I should have used HOST_BITS_PER_INT. I assume that'd be correct ? Thanks, Prathamesh > > Jakub >
Re: [PR47785] COLLECT_AS_OPTIONS
On Fri, Jan 24, 2020 at 7:04 AM Prathamesh Kulkarni wrote: > > On Mon, 20 Jan 2020 at 15:44, Richard Biener > wrote: > > > > On Wed, Jan 8, 2020 at 11:20 AM Prathamesh Kulkarni > > wrote: > > > > > > On Tue, 5 Nov 2019 at 17:38, Richard Biener > > > wrote: > > > > > > > > On Tue, Nov 5, 2019 at 12:17 AM Kugan Vivekanandarajah > > > > wrote: > > > > > > > > > > Hi, > > > > > Thanks for the review. > > > > > > > > > > On Tue, 5 Nov 2019 at 03:57, H.J. Lu wrote: > > > > > > > > > > > > On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekanandarajah > > > > > > wrote: > > > > > > > > > > > > > > Thanks for the reviews. > > > > > > > > > > > > > > > > > > > > > On Sat, 2 Nov 2019 at 02:49, H.J. Lu wrote: > > > > > > > > > > > > > > > > On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Wed, 30 Oct 2019 at 03:11, H.J. Lu > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Hi Richard, > > > > > > > > > > > > > > > > > > > > > > Thanks for the review. > > > > > > > > > > > > > > > > > > > > > > On Wed, 23 Oct 2019 at 23:07, Richard Biener > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Oct 21, 2019 at 10:04 AM Kugan Vivekanandarajah > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Richard, > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the pointers. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Oct 2019 at 22:33, Richard Biener > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Oct 11, 2019 at 6:15 AM Kugan > > > > > > > > > > > > > > Vivekanandarajah > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Richard, > > > > > > > > > > > > > > > Thanks for the review. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 2 Oct 2019 at 20:41, Richard Biener > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 2, 2019 at 10:39 AM Kugan > > > > > > > > > > > > > > > > Vivekanandarajah > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > As mentioned in the PR, attached patch adds > > > > > > > > > > > > > > > > > COLLECT_AS_OPTIONS for > > > > > > > > > > > > > > > > > passing assembler options specified with -Wa, > > > > > > > > > > > > > > > > > to the link-time driver. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The proposed solution only works for uniform > > > > > > > > > > > > > > > > > -Wa options across all > > > > > > > > > > > > > > > > > TUs. As mentioned by Richard Biener, > > > > > > > > > > > > > > > > > supporting non-uniform -Wa flags > > > > > > > > > > > > > > > > > would require either adjusting partitioning > > > > > > > > > > > > > > > > > according to flags or > > > > > > > > > > > > > > > > > emitting multiple object files from a single > > > > > > > > > > > > > > > > > LTRANS CU. We could > > > > > > > > > > > > > > > > > consider this as a follow up. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Bootstrapped and regression tests on > > > > > > > > > > > > > > > > > arm-linux-gcc. Is this OK for trunk? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > While it works for your simple cases it is > > > > > > > > > > > > > > > > unlikely to work in practice since > > > > > > > > > > > > > > > > your implementation needs the assembler options > > > > > > > > > > > > > > > > be present at the link > > > > > > > > > > > > > > > > command line. I agree that this might be the > > > > > > > > > > > > > > > > way for people to go when > > > > > > > > > > > > > > > > they face the issue but then it needs to be > > > > > > > > > > > > > > > > documented somewhere > > > > > > > > > > > > > > > > in the manual. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That is, with COLLECT_AS_OPTION (why singular? > > > > > > > > > > > > > > > > I'd expected > > > > > > > > > > > > > > > > COLLECT_AS_OPTIONS) available to cc1 we could > > > > > > > > > > > > > > > > stream this string > > > > > > > > > > > > > > > > to lto_options and re-materialize it at link > > > > > > > > > > > > > > > > time (and diagnose mismatches > > > > > > > > > > > > > > > > even if we like). > > > > > > > > > > > > > > > OK. I will try to implement this. So the idea is > > > > > > > > > > > > > > > if we provide > > > > > > > > > > > > > > > -Wa,options as part of the lto compile, this > > > > > > > > > > > > > > > should be available > > > > > > > > > > > > > > > during link time. Like in: >
Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)
On Tue, Nov 26, 2019 at 3:18 PM Wilco Dijkstra wrote: > > Hi, > > While code hoisting generally improves codesize, it can affect performance > negatively. Benchmarking shows it doesn't help SPEC and negatively affects > embedded benchmarks. Since the impact is relatively small with -O2 and mainly > affects -O3, the simplest option is to disable code hoisting for -O3 and > higher. > > OK for commit? > > ChangeLog: > 2019-11-26 Wilco Dijkstra > > PR tree-optimization/80155 > * common/config/arm/arm-common.c (arm_option_optimization_table): > Disable -fcode-hoisting with -O3. > -- > > diff --git a/gcc/common/config/arm/arm-common.c > b/gcc/common/config/arm/arm-common.c > index > b761d3abd670a144a593c4b410b1e7fbdcb52f56..3e11f21b7dd76cc071b645c32a6fdb4a92511279 > 100644 > --- a/gcc/common/config/arm/arm-common.c > +++ b/gcc/common/config/arm/arm-common.c > @@ -39,6 +39,8 @@ static const struct default_options > arm_option_optimization_table[] = > /* Enable section anchors by default at -O1 or higher. */ > { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 }, > { OPT_LEVELS_FAST, OPT_fsched_pressure, NULL, 1 }, > +/* Disable code hoisting with -O3 or higher. */ > +{ OPT_LEVELS_3_PLUS, OPT_fcode_hoisting, NULL, 0 }, > { OPT_LEVELS_NONE, 0, NULL, 0 } >}; > What are the cases in O3 where this is slower with embedded benchmarks and how can we fix it ? Keeping the target "different" in this manner doesn't augur well for the long term. Ramana
Re: [RFC] [c-family] PR92867 - Add returns_arg attribute
On Tue, Jan 28, 2020 at 05:09:36PM +0530, Prathamesh Kulkarni wrote: > On Tue, 28 Jan 2020 at 17:00, Jakub Jelinek wrote: > > > > On Tue, Jan 28, 2020 at 04:56:59PM +0530, Prathamesh Kulkarni wrote: > > > Thanks for the suggestions. In the attached patch I bumped up value of > > > ERF_RETURNS_ARG_MASK > > > to UINT_MAX >> 2, and use highest two bits for ERF_NOALIAS and > > > ERF_RETURNS_ARG. > > > And use fn spec "Z" to store the argument number in fn-spec > > > format. > > > Does that look OK ? > > > > No. > > > > +#define ERF_RETURN_ARG_MASK(UINT_MAX >> 2) > > > > /* Nonzero if the return value is equal to the argument number > > flags & ERF_RETURN_ARG_MASK. */ > > -#define ERF_RETURNS_ARG(1 << 2) > > +#define ERF_RETURNS_ARG(1 << (BITS_PER_WORD - 2)) > > > > How is size of host int related to BITS_PER_WORD? Not to mention that > > if BITS_PER_WORD is 64 and host int is 32-bit, 1 << (64 - 2) is UB. > Oops sorry. I should have used HOST_BITS_PER_INT. > I assume that'd be correct ? It still wouldn't, 1 << (HOST_BITS_PER_INT - 1) is negative number, you'd need either 1U and verify all ERF_* flags uses, or avoid using the sign bit. The patch has other issues, you don't verify that the argnum fits into the bits (doesn't overflow into the other ERF_* bits), in + char *s = (char *) xmalloc (sizeof (char) * BITS_PER_WORD); + s[0] = 'Z'; + sprintf (s + 1, "%lu", argnum); 1) sizeof (char) is 1 by definition 2) it is pointless to allocate it and then deallocate (just use automatic array) 3) it is unclear how is BITS_PER_WORD related to the length of decimal encoded string + Z char + terminating '\0'. The usual way is for unsigned numbers to estimate number of digits by counting 3 digits per each 8 bits, in your case of course + 2. More importantly, the "fn spec" attribute isn't used just in gimple_call_return_flags, but also in e.g. gimple_call_arg_flags which assumes that the return stuff in there is a single char and the reaming chars are for argument descriptions, or in decl_return_flags which you haven't modified. I must say I fail to see the point in trying to glue this together into the "fn spec" argument so incompatibly, why can't we handle the fn spec with its 1-4 returns_arg and returns_arg attribute with arbitrary position side-by-side? Jakub
Re: [PATCH] doc: clarify the situation with pointer arithmetic
Am Dienstag, den 28.01.2020, 10:20 +0300 schrieb Alexander Monakov: > On Tue, 28 Jan 2020, Uecker, Martin wrote: > > > > (*) this also shows the level of "obfuscation" needed to fool compilers > > > to lose provenance knowledge is hard to predict. > > > > Well, this is exactly the problem we want to address by defining > > a clear way to do this. Casting to an integer would be the way > > to state: "consider the pointer as escaped and forget the > > provenance" and casting an integer to a pointer would > > mean "this pointer may point to all objects whose pointer has > > escaped". This would give the programmer explicit control about > > this aspect and make most existing code using pointer-to-integer > > casts well-defined. At the same time, this should be simple > > to add to existing points-to analysis. > > Can you explain why you make it required for the compiler to treat the > points-to set unnecessarily broader than it could prove? In the Matlab > example, there's a simple chain of computations that the compiler is > following to prove that the pointer resulting from the final cast is > derived from exactly one other pointer (no other pointers have > participated in the computations). > > Or, in other words: > > is there an example where a programmer can distinguish between the > requirement you explain above vs. the fine-grained interpretation > that GIMPLE aims to implement (at least as I understand it), which is: > > when the program creates a pointer by means of non-pointer computations > (casts, representation access, etc), the resulting pointer may point to: > > * any object which address could have participated in the computation > (which is at worst the entire set of "exposed" objects up to that > program point, but can be much narrower if the compiler can see > the entire chain of computations) > > * any objects which is not "exposed" but could have known address, e.g. > because it is placed at a specific address during linking Unfortunately, this is not as simple. It is not trivial to define the set of objects whose "address could have participated in the computation" int a = ... random number int b = &y; if (a == b) { int *p = (int*)a; Did '&y' participate in the computation? What if you output and integer using I/O and read it back in? What if you copy an integer using control flow? There are many similar questions like this. If we want to make this part of the standard, we need to formulate rules for all integer operations about how the provenance flows. There are several problems with this. A compiler needs to be able to compute the complete points-to set. If it might miss an object which is allowed to be used it has to be conservative with aliasing - the analysis become useless. So we always have the trade-off between making the rules simpler and restrict the tracking or specify more complicated rules that allow sophisticated tracking but then all compilers that want to do this optimization have to implement this complicated rules or need to fall back to being conservative. Finally, all integer operations would have a potential hidden second meaning when applied to addresses of objects, which makes it much easier to reason about. Assume you have a function int difference(int a, int b) { return a - b; } And later you replace an expression 'difference(a, a)' with '0' in the program. This seems a trivial and logical thing to do, but if this expression was applied to addresses, you might have broken a provenance chain and introduced a bug. In my opinion, integers should stay integers with simple logical properties. Gruß, Martin
Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)
Hi! On Mon, Jan 27, 2020 at 04:28:29PM +, Wilco Dijkstra wrote: > > On Thu, Jan 16, 2020 at 12:50:14PM +, Wilco Dijkstra wrote: > >> The separate shrinkwrapping pass may insert stores in the middle > >> of atomics loops which can cause issues on some implementations. > >> Avoid this by delaying splitting of atomic patterns until after > >> prolog/epilog generation. > > > > Note that this isn't specific to sws at all: there isn't anything > > stopping later passes from doing this either. Is there anything that > > protects us from sched2 doing similar here, for example? > > The expansions create extra basic blocks and insert various barriers > that would stop any reasonable scheduler from doing it. And the > current scheduler is basic block based. Hrm, and probably no other passes after *logue can create new store insns (ignoring peepholes and splitters, that would be almost *asking* for trouble ;-) ). I see. Thanks for explaining. Segher
Re: [PATCH] doc: clarify the situation with pointer arithmetic
Am Dienstag, den 28.01.2020, 11:01 +0100 schrieb Richard Biener: > On Tue, Jan 28, 2020 at 8:20 AM Alexander Monakov wrote: > > > > On Tue, 28 Jan 2020, Uecker, Martin wrote: > > > > > > (*) this also shows the level of "obfuscation" needed to fool compilers > > > > to lose provenance knowledge is hard to predict. > > > > > > Well, this is exactly the problem we want to address by defining > > > a clear way to do this. Casting to an integer would be the way > > > to state: "consider the pointer as escaped and forget the > > > provenance" and casting an integer to a pointer would > > > mean "this pointer may point to all objects whose pointer has > > > escaped". This would give the programmer explicit control about > > > this aspect and make most existing code using pointer-to-integer > > > casts well-defined. At the same time, this should be simple > > > to add to existing points-to analysis. > > > > Can you explain why you make it required for the compiler to treat the > > points-to set unnecessarily broader than it could prove? In the Matlab > > example, there's a simple chain of computations that the compiler is > > following to prove that the pointer resulting from the final cast is > > derived from exactly one other pointer (no other pointers have > > participated in the computations). > > > > Or, in other words: > > > > is there an example where a programmer can distinguish between the > > requirement you explain above vs. the fine-grained interpretation > > that GIMPLE aims to implement (at least as I understand it), which is: > > > > when the program creates a pointer by means of non-pointer computations > > (casts, representation access, etc), the resulting pointer may point to: > > > > * any object which address could have participated in the computation > > (which is at worst the entire set of "exposed" objects up to that > > program point, but can be much narrower if the compiler can see > > the entire chain of computations) > > > > * any objects which is not "exposed" but could have known address, e.g. > > because it is placed at a specific address during linking > > Note for the current PTA implementation there's almost no cases we can > handle conservatively enough. Consider the simple > > int a[4]; > int *p = &a[1]; > uintptr_t pi = (uintptr_t)p; > pi += 4; > int *q = (int *)pi; > > our PTA knows that p points to a (but not the exact offset), same for pi > (the cast doesn't change the value). Now you add 4 - this could lead > you outside of 'a' so the points-to set becomes 'a and anything'. PVNI would say that 'a' gets exposed in the first case and then 'q' can point to all exposed objects because of the second cast. A correct and conservative implementation would do this: In the PTA you would need to mark the address of a escaped and then later assign a conservative points-to set (all escaped objects) to q. Yes, this limits optimizations, but I do not think this is terrible. (optimizations could be re-enabled with a compiler option) > I'm also not sure what PVNI does to > > int a[4]; > int *p = &a[1]; > p += 10; > uintptr_t pi = (uintptr_t)p; > p = (int *)pi; > > we assume that p points to a even after p += 10 (but it of course points > outside of the object - obvious here, but not necessarily in more > obfuscated cases). This is UB because a pointer to outside of the object is formed. > Now, can we assume pi points to a? The cast isn't value-changing. Do we have > to assume (int *)pi points to anything? So, is > > p = (int *)(uintptr_t)p; > > something like "laundering" a pointer? We don't preserve such simple > round-trip > casts since they are value-preserving. Are they provenance preserving? Yes, such a pair would be "laundering" as it allows 'p' to then point to any exposed object provenance-wise. For such casts the FE would maybe add a marker. Maybe a calls to builtin functions 'builtin_expose(a)' and 'builting_bless'. (having those builtins would be interesting on its own, btw). Having said this, some optimizations could still be allowed using the "as-if" rule and other lines of reasoning. Specifally, PVNI states that 'p' gets assigned the provenance of the object the integer values is the address of. So if the compiler can proof that the address belongs to certain objects it can reassign the points-to set to the new 'p'. Only if there is ambiguity between which objects the address belongs to, the reasoning needs to be more conservative. For example: int a[3]; int b[3]; &b; // b also exposed int *p = (int*)(uintptr_t)&a[3]; Here, p could point to the one-after address of 'a' or the first element of 'b'. (but only because 'b' was also exposed). If the compiler can prove that something like this can not happen (e.g. by considering offsets), it can still do some tracking of points-to sets. Best, Martin > Richard. > > > Thanks. > > Alexander
[PATCH] tree-optimization/93439 move clique bookkeeping to OMP expansion
Autopar was doing clique bookkeeping too early when creating destination functions but then later introducing new cliques via versioning loops. The following moves the bookkeeping to the actual outlining process. Bootstrapped & tested on x86_64-unknown-linux-gnu, pushed. Richard. 2020-01-28 Richard Biener PR tree-optimization/93439 * tree-parloops.c (create_loop_fn): Move clique bookkeeping... * tree-cfg.c (move_sese_region_to_fn): ... here. (verify_types_in_gimple_reference): Verify used cliques are tracked. * gfortran.dg/graphite/pr93439.f90: New testcase. --- gcc/testsuite/gfortran.dg/graphite/pr93439.f90 | 21 + gcc/tree-cfg.c | 17 + gcc/tree-parloops.c| 1 - 3 files changed, 38 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/graphite/pr93439.f90 diff --git a/gcc/testsuite/gfortran.dg/graphite/pr93439.f90 b/gcc/testsuite/gfortran.dg/graphite/pr93439.f90 new file mode 100644 index 000..e815ab929e1 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/graphite/pr93439.f90 @@ -0,0 +1,21 @@ +! { dg-additional-options "-O2 -floop-parallelize-all -floop-unroll-and-jam -ftree-parallelize-loops=2" } + +module ai + integer, parameter :: dp = 8 +contains + subroutine qu(ja, nq, en, p5) +real(kind = dp) :: nq(ja), en(ja), p5(ja) +call tl(ja, nq, en, p5) + end subroutine qu + + subroutine tl(ja, nq, en, p5) +real(kind = dp) :: nq(9), en(9 * ja), p5(3 * ja) +do mc = 1, ja + do mb = 1, 9 + do ma = 1, 3 + p5((mc - 1) * 3 + ma) = p5((mc - 1) * 3 + ma) - 1 + end do + end do +end do + end subroutine tl +end module ai diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index fd69b366bf4..f7b817d94e6 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -3226,6 +3226,13 @@ verify_types_in_gimple_reference (tree expr, bool require_lvalue) debug_generic_stmt (expr); return true; } + if (MR_DEPENDENCE_CLIQUE (expr) != 0 + && MR_DEPENDENCE_CLIQUE (expr) > cfun->last_clique) + { + error ("invalid clique in %qs", code_name); + debug_generic_stmt (expr); + return true; + } } else if (TREE_CODE (expr) == TARGET_MEM_REF) { @@ -3245,6 +3252,13 @@ verify_types_in_gimple_reference (tree expr, bool require_lvalue) debug_generic_stmt (expr); return true; } + if (MR_DEPENDENCE_CLIQUE (expr) != 0 + && MR_DEPENDENCE_CLIQUE (expr) > cfun->last_clique) + { + error ("invalid clique in %qs", code_name); + debug_generic_stmt (expr); + return true; + } } else if (TREE_CODE (expr) == INDIRECT_REF) { @@ -7744,6 +7758,9 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb, after = bb; } + /* Adjust the maximum clique used. */ + dest_cfun->last_clique = saved_cfun->last_clique; + loop->aux = NULL; loop0->aux = NULL; /* Loop sizes are no longer correct, fix them up. */ diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index d315b797a66..d9250d36c72 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2202,7 +2202,6 @@ create_loop_fn (location_t loc) DECL_ARGUMENTS (decl) = t; allocate_struct_function (decl, false); - DECL_STRUCT_FUNCTION (decl)->last_clique = act_cfun->last_clique; /* The call to allocate_struct_function clobbers CFUN, so we need to restore it. */ -- 2.16.4
Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)
On Tue, Jan 28, 2020 at 10:42:16AM +0100, Richard Biener wrote: > On Mon, Jan 27, 2020 at 10:47 PM Segher Boessenkool > wrote: > > On Tue, Jan 21, 2020 at 02:10:21PM +, Wilco Dijkstra wrote: > > > While code hoisting generally improves codesize, it can affect performance > > > negatively. Benchmarking shows it doesn't help SPEC and negatively affects > > > embedded benchmarks. Since the impact is relatively small with -O2 and > > > mainly > > > affects -O3, the simplest option is to disable code hoisting for -O3 and > > > higher. > > > > Should this be a generic thing, not target-specific? > > The change doesn't make sense - I'm sure if you'd look at the actual cases > it's not code hoisting in itself but "optimizations" enabled by it that cause > the issues. It's probably the same thing as PRE causing issues downstream > for some benchmarks but do you disable PRE then by default just because of > that? Well, that depends on how bad it is. And of course not if it is just because of benchmark peculiarities, we're not a banchmark compiler after all, we're meant to compile real code. But if PRE (just to continue your example) would mess that up more often than not, then yes, it should not be enabled by default. Segher
[PATCH] libstdc++: Replace glibc-specific check for clock_gettime (PR 93325)
It's wrong to assume that clock_gettime is unavailable on any *-*-linux* target that doesn't have glibc 2.17 or later. Use a generic test instead of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we don't get an error for cross targets without a working linker. This ensures that C library's clock_gettime will be used on non-glibc targets, instead of an incorrect syscall to SYS_clock_gettime. PR libstdc++/93325 * acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS for clock_gettime instead of explicit glibc version check. * configure: Regenerate. Tested powerpc64le-linux, built on centos7 and centos6 as well to check the results are correct for various versions of glibc. Committed to master. I plan to backport this later too. commit 759812fddc81c0c131d4633b2a7f56412ce8dbed Author: Jonathan Wakely Date: Tue Jan 28 13:24:09 2020 + libstdc++: Replace glibc-specific check for clock_gettime (PR 93325) It's wrong to assume that clock_gettime is unavailable on any *-*-linux* target that doesn't have glibc 2.17 or later. Use a generic test instead of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we don't get an error for cross targets without a working linker. This ensures that C library's clock_gettime will be used on non-glibc targets, instead of an incorrect syscall to SYS_clock_gettime. PR libstdc++/93325 * acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS for clock_gettime instead of explicit glibc version check. * configure: Regenerate. diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index 39147916e94..ee5e0336f2c 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -1422,20 +1422,14 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [ ac_has_nanosleep=yes ;; gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu) -AC_MSG_CHECKING([for at least GNU libc 2.17]) -AC_TRY_COMPILE( - [#include ], - [ - #if ! __GLIBC_PREREQ(2, 17) - #error - #endif - ], - [glibcxx_glibc217=yes], [glibcxx_glibc217=no]) -AC_MSG_RESULT($glibcxx_glibc217) - -if test x"$glibcxx_glibc217" = x"yes"; then - ac_has_clock_monotonic=yes - ac_has_clock_realtime=yes +# Don't use link test for freestanding library, in case gcc_no_link=yes +if test x"$is_hosted" = xyes; then + # Versions of glibc before 2.17 needed -lrt for clock_gettime. + AC_SEARCH_LIBS(clock_gettime, [rt]) + if test x"$ac_cv_search_clock_gettime" = x"none required"; then +ac_has_clock_monotonic=yes +ac_has_clock_realtime=yes + fi fi ac_has_nanosleep=yes ac_has_sched_yield=yes
[COMMITTED] libstdc++: Avoid using sizeof with function types (PR 93470)
PR libstdc++/93470 * include/bits/refwrap.h (reference_wrapper::operator()): Restrict static assertion to object types. Tested powerpc64le-linux and verified with Clang. Committed to master. Backport to gcc-9-branch needed too. commit 72a9fd209b6db3b5f81fbb008e22ea93c00465e5 Author: Jonathan Wakely Date: Tue Jan 28 13:24:09 2020 + libstdc++: Avoid using sizeof with function types (PR 93470) PR libstdc++/93470 * include/bits/refwrap.h (reference_wrapper::operator()): Restrict static assertion to object types. diff --git a/libstdc++-v3/include/bits/refwrap.h b/libstdc++-v3/include/bits/refwrap.h index 2bcd44d3894..717aa01629c 100644 --- a/libstdc++-v3/include/bits/refwrap.h +++ b/libstdc++-v3/include/bits/refwrap.h @@ -343,7 +343,8 @@ _GLIBCXX_MEM_FN_TRAITS(&& noexcept, false_type, true_type) operator()(_Args&&... __args) const { #if __cplusplus > 201703L - static_assert(sizeof(type), "type must be complete"); + if constexpr (is_object_v) + static_assert(sizeof(type), "type must be complete"); #endif return std::__invoke(get(), std::forward<_Args>(__args)...); }
Re: [PATCH] Fix component mappings with derived types for OpenACC
On Fri, 24 Jan 2020 10:58:49 +0100 Tobias Burnus wrote: > Hi Julian, > > the gfortran part is rather obvious and it and the test case look > fine to me → OK. > The oacc-mem.c also looks okay, but I assume Thomas needs to > rubber-stamp it. I understand that Thomas is unavailable for the time being, so won't be able to use his rubber-stamp powers. I added the offending libgomp code to start with though, so I think I can go ahead and commit the patch. I'll hold off for 24 hours though in case there are any objections (Jakub?). Thanks, Julian > On 1/10/20 2:49 AM, Julian Brown wrote: > > This patch fixes a bug with mapping Fortran components (i.e. with > > the manual deep-copy support) which themselves have derived types. > > I've also added a couple of new tests to make sure such mappings > > are lowered correctly, and to check for the case that Tobias found > > in the message: > > > >https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00215.html > > > > The previous incorrect mapping was causing the error condition > > mentioned in that message to fail to trigger, and I think (my!) > > code in libgomp (goacc_exit_data_internal) to handle > > GOMP_MAP_STRUCT specially was papering over the bad mapping also. > > Oops! > > > > I haven't attempted to implement the (harder) sub-copy detection, if > > that is indeed supposed to be forbidden by the spec. This patch > > should get us to the same behaviour in Fortran as in C & C++ though. > > > > Tested with offloading to nvptx, also with the (uncommitted) > > reference-count self-checking patch enabled. > > > > OK? > > > > Thanks, > > > > Julian > > > > ChangeLog > > > > gcc/fortran/ > > * trans-openmp.c (gfc_trans_omp_clauses): Use inner not decl > > for components with derived types. > > > > gcc/testsuite/ > > * gfortran.dg/goacc/mapping-tests-3.f90: New test. > > * gfortran.dg/goacc/mapping-tests-4.f90: New test. > > > > libgomp/ > > * oacc-mem.c (goacc_exit_data_internal): Remove special > > (no-copyback) behaviour for GOMP_MAP_STRUCT. > > > > component-mappings-derived-types-1.diff > > > > commit 5e9d8846bbaa33a9bdb08adcf1ee9f224a8e8fc0 > > Author: Julian Brown > > Date: Wed Jan 8 15:57:46 2020 -0800 > > > > Fix component mappings with derived types for OpenACC > > > > gcc/fortran/ > > * trans-openmp.c (gfc_trans_omp_clauses): Use inner > > not decl for components with derived types. > > > > gcc/testsuite/ > > * gfortran.dg/goacc/mapping-tests-3.f90: New test. > > * gfortran.dg/goacc/mapping-tests-4.f90: New test. > > > > libgomp/ > > * oacc-mem.c (goacc_exit_data_internal): Remove > > special (no-copyback) behaviour for GOMP_MAP_STRUCT. > > > > diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c > > index 9835d2aae3c..efc392d7fa6 100644 > > --- a/gcc/fortran/trans-openmp.c > > +++ b/gcc/fortran/trans-openmp.c > > @@ -2783,9 +2783,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, > > gfc_omp_clauses *clauses, } > > else > > { > > - OMP_CLAUSE_DECL (node) = decl; > > + OMP_CLAUSE_DECL (node) = inner; > > OMP_CLAUSE_SIZE (node) > > - = TYPE_SIZE_UNIT (TREE_TYPE (decl)); > > + = TYPE_SIZE_UNIT (TREE_TYPE (inner)); > > } > > } > > else if (lastcomp->next > > diff --git a/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90 > > b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90 new file mode > > 100644 index 000..312f596461e > > --- /dev/null > > +++ b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-3.f90 > > @@ -0,0 +1,15 @@ > > +! { dg-options "-fopenacc -fdump-tree-omplower" } > > + > > +subroutine foo > > + type one > > +integer i, j > > + end type > > + type two > > +type(one) A, B > > + end type > > + > > + type(two) x > > + > > + !$acc enter data copyin(x%A) > > +! { dg-final { scan-tree-dump-times "omp target > > oacc_enter_exit_data map\\(struct:x \\\[len: 1\\\]\\) map\\(to:x.a > > \\\[len: \[0-9\]+\\\]\\)" 1 "omplower" } } +end diff --git > > a/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90 > > b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90 new file mode > > 100644 index 000..6257af942df --- /dev/null > > +++ b/gcc/testsuite/gfortran.dg/goacc/mapping-tests-4.f90 > > @@ -0,0 +1,17 @@ > > +subroutine foo > > + type one > > +integer i, j > > + end type > > + type two > > +type(one) A, B > > + end type > > + > > + type(two) x > > + > > +! This is accepted at present, although it represents a > > probably-unintentional +! overlapping subcopy. > > + !$acc enter data copyin(x%A, x%A%i) > > +! But this raises an error. > > + !$acc enter data copyin(x%A, x%A%i, x%A%i) > > +! { dg-error ".x.a.i. appears more than once in map clauses" > > "" { target
[PATCH] diagnostic_metadata: unbreak xgettext
On Tue, 2020-01-28 at 11:16 +0100, Jakub Jelinek wrote: > On Wed, Dec 18, 2019 at 07:08:25PM -0500, David Malcolm wrote: > > This patch adds support for associating a diagnostic message with > > an > > optional diagnostic_metadata object, so that plugins can add extra > > data > > to their diagnostics (e.g. mapping a diagnostic to a taxonomy or > > coding > > standard such as from CERT or MISRA). > > > > Currently this only supports associating a CWE identifier with a > > diagnostic (which is what I'm using for the warnings in the > > analyzer > > patch kit), but adding a diagnostic_metadata class allows for > > future > > growth in this area without an explosion of further "warning_at" > > overloads for all of the different kinds of custom data that a > > plugin > > might want to add. > > > > This version of the patch renames the overly-general > > -fdiagnostics-show-metadata to -fdiagnostics-show-cwe and adds test > > coverage for it via a plugin. > > > > It also adds a note to the documentation that no GCC diagnostics > > currently use this; it's a feature for plugins (and, at some point, > > I hope, the analyzer). > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > > > Committed to trunk as r279556. > > Unfortunately, this patch broke the i18n. > $ make gcc.pot > /bin/sh ../../gcc/../mkinstalldirs po > make srcextra > make[1]: Entering directory '/usr/src/gcc/obj/gcc' > cp -p gengtype-lex.c ../../gcc > cp -p gengtype-lex.c ../../gcc > make[1]: Leaving directory '/usr/src/gcc/obj/gcc' > AWK=gawk /bin/sh ../../gcc/po/exgettext \ > /usr/bin/xgettext gcc ../../gcc > scanning for keywords, %e and %n strings... > emit_diagnostic_valist used incompatibly as both -- > keyword=emit_diagnostic_valist:4 > --flag=emit_diagnostic_valist:4:gcc-internal-format and -- > keyword=emit_diagnostic_valist:5 > --flag=emit_diagnostic_valist:5:gcc-internal-format > make: *** [Makefile:4338: po/gcc.pot] Error 1 > > While C++ can have overloads, xgettext can't deal with overloads that > have > different argument positions. > emit_diagnostic_valist with the new args (i.e. the :5 one) isn't used > right > now, so one way around at least for now is to remove it again, > another is to > rename it. > > Jakub Sorry about this. There are actually two broken names; here's the patch I'm currently testing (which fixes "make gcc.pot" for me); bootstrap is in progress: While C++ can have overloads, xgettext can't deal with overloads that have different argument positions, leading to two failures in "make gcc.pot": emit_diagnostic_valist used incompatibly as both --keyword=emit_diagnostic_valist:4 --flag=emit_diagnostic_valist:4:gcc-internal-format and --keyword=emit_diagnostic_valist:5 --flag=emit_diagnostic_valist:5:gcc-internal-format warning_at used incompatibly as both --keyword=warning_at:3 --flag=warning_at:3:gcc-internal-format and --keyword=warning_at:4 --flag=warning_at:4:gcc-internal-format The emit_diagnostic_valist overload isn't used anywhere (I think it's a leftover from an earlier iteration of the analyzer patch kit). The warning_at overload is used throughout the analyzer but nowhere else. Ideally I'd like to consolidate this argument with something constructable in various ways: - from a metadata and an int or - from an int (or, better an "enum opt_code"), so that the overload happens when implicitly choosing the ctor, but that feels like stage 1 material. In the meantime, fix xgettext by deleting the unused overload and renaming the used one. gcc/analyzer/ChangeLog: * region-model.cc (poisoned_value_diagnostic::emit): Update for renaming of warning_at overload to warning_with_metadata_at. * sm-file.cc (file_leak::emit): Likewise. * sm-malloc.cc (double_free::emit): Likewise. (possible_null_deref::emit): Likewise. (possible_null_arg::emit): Likewise. (null_deref::emit): Likewise. (null_arg::emit): Likewise. (use_after_free::emit): Likewise. (malloc_leak::emit): Likewise. (free_of_non_heap::emit): Likewise. * sm-sensitive.cc (exposure_through_output_file::emit): Likewise. * sm-signal.cc (signal_unsafe_call::emit): Likewise. * sm-taint.cc (tainted_array_index::emit): Likewise. gcc/ChangeLog: * diagnostic-core.h (warning_at): Rename overload to... (warning_with_metadata_at): ...this. (emit_diagnostic_valist): Delete decl of overload taking diagnostic_metadata. * diagnostic.c (emit_diagnostic_valist): Likewise for defn. (warning_at): Rename overload taking diagnostic_metadata to... (warning_with_metadata_at): ...this. --- gcc/analyzer/region-model.cc | 24 gcc/analyzer/sm-file.cc | 6 ++-- gcc/analyzer/sm-malloc.cc| 54 gcc/analyzer/sm-sensitive.cc | 7 +++-- gcc/analyzer/sm-signal.cc| 8 +++--- gcc/analyzer/sm-taint.cc | 27 +
Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX
On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak wrote: > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu wrote: > > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > > > > > movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the > > > > case for AVX nor AVX512. We should disable TARGET_SSE_TYPELESS_STORES > > > > for TARGET_AVX. > > > > > > > > gcc/ > > > > > > > > PR target/91461 > > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for > > > > TARGET_AVX. > > > > * config/i386/i386.md (*movoi_internal_avx): Remove > > > > TARGET_SSE_TYPELESS_STORES check. > > > > > > > > gcc/testsuite/ > > > > > > > > PR target/91461 > > > > * gcc.target/i386/pr91461-1.c: New test. > > > > * gcc.target/i386/pr91461-2.c: Likewise. > > > > * gcc.target/i386/pr91461-3.c: Likewise. > > > > * gcc.target/i386/pr91461-4.c: Likewise. > > > > * gcc.target/i386/pr91461-5.c: Likewise. > > > > --- > > > > gcc/config/i386/i386.h| 4 +- > > > > gcc/config/i386/i386.md | 4 +- > > > > gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 > > > > gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++ > > > > gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++ > > > > gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++ > > > > gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 + > > > > 7 files changed, 203 insertions(+), 4 deletions(-) > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c > > > > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > > > index 943e9a5c783..c134b04c5c4 100644 > > > > --- a/gcc/config/i386/i386.h > > > > +++ b/gcc/config/i386/i386.h > > > > @@ -516,8 +516,10 @@ extern unsigned char > > > > ix86_tune_features[X86_TUNE_LAST]; > > > > #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \ > > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL] > > > > #define TARGET_SSE_SPLIT_REGS > > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS] > > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu. But it > > > > + isn't the case for AVX nor AVX512. */ > > > > #define TARGET_SSE_TYPELESS_STORES \ > > > > - ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES] > > > > + (!TARGET_AVX && > > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]) > > > > > > This is wrong place to disable the feature. > > > > Like this? > > No. > > There is a mode attribute in i386.md/sse.md for relevant patterns. > Please adapt calculation of mode attributes instead. > Like this? -- H.J. From 1ba0c9ce5f764b8faa8c66b70e676af187a57415 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 27 Jan 2020 09:35:11 -0800 Subject: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the case for AVX nor AVX512. We should disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX. gcc/ PR target/91461 * config/i386/i386.md (*movoi_internal_avx): Remove TARGET_SSE_TYPELESS_STORES check. (*movti_internal): Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX. * config/i386/sse.md (mov_internal): Likewise. gcc/testsuite/ PR target/91461 * gcc.target/i386/pr91461-1.c: New test. * gcc.target/i386/pr91461-2.c: Likewise. * gcc.target/i386/pr91461-3.c: Likewise. * gcc.target/i386/pr91461-4.c: Likewise. * gcc.target/i386/pr91461-5.c: Likewise. --- gcc/config/i386/i386.md | 8 +-- gcc/config/i386/sse.md| 2 +- gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++ gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++ gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++ gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 + 7 files changed, 203 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index a125ab350bb..62aaf40a4af 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -1980,9 +1980,7 @@ (define_insn "*movoi_internal_avx" (and (eq_attr "alternative" "1") (match_test "TARGET_AVX512VL")) (const_string "XI") - (ior (match_test "TARGET_SSE
Re: [PATCH] diagnostic_metadata: unbreak xgettext
On Tue, Jan 28, 2020 at 09:30:17AM -0500, David Malcolm wrote: > * diagnostic-core.h (warning_at): Rename overload to... > (warning_with_metadata_at): ...this. I think this is just too long, especially for a function used at least in the analyzer pretty often, leading to lots of ugly formatting. Can't you use warning_meta or something similar instead? > (emit_diagnostic_valist): Delete decl of overload taking > diagnostic_metadata. > * diagnostic.c (emit_diagnostic_valist): Likewise for defn. > (warning_at): Rename overload taking diagnostic_metadata to... > (warning_with_metadata_at): ...this. Jakub
Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX
On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu wrote: > > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak wrote: > > > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu wrote: > > > > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > > > > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > > > > > > > movaps/movups is one byte shorter than movdaq/movdqu. But it isn't > > > > > the > > > > > case for AVX nor AVX512. We should disable TARGET_SSE_TYPELESS_STORES > > > > > for TARGET_AVX. > > > > > > > > > > gcc/ > > > > > > > > > > PR target/91461 > > > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for > > > > > TARGET_AVX. > > > > > * config/i386/i386.md (*movoi_internal_avx): Remove > > > > > TARGET_SSE_TYPELESS_STORES check. > > > > > > > > > > gcc/testsuite/ > > > > > > > > > > PR target/91461 > > > > > * gcc.target/i386/pr91461-1.c: New test. > > > > > * gcc.target/i386/pr91461-2.c: Likewise. > > > > > * gcc.target/i386/pr91461-3.c: Likewise. > > > > > * gcc.target/i386/pr91461-4.c: Likewise. > > > > > * gcc.target/i386/pr91461-5.c: Likewise. > > > > > --- > > > > > gcc/config/i386/i386.h| 4 +- > > > > > gcc/config/i386/i386.md | 4 +- > > > > > gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 > > > > > gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++ > > > > > gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 > > > > > +++ > > > > > gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++ > > > > > gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 + > > > > > 7 files changed, 203 insertions(+), 4 deletions(-) > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c > > > > > > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > > > > index 943e9a5c783..c134b04c5c4 100644 > > > > > --- a/gcc/config/i386/i386.h > > > > > +++ b/gcc/config/i386/i386.h > > > > > @@ -516,8 +516,10 @@ extern unsigned char > > > > > ix86_tune_features[X86_TUNE_LAST]; > > > > > #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \ > > > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL] > > > > > #define TARGET_SSE_SPLIT_REGS > > > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS] > > > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu. But it > > > > > + isn't the case for AVX nor AVX512. */ > > > > > #define TARGET_SSE_TYPELESS_STORES \ > > > > > - ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES] > > > > > + (!TARGET_AVX && > > > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]) > > > > > > > > This is wrong place to disable the feature. > > > > > > Like this? > > > > No. > > > > There is a mode attribute in i386.md/sse.md for relevant patterns. > > Please adapt calculation of mode attributes instead. > > > > Like this? Still no. You could move (match_test "TARGET_AVX") (const_string "TI") up to bypass the cases below. Uros. Uros. > > -- > H.J.
[PATCH] c++: Function declared with typedef with eh-specification.
We just need to handle the exception specification like other properties of a function typedef. Tested x86_64-pc-linux-gnu, applying to trunk/9. PR c++/90731 * decl.c (grokdeclarator): Propagate eh spec from typedef. --- gcc/cp/decl.c| 1 + gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C | 6 ++ 2 files changed, 7 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index e55de5dd53d..6ad558eef9e 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -12848,6 +12848,7 @@ grokdeclarator (const cp_declarator *declarator, memfn_quals |= type_memfn_quals (type); rqual = type_memfn_rqual (type); type_quals = TYPE_UNQUALIFIED; + raises = TYPE_RAISES_EXCEPTIONS (type); } } diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C b/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C new file mode 100644 index 000..dd9924ff1b1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type22.C @@ -0,0 +1,6 @@ +// PR c++/90731 +// { dg-do compile { target c++17 } } + +typedef void T() noexcept(true); +T t; +void t() noexcept(true); base-commit: a5ed4958a2c1b563e933b25ca3b481761cc40b07 -- 2.18.1
[PATCH] Add OpenACC acc_get_property support for AMD GCN
Hi, this patch adds full support for the OpenACC 2.6 acc_get_property and acc_get_property_string functions to the libgomp GCN plugin. This replaces the existing stub in libgomp/plugin-gcn.c. Andrew: The value returned for acc_property_memory ("size of device memory in bytes" according to the spec) is the HSA_REGION_INFO_SIZE of the agent's data_region. This has been adapted from a previous incomplete implementation that we had on the OG9 branch. Does that sound reasonable? I have tested the patch with amdgcn and nvptx offloading. Ok to commit this to the main branch? Best regards, Frederik From 6f1855281c38993a088f9b4af020a786f8e05fe9 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Tue, 28 Jan 2020 08:01:00 +0100 Subject: [PATCH] Add OpenACC acc_get_property support for AMD GCN Add full support for the OpenACC 2.6 acc_get_property and acc_get_property_string functions to the libgomp GCN plugin. libgomp/ * plugin-gcn.c (struct agent_info): Add fields "name" and "vendor_name" ... (GOMP_OFFLOAD_init_device): ... and init from here. (struct hsa_context_info): Add field "driver_version_s" ... (init_hsa_contest): ... and init from here. (GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper implementation. * testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: Enable test execution for amdgcn and host offloading targets. * testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c (expect_device_properties): Split function into ... (expect_device_string_properties): ... this new function ... (expect_device_memory): ... and this new function. * testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c: Add test. --- libgomp/plugin/plugin-gcn.c | 63 +++-- .../acc_get_property-aux.c| 60 +--- .../acc_get_property-gcn.c| 132 ++ .../acc_get_property.c| 5 +- .../libgomp.oacc-fortran/acc_get_property.f90 | 2 - 5 files changed, 224 insertions(+), 38 deletions(-) create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 7854c142f05..0a09daaa0a4 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -425,7 +425,10 @@ struct agent_info /* The instruction set architecture of the device. */ gcn_isa device_isa; - + /* Name of the agent. */ + char name[64]; + /* Name of the vendor of the agent. */ + char vendor_name[64]; /* Command queues of the agent. */ hsa_queue_t *sync_queue; struct goacc_asyncqueue *async_queues, *omp_async_queue; @@ -544,6 +547,8 @@ struct hsa_context_info int agent_count; /* Array of agent_info structures describing the individual HSA agents. */ struct agent_info *agents; + /* Driver version string. */ + char driver_version_s[30]; }; /* Format of the on-device heap. @@ -1513,6 +1518,15 @@ init_hsa_context (void) GOMP_PLUGIN_error ("Failed to list all HSA runtime agents"); } + uint16_t minor, major; + status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR, &minor); + if (status != HSA_STATUS_SUCCESS) +GOMP_PLUGIN_error ("Failed to obtain HSA runtime minor version"); + status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MAJOR, &major); + if (status != HSA_STATUS_SUCCESS) +GOMP_PLUGIN_error ("Failed to obtain HSA runtime major version"); + sprintf (hsa_context.driver_version_s, "HSA Runtime %d.%d", major, minor); + hsa_context.initialized = true; return true; } @@ -3410,15 +3424,19 @@ GOMP_OFFLOAD_init_device (int n) return hsa_error ("Error requesting maximum queue size of the GCN agent", status); - char buf[64]; status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_NAME, - &buf); + &agent->name); if (status != HSA_STATUS_SUCCESS) return hsa_error ("Error querying the name of the agent", status); - agent->device_isa = isa_code (buf); + agent->device_isa = isa_code (agent->name); if (agent->device_isa < 0) -return hsa_error ("Unknown GCN agent architecture.", HSA_STATUS_ERROR); +return hsa_error ("Unknown GCN agent architecture", HSA_STATUS_ERROR); + + status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_VENDOR_NAME, + &agent->vendor_name); + if (status != HSA_STATUS_SUCCESS) +return hsa_error ("Error querying the vendor name of the agent", status); status = hsa_fns.hsa_queue_create_fn (agent->id, queue_size, HSA_QUEUE_TYPE_MULTI, @@ -4115,12 +4133,37 @@ GOMP_OFFLOAD_openacc_async_dev2host (int device, void *dst, const void *src, union goacc_property_value GOMP_OFFLOAD_openacc_get_property (int device, enum goacc_property prop) { - /* Stub. Check device and return default value for unsupported properties. */ - /* TODO: Implement this function. */ - ge
Re: [Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221
On 1/28/20 4:30 AM, Joel Hutton wrote: On 28/01/2020 09:07, Eric Botcazou wrote: Ping! Eric, do you have any objections to reverting? See my comment posted in the audit trail of the TN on 01/20... Probably missing live range splitting or somesuch, as envisioned by Vladimir in its approval message. Feel free to eventually revert it. Great. Vladimir, Ok for trunk? Yes. Thank you.
Re: [PATCH] gcc: Add new configure options to allow static libraries to be selected
On 22/01/20 15:39 +, Andrew Burgess wrote: The motivation behind this change is to make it easier for a user to link against static libraries on a target where dynamic libraries are the default library type (for example GNU/Linux). Further, my motivation is really for linking libraries into GDB, however, the binutils-gdb/config/ directory is a copy of gcc/config/ so changes for GDB need to be approved by the GCC project first. After making this change in the gcc/config/ directory I've run autoreconf on all of the configure scripts in the GCC tree and a couple have been updated, so I'll use one of these to describe what my change does. Consider libcpp, this library links against libiconv. Currently if the user builds on a system with both static and dynamic libiconv installed then autotools will pick up the dynamic libiconv by default. This is almost certainly the right thing to do. However, if the user wants to link against static libiconv then things are a little harder, they could remove the dynamic libiconv from their system, but this is probably a bad idea (other things might depend on that library), Typically they would only need to remove the .so symlink used when linking, not the actual libiconv.so.NNN library that other things depend on. If they put the symlink back again, it's only a problem for other builds that might want the symlink, not for any already-linked software using the shared library. Still not ideal though. or the user can build their own version of libiconv, install it into a unique prefix, and then configure gcc using the --with-libiconv-prefix=DIR flag. This works fine, but is somewhat annoying, the static library available, I just can't get autotools to use it. A much simpler solution is: mkdir -p /tmp/iconv/lib ln -s $real_iconv_prefix/include /tmp/iconv/ ln -s $real_iconv_prefix/lib64/libiconv.a /tmp/iconv/lib64 And then configure with --with-libiconv-prefix=/tmp/iconv i.e. there's no need to build or install anything, just use symlinks to create a new path that doesn't have the shared library. It's still a bit more effort than with your new configure option, but it's pretty trivial.
[PATCH] diagnostic_metadata: unbreak xgettext (v2)
On Tue, 2020-01-28 at 15:36 +0100, Jakub Jelinek wrote: > On Tue, Jan 28, 2020 at 09:30:17AM -0500, David Malcolm wrote: > > * diagnostic-core.h (warning_at): Rename overload to... > > (warning_with_metadata_at): ...this. > > I think this is just too long, especially for a function used at > least > in the analyzer pretty often, leading to lots of ugly formatting. > Can't you use warning_meta or something similar instead? > > > (emit_diagnostic_valist): Delete decl of overload taking > > diagnostic_metadata. > > * diagnostic.c (emit_diagnostic_valist): Likewise for defn. > > (warning_at): Rename overload taking diagnostic_metadata to... > > (warning_with_metadata_at): ...this. > > Jakub Fair enough, thanks. Bootstrap in progress of v2: Changed in v2: - rename from warning_with_metadata_at to warning_meta - fix test plugins While C++ can have overloads, xgettext can't deal with overloads that have different argument positions, leading to two failures in "make gcc.pot": emit_diagnostic_valist used incompatibly as both --keyword=emit_diagnostic_valist:4 --flag=emit_diagnostic_valist:4:gcc-internal-format and --keyword=emit_diagnostic_valist:5 --flag=emit_diagnostic_valist:5:gcc-internal-format warning_at used incompatibly as both --keyword=warning_at:3 --flag=warning_at:3:gcc-internal-format and --keyword=warning_at:4 --flag=warning_at:4:gcc-internal-format The emit_diagnostic_valist overload isn't used anywhere (I think it's a leftover from an earlier iteration of the analyzer patch kit). The warning_at overload is used throughout the analyzer but nowhere else. Ideally I'd like to consolidate this argument with something constructable in various ways: - from a metadata and an int or - from an int (or, better an "enum opt_code"), so that the overload happens when implicitly choosing the ctor, but that feels like stage 1 material. In the meantime, fix xgettext by deleting the unused overload and renaming the used one. gcc/analyzer/ChangeLog: * region-model.cc (poisoned_value_diagnostic::emit): Update for renaming of warning_at overload to warning_meta. * sm-file.cc (file_leak::emit): Likewise. * sm-malloc.cc (double_free::emit): Likewise. (possible_null_deref::emit): Likewise. (possible_null_arg::emit): Likewise. (null_deref::emit): Likewise. (null_arg::emit): Likewise. (use_after_free::emit): Likewise. (malloc_leak::emit): Likewise. (free_of_non_heap::emit): Likewise. * sm-sensitive.cc (exposure_through_output_file::emit): Likewise. * sm-signal.cc (signal_unsafe_call::emit): Likewise. * sm-taint.cc (tainted_array_index::emit): Likewise. gcc/ChangeLog: * diagnostic-core.h (warning_at): Rename overload to... (warning_meta): ...this. (emit_diagnostic_valist): Delete decl of overload taking diagnostic_metadata. * diagnostic.c (emit_diagnostic_valist): Likewise for defn. (warning_at): Rename overload taking diagnostic_metadata to... (warning_meta): ...this. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for renaming of warning_at overload to warning_meta. * gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise. --- gcc/analyzer/region-model.cc | 19 --- gcc/analyzer/sm-file.cc | 6 +-- gcc/analyzer/sm-malloc.cc | 49 ++- gcc/analyzer/sm-sensitive.cc | 7 +-- gcc/analyzer/sm-signal.cc | 8 +-- gcc/analyzer/sm-taint.cc | 24 - gcc/diagnostic-core.h | 9 ++-- gcc/diagnostic.c | 16 ++ .../plugin/diagnostic_plugin_test_metadata.c | 4 +- .../plugin/diagnostic_plugin_test_paths.c | 13 ++--- 10 files changed, 73 insertions(+), 82 deletions(-) diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 62c96a6ceea..acaadcf9d3b 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -3827,27 +3827,26 @@ public: { diagnostic_metadata m; m.add_cwe (457); /* "CWE-457: Use of Uninitialized Variable". */ - return warning_at (rich_loc, m, -OPT_Wanalyzer_use_of_uninitialized_value, -"use of uninitialized value %qE", -m_expr); + return warning_meta (rich_loc, m, + OPT_Wanalyzer_use_of_uninitialized_value, + "use of uninitialized value %qE", + m_expr); } break; case POISON_KIND_FREED: { diagnostic_metadata m; m.add_cwe (416); /* "CWE-416: Use After Free". */ - return warning_at (rich_loc, m, -OPT_Wanalyzer
Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX
On Tue, Jan 28, 2020 at 6:45 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu wrote: > > > > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak wrote: > > > > > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu wrote: > > > > > > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > > > > > > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > > > > > > > > > movaps/movups is one byte shorter than movdaq/movdqu. But it isn't > > > > > > the > > > > > > case for AVX nor AVX512. We should disable > > > > > > TARGET_SSE_TYPELESS_STORES > > > > > > for TARGET_AVX. > > > > > > > > > > > > gcc/ > > > > > > > > > > > > PR target/91461 > > > > > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable > > > > > > for > > > > > > TARGET_AVX. > > > > > > * config/i386/i386.md (*movoi_internal_avx): Remove > > > > > > TARGET_SSE_TYPELESS_STORES check. > > > > > > > > > > > > gcc/testsuite/ > > > > > > > > > > > > PR target/91461 > > > > > > * gcc.target/i386/pr91461-1.c: New test. > > > > > > * gcc.target/i386/pr91461-2.c: Likewise. > > > > > > * gcc.target/i386/pr91461-3.c: Likewise. > > > > > > * gcc.target/i386/pr91461-4.c: Likewise. > > > > > > * gcc.target/i386/pr91461-5.c: Likewise. > > > > > > --- > > > > > > gcc/config/i386/i386.h| 4 +- > > > > > > gcc/config/i386/i386.md | 4 +- > > > > > > gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 > > > > > > gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++ > > > > > > gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 > > > > > > +++ > > > > > > gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++ > > > > > > gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 + > > > > > > 7 files changed, 203 insertions(+), 4 deletions(-) > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c > > > > > > > > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > > > > > index 943e9a5c783..c134b04c5c4 100644 > > > > > > --- a/gcc/config/i386/i386.h > > > > > > +++ b/gcc/config/i386/i386.h > > > > > > @@ -516,8 +516,10 @@ extern unsigned char > > > > > > ix86_tune_features[X86_TUNE_LAST]; > > > > > > #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \ > > > > > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL] > > > > > > #define TARGET_SSE_SPLIT_REGS > > > > > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS] > > > > > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu. But > > > > > > it > > > > > > + isn't the case for AVX nor AVX512. */ > > > > > > #define TARGET_SSE_TYPELESS_STORES \ > > > > > > - ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES] > > > > > > + (!TARGET_AVX && > > > > > > ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]) > > > > > > > > > > This is wrong place to disable the feature. > > > > > > > > Like this? > > > > > > No. > > > > > > There is a mode attribute in i386.md/sse.md for relevant patterns. > > > Please adapt calculation of mode attributes instead. > > > > > > > Like this? > > Still no. > > You could move > > (match_test "TARGET_AVX") > (const_string "TI") > > up to bypass the cases below. > I don't think we can do that. There are 2 cases where we prefer movaps/movups: /* Use packed single precision instructions where posisble. I.e. movups instead of movupd. */ DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, "sse_packed_single_insn_optimal", m_BDVER | m_ZNVER) /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores. */ DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores", m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC) We should always use movaps/movups for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX as m_BDVER | m_ZNVER support AVX. -- H.J.
Re: [PATCH] diagnostic_metadata: unbreak xgettext (v2)
On Tue, Jan 28, 2020 at 10:30:58AM -0500, David Malcolm wrote: > gcc/analyzer/ChangeLog: > * region-model.cc (poisoned_value_diagnostic::emit): Update for > renaming of warning_at overload to warning_meta. > * sm-file.cc (file_leak::emit): Likewise. > * sm-malloc.cc (double_free::emit): Likewise. > (possible_null_deref::emit): Likewise. > (possible_null_arg::emit): Likewise. > (null_deref::emit): Likewise. > (null_arg::emit): Likewise. > (use_after_free::emit): Likewise. > (malloc_leak::emit): Likewise. > (free_of_non_heap::emit): Likewise. > * sm-sensitive.cc (exposure_through_output_file::emit): Likewise. > * sm-signal.cc (signal_unsafe_call::emit): Likewise. > * sm-taint.cc (tainted_array_index::emit): Likewise. > > gcc/ChangeLog: > * diagnostic-core.h (warning_at): Rename overload to... > (warning_meta): ...this. > (emit_diagnostic_valist): Delete decl of overload taking > diagnostic_metadata. > * diagnostic.c (emit_diagnostic_valist): Likewise for defn. > (warning_at): Rename overload taking diagnostic_metadata to... > (warning_meta): ...this. > > gcc/testsuite/ChangeLog: > * gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for > renaming of warning_at overload to warning_meta. > * gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise. LGTM, thanks. Jakub
Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN
On 28/01/2020 14:55, Harwath, Frederik wrote: Hi, this patch adds full support for the OpenACC 2.6 acc_get_property and acc_get_property_string functions to the libgomp GCN plugin. This replaces the existing stub in libgomp/plugin-gcn.c. Andrew: The value returned for acc_property_memory ("size of device memory in bytes" according to the spec) is the HSA_REGION_INFO_SIZE of the agent's data_region. This has been adapted from a previous incomplete implementation that we had on the OG9 branch. Does that sound reasonable? I don't think we can do better than that. I have tested the patch with amdgcn and nvptx offloading. Ok to commit this to the main branch? Best regards, Frederik @@ -544,6 +547,8 @@ struct hsa_context_info int agent_count; /* Array of agent_info structures describing the individual HSA agents. */ struct agent_info *agents; + /* Driver version string. */ + char driver_version_s[30]; }; /* Format of the on-device heap. @@ -1513,6 +1518,15 @@ init_hsa_context (void) GOMP_PLUGIN_error ("Failed to list all HSA runtime agents"); } + uint16_t minor, major; + status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR, &minor); + if (status != HSA_STATUS_SUCCESS) +GOMP_PLUGIN_error ("Failed to obtain HSA runtime minor version"); + status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MAJOR, &major); + if (status != HSA_STATUS_SUCCESS) +GOMP_PLUGIN_error ("Failed to obtain HSA runtime major version"); + sprintf (hsa_context.driver_version_s, "HSA Runtime %d.%d", major, minor); + hsa_context.initialized = true; return true; } If we're going to use a fixed-size buffer then we should use snprintf and emit GCN_WARNING if the return value is greater than "sizeof(driver_version_s)", even though that is unlikely. Do the same in the testcase, but use a bigger buffer so that truncation causes a mismatch and test failure. @@ -75,6 +56,39 @@ void expect_device_properties "but was %s.\n", s); abort (); } +} + +void expect_device_memory +(acc_device_t dev_type, int dev_num, size_t expected_total_memory) +{ + + I realise that an existing function in this testcase uses this layout, but the code style does not normally have the parameter list on the next line, and certainly not in column 1. + +int main() +{ Space before (). Thanks Andrew
Re: [PATCH] analyzer: fix build with gcc 4.4 (PR 93276)
On Tue, 2020-01-28 at 11:13 +0100, Jakub Jelinek wrote: > On Fri, Jan 24, 2020 at 07:53:28PM -0500, David Malcolm wrote: > > This patch fixes various build failures seen with gcc 4.4 > > > > gcc prior to 4.6 complains about: > > > > error: #pragma GCC diagnostic not allowed inside functions > > > > for various uses of PUSH_IGNORE_WFORMAT and POP_IGNORE_WFORMAT. > > This patch makes them a no-op with such compilers. > > I think this is wrong. > All that is really needed is make sure you #include "diagnostic- > core.h" > before including pretty-print.h. By including > diagnostic-core.h first, you do: > #ifndef GCC_DIAG_STYLE > #define GCC_DIAG_STYLE __gcc_tdiag__ > #endif > and then pretty-print.h will do: > #ifdef GCC_DIAG_STYLE > #define GCC_PPDIAG_STYLE GCC_DIAG_STYLE > #else > #define GCC_PPDIAG_STYLE __gcc_diag__ > #endif > If instead pretty-print.h is included first, then it will use > __gcc_diag__ > instead of __gcc_tdiag__ and thus will assume %E/%D etc. can't be > handled. > > I've so far just tested that in stage3 with this patch analyzer > builds > without any -Wformat/-Wformat-extra-args warnings. Aha! Thanks - that's much better. > Ok for trunk if it passes bootstrap/regtest? LGTM. Dave
[PR 93542] Make __has_include a builtin macro
The clever hack of '#define __has_include __has_include' breaks -dD and -fdirectives-only, because that emits definitions. This turns __has_include into a proper builtin macro. Thus it's never emitted via -dD, and because use outside of directive processing is undefined, we can just expand it anywhere. I did try down this path the first time round, but it started looking rat holey. Turns out not so rat holey after all. nathan -- Nathan Sidwell 2020-01-28 Nathan Sidwell PR preprocessor/93452 * internal.h (struct spec_nodes): Drop n__has_include{,_next}. * directives.c (lex_macro_node): Don't check __has_include redef. * expr.c (eval_token): Drop __has_include eval. (parse_has_include): Move to ... * macro.c (builtin_has_include): ... here. (_cpp_builtin_macro_text): Eval __has_include{,_next}. * include/cpplib.h (enum cpp_builtin_type): Add BT_HAS_INCLUDE{,_NEXT}. * init.c (builtin_array): Add them. (cpp_init_builtins): Drop __has_include{,_next} init here ... * pch.c (cpp_read_state): ... and here. * traditional.c (enum ls): Drop has_include states ... (_cpp_scan_out_logical_line): ... and here. gcc/testsuite/c-c++-common/cpp/pr93452-1.c | 10 ++ gcc/testsuite/c-c++-common/cpp/pr93452-2.c | 11 ++ libcpp/ChangeLog | 16 + libcpp/directives.c| 4 +-- libcpp/expr.c | 58 -- libcpp/include/cpplib.h| 4 ++- libcpp/init.c | 13 ++- libcpp/internal.h | 2 -- libcpp/macro.c | 56 + libcpp/pch.c | 2 -- libcpp/traditional.c | 20 +++ 11 files changed, 103 insertions(+), 93 deletions(-) diff --git c/gcc/testsuite/c-c++-common/cpp/pr93452-1.c w/gcc/testsuite/c-c++-common/cpp/pr93452-1.c new file mode 100644 index 000..f0986e49238 --- /dev/null +++ w/gcc/testsuite/c-c++-common/cpp/pr93452-1.c @@ -0,0 +1,10 @@ +/* { dg-do preprocess } */ +/* { dg-additional-options "-fdirectives-only" } */ + +int main () +{ + return 0; +} + +/* A regexp that doesn't match itself! */ +/* { dg-final { scan-file-not pr93452-1.i {_[_]has_include} } } */ diff --git c/gcc/testsuite/c-c++-common/cpp/pr93452-2.c w/gcc/testsuite/c-c++-common/cpp/pr93452-2.c new file mode 100644 index 000..c9ab0e9f763 --- /dev/null +++ w/gcc/testsuite/c-c++-common/cpp/pr93452-2.c @@ -0,0 +1,11 @@ +/* { dg-do preprocess } */ +/* { dg-additional-options "-dD" } */ + +#if __has_include ("who cares" ) +#endif +int main () +{ + return 0; +} + +/* { dg-final { scan-file-not pr93452-2.i {__has_include} } } */ diff --git c/libcpp/directives.c w/libcpp/directives.c index 10735c8c668..bbfdfcd3368 100644 --- c/libcpp/directives.c +++ w/libcpp/directives.c @@ -596,9 +596,7 @@ lex_macro_node (cpp_reader *pfile, bool is_def_or_undef) cpp_hashnode *node = token->val.node.node; if (is_def_or_undef - && (node == pfile->spec_nodes.n_defined - || node == pfile->spec_nodes.n__has_include - || node == pfile->spec_nodes.n__has_include_next)) + && node == pfile->spec_nodes.n_defined) cpp_error (pfile, CPP_DL_ERROR, "\"%s\" cannot be used as a macro name", NODE_NAME (node)); diff --git c/libcpp/expr.c w/libcpp/expr.c index 6c56803e3b0..2ae9be07c1a 100644 --- c/libcpp/expr.c +++ w/libcpp/expr.c @@ -64,8 +64,6 @@ static unsigned int interpret_float_suffix (cpp_reader *, const uchar *, size_t) static unsigned int interpret_int_suffix (cpp_reader *, const uchar *, size_t); static void check_promotion (cpp_reader *, const struct op *); -static cpp_num parse_has_include (cpp_reader *, cpp_hashnode *, include_type); - /* Token type abuse to create unary plus and minus operators. */ #define CPP_UPLUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 1)) #define CPP_UMINUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 2)) @@ -1159,10 +1157,6 @@ eval_token (cpp_reader *pfile, const cpp_token *token, case CPP_NAME: if (token->val.node.node == pfile->spec_nodes.n_defined) return parse_defined (pfile); - else if (token->val.node.node == pfile->spec_nodes.n__has_include) - return parse_has_include (pfile, token->val.node.node, IT_INCLUDE); - else if (token->val.node.node == pfile->spec_nodes.n__has_include_next) - return parse_has_include (pfile, token->val.node.node, IT_INCLUDE_NEXT); else if (CPP_OPTION (pfile, cplusplus) && (token->val.node.node == pfile->spec_nodes.n_true || token->val.node.node == pfile->spec_nodes.n_false)) @@ -2189,55 +2183,3 @@ num_div_op (cpp_reader *pfile, cpp_num lhs, cpp_num rhs, enum cpp_ttype op, return lhs; } -/* Handle meeting "__has_include" in a preprocessor expression. */ -static cpp_num -parse_has_include (cpp_reader *pfile, cpp_hashnode *op, include_type type) -{ - cpp_num result; - - result.unsignedp = false; -
[PATCH] coroutines: Fix ICE on invalid (PR93458).
Hi, Since coroutine-ness is discovered lazily, we encounter the diagnostics during each keyword parse. We were not handling the case where a user code failed to include fundamental information (e.g. the traits) in a graceful manner (nor tolerating that level of fail for subsequent diagnostics). Once we've emitted an error for this level of fail, then we suppress additional copies (otherwise the same thing will be reported for every coroutine keyword seen). tested on x86_64-darwin16, OK for trunk? thanks iain gcc/cp/ChangeLog: 2020-01-28 Iain Sandoe * coroutines.cc (get_coroutine_info): Tolerate fatal errors that might leave the info unset. (find_coro_traits_template_decl): Note we emitted the error and suppress duplicates. (coro_promise_type_found_p): Reorder initialization so that we check for the traits and their usability before allocation of the info table. Check for a suitable return type and emit a diagnostic for here instead of relying on the lookup machinery. This allows the error to have a better location, and means we can suppress multiple copies. (coro_function_valid_p): Re-check for a valid promise (and thus the traits) before proceeding. Tolerate missing info as a fatal error. gcc/testsuite/ChangeLog: 2020-01-28 Iain Sandoe * g++.dg/coroutines/pr93458-1-missing-traits.C: New test. * g++.dg/coroutines/pr93458-2-bad-coro-type.C: New test. --- gcc/cp/coroutines.cc | 76 ++- .../coroutines/pr93458-1-missing-traits.C | 10 +++ .../coroutines/pr93458-2-bad-coro-type.C | 12 +++ 3 files changed, 78 insertions(+), 20 deletions(-) create mode 100644 gcc/testsuite/g++.dg/coroutines/pr93458-1-missing-traits.C create mode 100644 gcc/testsuite/g++.dg/coroutines/pr93458-2-bad-coro-type.C diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc index e8a6a4033f6..ca86c7e6950 100644 --- a/gcc/cp/coroutines.cc +++ b/gcc/cp/coroutines.cc @@ -169,7 +169,8 @@ get_or_insert_coroutine_info (tree fn_decl) coroutine_info * get_coroutine_info (tree fn_decl) { - gcc_checking_assert (coroutine_info_table != NULL); + if (coroutine_info_table == NULL) +return NULL; coroutine_info **slot = coroutine_info_table->find_slot_with_hash (fn_decl, coroutine_info_hasher::hash (fn_decl), NO_INSERT); @@ -255,11 +256,16 @@ static GTY(()) tree void_coro_handle_type; static tree find_coro_traits_template_decl (location_t kw) { + static bool error_emitted = false; tree traits_decl = lookup_qualified_name (std_node, coro_traits_identifier, 0, true); - if (traits_decl == NULL_TREE || traits_decl == error_mark_node) + /* If we are missing fundmental information, such as the traits, then don't + emit this for every keyword in a TU. */ + if (!error_emitted && + (traits_decl == NULL_TREE || traits_decl == error_mark_node)) { error_at (kw, "cannot find % template"); + error_emitted = true; return NULL_TREE; } else @@ -370,34 +376,45 @@ coro_promise_type_found_p (tree fndecl, location_t loc) { gcc_assert (fndecl != NULL_TREE); - /* Save the coroutine data on the side to avoid the overhead on every - function decl. */ - - /* We only need one entry per coroutine in a TU, the assumption here is that - there are typically not 1000s. */ if (!coro_initialized) { - gcc_checking_assert (coroutine_info_table == NULL); - /* A table to hold the state, per coroutine decl. */ - coroutine_info_table = - hash_table::create_ggc (11); - /* Set up the identifiers we will use. */ - gcc_checking_assert (coro_traits_identifier == NULL); + /* Trees we only need to create once. +Set up the identifiers we will use. */ coro_init_identifiers (); - /* Trees we only need to create once. */ + /* Coroutine traits template. */ coro_traits_templ = find_coro_traits_template_decl (loc); - gcc_checking_assert (coro_traits_templ != NULL); + if (coro_traits_templ == NULL_TREE + || coro_traits_templ == error_mark_node) + return false; + /* coroutine_handle<> template. */ coro_handle_templ = find_coro_handle_template_decl (loc); - gcc_checking_assert (coro_handle_templ != NULL); + if (coro_handle_templ == NULL_TREE + || coro_handle_templ == error_mark_node) + return false; + /* We can also instantiate the void coroutine_handle<> */ void_coro_handle_type = instantiate_coro_handle_for_promise_type (loc, NULL_TREE); - gcc_checking_assert (void_coro_handle_type != NULL); + if (void_coro_handle_type == NULL_TREE + || void_coro_handle_type == error_mark_node) + return false; + + /* A table to hold the state, per coroutine decl. */ + gcc_checking_assert (coroutine_info_table
Re: [committed, amdgcn] Fix ICE on unsupported FP comparison
On 24/01/2020 14:58, Andrew Stubbs wrote: I've committed this patch to fix an ICE building the gcc.dg/vect/fast-math-pr55281.c testcase. Oops, I got that crossed. This was the fix for gcc.dg/pr50310-2.c. The fast-math-pr55281.c fix will be posted shortly. The problem was that the combine pass was trying to use the "unle" and "ungt" FP comparison operators. There's no hardware support for these, so the operators should have been rejected, but the predicates were too loose. Andrew
Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names
On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote: > On 1/28/20 12:41 AM, Andrew Benson wrote: > > The problem occurs in set_syms_host_assoc() where the "parent1" and > > "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This > > is insufficient when the parent names are a module+submodule name > > concatenated with a ".". The patch above fixes this by increasing their > > length to 2*GFC_MAX_SYMBOL_LEN+2. > > > > A patch to fix this is attached. The patch regression tests cleanly - ok > > to > > commit? > > The patch is okay, but can you add a comment – similar to the other > patch – which makes clear why one needs twice the normal > > GFC_MAX_SYMBOL_LEN (+2)? You currently have: > > const char dot[2] = "."; > > > > - char parent1[GFC_MAX_SYMBOL_LEN + 1]; > > - char parent2[GFC_MAX_SYMBOL_LEN + 1]; > > + char parent1[2 * GFC_MAX_SYMBOL_LEN + 2]; > > + char parent2[2 * GFC_MAX_SYMBOL_LEN + 2]; Yes - I've added a comment here explaining the naming convention. An updated patch is attached. -Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: https://github.com/galacticusorg/galacticus 2020-01-28 Andrew Benson PR fortran/93473 * parse.c: Increase length of char variables to allow them to hold a concatenated module + submodule name. * gfortran.dg/pr93473.f90: New test. diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 4bff0c8..f71a95d 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -6045,8 +6045,9 @@ set_syms_host_assoc (gfc_symbol *sym) { gfc_component *c; const char dot[2] = "."; - char parent1[GFC_MAX_SYMBOL_LEN + 1]; - char parent2[GFC_MAX_SYMBOL_LEN + 1]; + /* Symbols take the form module.submodule_ or module.name_. */ + char parent1[2 * GFC_MAX_SYMBOL_LEN + 2]; + char parent2[2 * GFC_MAX_SYMBOL_LEN + 2]; if (sym == NULL) return; diff --git a/gcc/testsuite/gfortran.dg/pr93473.f90 b/gcc/testsuite/gfortran.dg/pr93473.f90 new file mode 100644 index 000..dda8525 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr93473.f90 @@ -0,0 +1,28 @@ +! { dg-do compile } +! { dg-options "-ffree-line-length-none" } +! PR fortran/93473 +module aModestlyLongModuleName + + type :: aTypeWithASignificantlyLongNameButStillAllowedOK + end type aTypeWithASignificantlyLongNameButStillAllowedOK + + interface + module function aFunctionWithALongButStillAllowedName(parameters) result(self) + type(aTypeWithASignificantlyLongNameButStillAllowedOK) :: self + end function aFunctionWithALongButStillAllowedName + end interface + +end module aModestlyLongModuleName + +submodule (aModestlyLongModuleName) aTypeWithASignificantlyLongNameButStillAllowedOK_ + +contains + + module procedure aFunctionWithALongButStillAllowedName + class(*), pointer :: genericObject + end procedure aFunctionWithALongButStillAllowedName + +end submodule aTypeWithASignificantlyLongNameButStillAllowedOK_ + +submodule (aModestlyLongModuleName:aTypeWithASignificantlyLongNameButStillAllowedOK_) aSubmoduleWithASignificantlyLongButStillAllowedName__ +end submodule aSubmoduleWithASignificantlyLongButStillAllowedName__
Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule
Hi Tobias, > > The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to > > GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus > > function name (plus the additional "_"'s that get prepended), but > > insufficient if a submodule name is included. The name then gets > > truncated and can lead to two different functions having the same > > (truncated) symbol name. > > > > The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which > > allows for the submodule name plus the "." added between module and > > submodule names. > > > > I've attached a patch for this which includes a new test case for this PR. > > The patch regression tests cleanly. > > > > OK to commit? > > Can you also update the comment before the #define? It currently has: > > /* Mangled symbols take the form __module__name. */ > -#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*2+4) > +#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*3+5) Thanks for the suggestion. An updated patch is attached. > PS: I wonder whether there are relevant systems which will fail because they > do not handle that long symbol names... Good question. Should I hold off on committing the patch until this can be tested further? Or should I just go ahead and commit and deal with any such problems if they show up? Also, Richard Biener raised the question (in the PR) of whether this patch would be an ABI change. I can see that it probably would be, but don't know for sure. If it would change the ABI is there anything else that I need to include in the patch? Thanks, Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: https://github.com/galacticusorg/galacticus diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index 52bc045..69171f3 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3. If not see #include "predict.h" /* For enum br_predictor and PRED_*. */ -/* Mangled symbols take the form __module__name. */ -#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*2+4) +/* Mangled symbols take the form __module__name or __module.submodule__name. */ +#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*3+5) /* Struct for holding a block of statements. It should be treated as an opaque entity and not modified directly. This allows us to change the diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644 index 000..3bef326 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr93461.f90 @@ -0,0 +1,22 @@ +! { dg-do compile } +! PR fortran/93461 +module aModuleWithAnAllowedName + interface + module subroutine aShortName() + end subroutine aShortName + end interface +end module aModuleWithAnAllowedName + +submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains + subroutine aShortName() +call aSubroutineWithAVeryLongNameThatWillCauseAProblem() +call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() + end subroutine aShortName + + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem() + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem + + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName 2020-01-28 Andrew Benson PR fortran/93461 * trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name, plus the "." between module and submodule names. * gfortran.dg/pr93461.f90: New test.
Re: [GCC][PATCH][AArch32] ACLE intrinsics bfloat16 vmmla and vfma for AArch32 AdvSIMD
Ping. From: Delia Burduv Sent: 22 January 2020 17:26 To: gcc-patches@gcc.gnu.org Cc: ni...@redhat.com ; Richard Earnshaw ; Ramana Radhakrishnan ; Kyrylo Tkachov Subject: Re: [GCC][PATCH][AArch32] ACLE intrinsics bfloat16 vmmla and vfma for AArch32 AdvSIMD Ping. I have read Richard Sandiford's comments on the AArch64 patches and I will apply what is relevant to this patch as well. Particularly, I will change the tests to use the exact input and output registers and I will change the types of the rtl patterns. On 12/20/19 6:44 PM, Delia Burduv wrote: > This patch adds the ARMv8.6 ACLE intrinsics for vmmla, vfmab and vfmat > as part of the BFloat16 extension. > (https://developer.arm.com/docs/101028/latest.) > The intrinsics are declared in arm_neon.h and the RTL patterns are > defined in neon.md. > Two new tests are added to check assembler output and lane indices. > > This patch depends on the Arm back-end patche. > (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html) > > Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have > commit rights, so if this is ok can someone please commit it for me? > > gcc/ChangeLog: > > 2019-11-12 Delia Burduv > > * config/arm/arm_neon.h (vbfmmlaq_f32): New. >(vbfmlalbq_f32): New. >(vbfmlaltq_f32): New. >(vbfmlalbq_lane_f32): New. >(vbfmlaltq_lane_f32): New. > (vbfmlalbq_laneq_f32): New. >(vbfmlaltq_laneq_f32): New. > * config/arm/arm_neon_builtins.def (vbfmmla): New. >(vbfmab): New. >(vbfmat): New. >(vbfmab_lane): New. >(vbfmat_lane): New. >(vbfmab_laneq): New. >(vbfmat_laneq): New. > * config/arm/iterators.md (BF_MA): New int iterator. >(bt): New int attribute. >(VQXBF): Copy of VQX with V8BF. >(V_HALF): Added V8BF. >* config/arm/neon.md (neon_vbfmmlav8hi): New insn. >(neon_vbfmav8hi): New insn. >(neon_vbfma_lanev8hi): New insn. >(neon_vbfma_laneqv8hi): New expand. >(neon_vget_high): Changed iterator to VQXBF. > * config/arm/unspecs.md (UNSPEC_BFMMLA): New UNSPEC. >(UNSPEC_BFMAB): New UNSPEC. >(UNSPEC_BFMAT): New UNSPEC. > > 2019-11-12 Delia Burduv > > * gcc.target/arm/simd/bf16_ma_1.c: New test. > * gcc.target/arm/simd/bf16_ma_2.c: New test. > * gcc.target/arm/simd/bf16_mmla_1.c: New test.
Re: ACLE intrinsics: BFloat16 store (vst{q}_bf16) intrinsics for AArch32
Ping. From: Delia Burduv Sent: 22 January 2020 17:29 To: gcc-patches@gcc.gnu.org Cc: ni...@redhat.com ; Richard Earnshaw ; Kyrylo Tkachov ; Ramana Radhakrishnan Subject: Re: ACLE intrinsics: BFloat16 store (vst{q}_bf16) intrinsics for AArch32 Ping. I will change the tests to use the exact input and output registers as Richard Sandiford suggested for the AArch64 patches. On 12/20/19 6:46 PM, Delia Burduv wrote: > This patch adds the ARMv8.6 ACLE BFloat16 store intrinsics > vst{q}_bf16 as part of the BFloat16 extension. > (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics) > > The intrinsics are declared in arm_neon.h . > A new test is added to check assembler output. > > This patch depends on the Arm back-end patche. > (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html) > > Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have > commit rights, so if this is ok can someone please commit it for me? > > gcc/ChangeLog: > > 2019-11-14 Delia Burduv > > * config/arm/arm_neon.h (bfloat16_t): New typedef. > (bfloat16x4x2_t): New typedef. > (bfloat16x8x2_t): New typedef. > (bfloat16x4x3_t): New typedef. > (bfloat16x8x3_t): New typedef. > (bfloat16x4x4_t): New typedef. > (bfloat16x8x4_t): New typedef. > (vst2_bf16): New. > (vst2q_bf16): New. > (vst3_bf16): New. > (vst3q_bf16): New. > (vst4_bf16): New. > (vst4q_bf16): New. > * config/arm/arm-builtins.c (E_V2BFmode): New mode. > (VAR13): New. > (arm_simd_types[Bfloat16x2_t]):New type. > * config/arm/arm-modes.def (V2BF): New mode. > * config/arm/arm-simd-builtin-types.def > (Bfloat16x2_t): New entry. > * config/arm/arm_neon_builtins.def > (vst2): Changed to VAR13 and added v4bf, v8bf > (vst3): Changed to VAR13 and added v4bf, v8bf > (vst4): Changed to VAR13 and added v4bf, v8bf > * config/arm/iterators.md (VDXBF): New iterator. > (VQ2BF): New iterator. > (V_elem): Added V4BF, V8BF. > (V_sz_elem): Added V4BF, V8BF. > (V_mode_nunits): Added V4BF, V8BF. > (q): Added V4BF, V8BF. > *config/arm/neon.md (vst2): Used new iterators. > (vst3): Used new iterators. > (vst3qa): Used new iterators. > (vst3qb): Used new iterators. > (vst4): Used new iterators. > (vst4qa): Used new iterators. > (vst4qb): Used new iterators. > > > gcc/testsuite/ChangeLog: > > 2019-11-14 Delia Burduv > > * gcc.target/arm/simd/bf16_vstn_1.c: New test.
Re: ACLE intrinsics: BFloat16 load intrinsics for AArch32
Ping. From: Delia Burduv Sent: 22 January 2020 17:31 To: gcc-patches@gcc.gnu.org Cc: ni...@redhat.com ; Richard Earnshaw ; Kyrylo Tkachov ; Ramana Radhakrishnan Subject: Re: ACLE intrinsics: BFloat16 load intrinsics for AArch32 Ping. I will change the tests to use the exact input and output registers as Richard Sandiford suggested for the AArch64 patches. On 12/20/19 6:48 PM, Delia Burduv wrote: > This patch adds the ARMv8.6 ACLE BFloat16 load intrinsics vld{q}_bf16 > as part of the BFloat16 extension. > (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics) > > The intrinsics are declared in arm_neon.h . > A new test is added to check assembler output. > > This patch depends on the Arm back-end patche. > (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html) > > Tested for regression on arm-none-eabi and armeb-none-eabi. I don't have > commit rights, so if this is ok can someone please commit it for me? > > gcc/ChangeLog: > > 2019-11-14 Delia Burduv > > * config/arm/arm_neon.h (bfloat16_t): New typedef. > (bfloat16x4x2_t): New typedef. > (bfloat16x8x2_t): New typedef. > (bfloat16x4x3_t): New typedef. > (bfloat16x8x3_t): New typedef. > (bfloat16x4x4_t): New typedef. > (bfloat16x8x4_t): New typedef. > (vld2_bf16): New. > (vld2q_bf16): New. > (vld3_bf16): New. > (vld3q_bf16): New. > (vld4_bf16): New. > (vld4q_bf16): New. > (vld2_dup_bf16): New. > (vld2q_dup_bf16): New. > (vld3_dup_bf16): New. > (vld3q_dup_bf16): New. > (vld4_dup_bf16): New. > (vld4q_dup_bf16): New. > * config/arm/arm-builtins.c (E_V2BFmode): New mode. > (VAR13): New. > (arm_simd_types[Bfloat16x2_t]):New type. > * config/arm/arm-modes.def (V2BF): New mode. > * config/arm/arm-simd-builtin-types.def > (Bfloat16x2_t): New entry. > * config/arm/arm_neon_builtins.def > (vld2): Changed to VAR13 and added v4bf, v8bf > (vld2_dup): Changed to VAR8 and added v4bf, v8bf > (vld3): Changed to VAR13 and added v4bf, v8bf > (vld3_dup): Changed to VAR8 and added v4bf, v8bf > (vld4): Changed to VAR13 and added v4bf, v8bf > (vld4_dup): Changed to VAR8 and added v4bf, v8bf > * config/arm/iterators.md (VDXBF): New iterator. > (VQ2BF): New iterator. > (V_elem): Added V4BF, V8BF. > (V_sz_elem): Added V4BF, V8BF. > (V_mode_nunits): Added V4BF, V8BF. > (q): Added V4BF, V8BF. > *config/arm/neon.md (vld2): Used new iterators. > (vld2_dup): Used new iterators. > (vld2_dupv8bf): New. > (vst3): Used new iterators. > (vst3qa): Used new iterators. > (vst3qb): Used new iterators. > (vld3_dup): Used new iterators. > (vld3_dupv8bf): New. > (vst4): Used new iterators. > (vst4qa): Used new iterators. > (vst4qb): Used new iterators. > (vld4_dup): Used new iterators. > (vld4_dupv8bf): New. > > > gcc/testsuite/ChangeLog: > > 2019-11-14 Delia Burduv > > * gcc.target/arm/simd/bf16_vldn_1.c: New test.
[PATCH, ivopts] Fix fast-math-pr55281.c ICE
This patch fixes an ICE compiling fast-math-pr55281.c for amdgcn. The problem is that an "iv" is created in which both base and step are pointer types, but the base is converted to sizetype without also converting the step to a non-pointer type. Later in the compilation this results in a PLUS_EXPR with a pointer argument, which is invalid gimple. The patch fixes the problem by ensuring that the step is converted at the same point the base is. This seems like it ought to be correct. If the step is not a pointer type then no conversion occurs. I don't really understand why I only see this issue on amdgcn, but it might be because the pointer in question is in a MASK_LOAD which is perhaps not that commonly used? I've tested this on amdgcn, and done a full bootstrap and test on x86_64 also. OK to commit? Thanks Andrew Fix fast-math-pr55281.c ICE. 2020-01-28 Andrew Stubbs gcc/ * tree-ssa-loop-ivopts.c (add_iv_candidate_for_use): Convert step to basestep similarly to basetype. diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index a21f3077e74..1abeb13bb53 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -3472,7 +3472,7 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use) { poly_uint64 offset; tree base; - tree basetype; + tree basetype, basestep; struct iv *iv = use->iv; add_candidate (data, iv->base, iv->step, false, use); @@ -3482,9 +3482,14 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use) /* Record common candidate with initial value zero. */ basetype = TREE_TYPE (iv->base); + basestep = iv->step; if (POINTER_TYPE_P (basetype)) -basetype = sizetype; - record_common_cand (data, build_int_cst (basetype, 0), iv->step, use); +{ + basetype = sizetype; + if (POINTER_TYPE_P (TREE_TYPE (iv->step))) + basestep = fold_convert (basetype, iv->step); +} + record_common_cand (data, build_int_cst (basetype, 0), basestep, use); /* Compare the cost of an address with an unscaled index with the cost of an address with a scaled index and add candidate if useful. */
Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX
On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > You could move > > > > (match_test "TARGET_AVX") > > (const_string "TI") > > > > up to bypass the cases below. > > > > I don't think we can do that. There are 2 cases where we prefer > movaps/movups: > > /* Use packed single precision instructions where posisble. I.e. > movups instead of movupd. */ > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, > "sse_packed_single_insn_optimal", > m_BDVER | m_ZNVER) > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores. */ > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores", > m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC) > > We should always use movaps/movups for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX > as m_BDVER | m_ZNVER support AVX. The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is only insn size, as advised in e.g. Software Optimization Guide for the AMD Family 15h Processors [1], section 7.1.2, where it is said: --quote-- 7.1.2 Reduce Instruction SizeOptimization Reduce the size of instructions when possible. Rationale Using smaller instruction sizes improves instruction fetch throughput. Specific examples include the following: *In SIMD code, use the single-precision (PS) form of instructions instead of the double-precision (PD) form. For example, for register to register moves, MOVAPS achieves the same result as MOVAPD, but uses one less byte to encode the instruction and has no prefix byte. Other examples in which single-precision forms can be substituted for double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS, and SHUFPS. ... --/quote-- Please note that this optimization applies only to non-AVX forms, as demonstrated by: 0: 0f 28 c8movaps %xmm0,%xmm1 3: 66 0f 28 c8 movapd %xmm0,%xmm1 7: c5 f8 28 d1 vmovaps %xmm1,%xmm2 b: c5 f9 28 d1 vmovapd %xmm1,%xmm2 Also note that MOVDQA is missing in the above optimization. It is harmful to substitute MOVDQA with MOVAPS, as it can (and does) introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT (VALU) FP clusters. Following the above optimization, it is obvious that TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from one pattern to another. Its use should be reviewed and fixed where not appropriate. [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf Uros.
Re: Return slot optimization for stack protector strong
On Mon, Jan 27, 2020 at 06:53:51PM +0100, Jakub Jelinek wrote: > On Mon, Jan 27, 2020 at 06:49:23PM +0100, Stefan Schulze Frielinghaus wrote: > > some function calls trigger the stack-protector-strong although such > > calls are later on implemented via calls to internal functions. > > Consider the following example: > > > > long double > > rintl_wrapper (long double x) > > { > > return rintl (x); > > } > > > > On s390x a return value of type `long double` is passed via a return > > slot. Thus according to function `stack_protect_return_slot_p` a > > function call like `rintl (x)` triggers the stack-protector-strong since > > rintl is not an internal function. However, in a later stage, during > > `expand_call_stmt`, such a call is implemented via a call to an internal > > function. This means in the example, the call `rintl (x)` is expanded > > into an assembler instruction with register operands only. Thus this > > late time decision renders the usage of the stack protector superfluous. > > I doubt your predicate gives any guarantees that the builtin will be > expanded inline rather than a library call. Some builtins might be expanded > inline or as a library call depending on various options, or depending on > particular arguments etc. My predicate is more or less just a copy of what happens in `expand_call_stmt` where we have decl = gimple_call_fndecl (stmt); if (gimple_call_lhs (stmt) && !gimple_has_side_effects (stmt) && (optimize || (decl && called_as_built_in (decl { internal_fn ifn = replacement_internal_fn (stmt); if (ifn != IFN_LAST) { expand_internal_call (ifn, stmt); return; } } There a decision is made whether a call is implemented as a call to an internal function or not. Thus using the very same logic it should be possible to decide at an earlier stage that a call will be implemented as a call to an internal function. Since an internal function has no linkage, it is therefore guaranteed that it will be inlined. Do I miss something? Cheers, Stefan
Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names
LGTM now. Thanks, Tobias On 1/28/20 5:41 PM, Andrew Benson wrote: On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote: On 1/28/20 12:41 AM, Andrew Benson wrote: The problem occurs in set_syms_host_assoc() where the "parent1" and "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This is insufficient when the parent names are a module+submodule name concatenated with a ".". The patch above fixes this by increasing their length to 2*GFC_MAX_SYMBOL_LEN+2. A patch to fix this is attached. The patch regression tests cleanly - ok to commit? The patch is okay, but can you add a comment – similar to the other patch – which makes clear why one needs twice the normal GFC_MAX_SYMBOL_LEN (+2)? You currently have: const char dot[2] = "."; - char parent1[GFC_MAX_SYMBOL_LEN + 1]; - char parent2[GFC_MAX_SYMBOL_LEN + 1]; + char parent1[2 * GFC_MAX_SYMBOL_LEN + 2]; + char parent2[2 * GFC_MAX_SYMBOL_LEN + 2]; Yes - I've added a comment here explaining the naming convention. An updated patch is attached. -Andrew
Re: [PATCH] doc: clarify the situation with pointer arithmetic
On Tue, 28 Jan 2020, Uecker, Martin wrote: > Unfortunately, this is not as simple. It is not trivial to > define the set of objects whose "address could have participated > in the computation" > [...] I am not satisfied with the response, but I'm not sure I should continue arguing here. Alexander
Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule
Hi Andrew, On 1/28/20 5:41 PM, Andrew Benson wrote: Can you also update the comment before the #define? It currently has: Thanks for the suggestion. An updated patch is attached. Thanks. LGTM now. PS: I wonder whether there are relevant systems which will fail because they do not handle that long symbol names... Good question. Should I hold off on committing the patch until this can be tested further? Or should I just go ahead and commit and deal with any such problems if they show up? I think it is fine – as a user, one can always choose a shorter name if the system one is interested in doesn't support it. Besides, following the current scheme, we do not really have a choice without breaking the ABI. Also, Richard Biener raised the question (in the PR) of whether this patch would be an ABI change. I can see that it probably would be, but don't know for sure. If it would change the ABI is there anything else that I need to include in the patch? As just mentioned on the PR, I think it is a small ABI change – before a very long identified got cut off now it is available in full extent. — I do not see any simple way to avoid this ABI change and, frankly, I also do not think any real-world program uses that long identifiers. Namely, instead of using 3 times the length of 42 character ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3 times the length of 63 characters (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_). Already 42 is quite long for a module, submodule or procedure name. And that does work even without the patch. And as only the sum counts, if one part only uses 10 characters (e.g. "my_module5"), now 2*58 characters available two others. If at the end it really causes problems (source code not available), the user still can modify the module file and change the procedure name to the trimmed value and continue. Thus, I do not think it is a real problem in practice. We could be nice, however, and add a note to the release notes (i.e. https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git about how to change WWW files and CC Gerald as web maintainer when submitting wwwdocs patches.] Thanks, Tobias patch.diff diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index 52bc045..69171f3 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3. If not see #include "predict.h" /* For enum br_predictor and PRED_*. */ -/* Mangled symbols take the form __module__name. */ -#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*2+4) +/* Mangled symbols take the form __module__name or __module.submodule__name. */ +#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*3+5) /* Struct for holding a block of statements. It should be treated as an opaque entity and not modified directly. This allows us to change the diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644 index 000..3bef326 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr93461.f90 @@ -0,0 +1,22 @@ +! { dg-do compile } +! PR fortran/93461 +module aModuleWithAnAllowedName + interface + module subroutine aShortName() + end subroutine aShortName + end interface +end module aModuleWithAnAllowedName + +submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains + subroutine aShortName() +call aSubroutineWithAVeryLongNameThatWillCauseAProblem() +call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() + end subroutine aShortName + + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem() + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem + + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName ChangeLog 2020-01-28 Andrew Benson PR fortran/93461 * trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name, plus the "." between module and submodule names. * gfortran.dg/pr93461.f90: New test.
[PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > > > You could move > > > > > > (match_test "TARGET_AVX") > > > (const_string "TI") > > > > > > up to bypass the cases below. > > > > > > > I don't think we can do that. There are 2 cases where we prefer > > movaps/movups: > > > > /* Use packed single precision instructions where posisble. I.e. > > movups instead of movupd. */ > > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, > > "sse_packed_single_insn_optimal", > > m_BDVER | m_ZNVER) > > > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores. > > */ > > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores", > > m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC) > > > > We should always use movaps/movups for > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. > > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX > > as m_BDVER | m_ZNVER support AVX. > > The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is > only insn size, as advised in e.g. Software Optimization Guide for the > AMD Family 15h Processors [1], section 7.1.2, where it is said: > > --quote-- > 7.1.2 Reduce Instruction SizeOptimization > > Reduce the size of instructions when possible. > > Rationale > > Using smaller instruction sizes improves instruction fetch throughput. > Specific examples include the following: > > *In SIMD code, use the single-precision (PS) form of instructions > instead of the double-precision (PD) form. For example, for register > to register moves, MOVAPS achieves the same result as MOVAPD, but uses > one less byte to encode the instruction and has no prefix byte. Other > examples in which single-precision forms can be substituted for > double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS, > and SHUFPS. > ... > --/quote-- > > Please note that this optimization applies only to non-AVX forms, as > demonstrated by: > >0: 0f 28 c8movaps %xmm0,%xmm1 >3: 66 0f 28 c8 movapd %xmm0,%xmm1 >7: c5 f8 28 d1 vmovaps %xmm1,%xmm2 >b: c5 f9 28 d1 vmovapd %xmm1,%xmm2 > > Also note that MOVDQA is missing in the above optimization. It is > harmful to substitute MOVDQA with MOVAPS, as it can (and does) > introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT > (VALU) FP clusters. > > Following the above optimization, it is obvious that > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from > one pattern to another. Its use should be reviewed and fixed where not > appropriate. > > [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf > > Uros. Here is the updated patch which moves TARGET_AVX before TARGET_SSE_TYPELESS_STORES. OK for master if there is no regression? Thanks. -- H.J. From cbcf8b23b29588f12e464076dacacd4600d0059b Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 27 Jan 2020 09:35:11 -0800 Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the case for AVX nor AVX512. This patch prefers TARGET_AVX over TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs. gcc/ PR target/91461 * config/i386/i386.md (*movoi_internal_avx): Remove TARGET_SSE_TYPELESS_STORES check. (*movti_internal): Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES. (*movtf_internal): Likewise. * config/i386/sse.md (mov_internal): Likewise. gcc/testsuite/ PR target/91461 * gcc.target/i386/avx256-unaligned-store-2.c: Don't check vmovups. * gcc.target/i386/avx256-unaligned-store-3.c: Likewise. * gcc.target/i386/pieces-memcpy-4.c: Likewise. * gcc.target/i386/pieces-memcpy-5.c: Likewise. * gcc.target/i386/pieces-memcpy-6.c: Likewise. * gcc.target/i386/pieces-strcpy-2.c: Likewise. * gcc.target/i386/pr90980-1.c: Likewise. * gcc.target/i386/pr87317-4.c: Check "\tvmovd\t" instead of "vmovd" to avoid matching "vmovdqu". * gcc.target/i386/pr87317-5.c: Likewise. * gcc.target/i386/pr87317-7.c: Likewise. * gcc.target/i386/pr91461-1.c: New test. * gcc.target/i386/pr91461-2.c: Likewise. * gcc.target/i386/pr91461-3.c: Likewise. * gcc.target/i386/pr91461-4.c: Likewise. * gcc.target/i386/pr91461-5.c: Likewise. Xi --- gcc/config/i386/i386.md | 12 ++- gcc/config/i386/sse.md| 4 +- .../i386/avx256-unaligned-store-2.c | 4 +- .../i386/avx256-unaligned-store-3.c | 4 +- .../gcc.target/i386/pieces-memcpy-4.c | 3 +- .../gcc.target/i386/pieces-memcpy-5.c | 3 +- .../gcc.target/i386/pieces-memcpy-6.c | 3 +- .../gcc.target/i386/pieces-strcpy-2.c | 2 +- gcc/testsuite/gcc.target/i386/pr87317-4.c | 2 +- gcc/testsuite/gcc.target/i386/pr87317-5.c | 2 +- gcc/testsuite/gcc.target/i386/pr87317-7.c | 2 +- gcc/testsuite/gcc.target/i386/pr90980-1.c
Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
On Tue, Jan 28, 2020 at 6:51 PM H.J. Lu wrote: > > On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak wrote: > > > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > > > > > You could move > > > > > > > > (match_test "TARGET_AVX") > > > > (const_string "TI") > > > > > > > > up to bypass the cases below. > > > > > > > > > > I don't think we can do that. There are 2 cases where we prefer > > > movaps/movups: > > > > > > /* Use packed single precision instructions where posisble. I.e. > > > movups instead of movupd. */ > > > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, > > > "sse_packed_single_insn_optimal", > > > m_BDVER | m_ZNVER) > > > > > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores. > > > */ > > > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores", > > > m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC) > > > > > > We should always use movaps/movups for > > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. > > > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with > > > TARGET_AVX > > > as m_BDVER | m_ZNVER support AVX. > > > > The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is > > only insn size, as advised in e.g. Software Optimization Guide for the > > AMD Family 15h Processors [1], section 7.1.2, where it is said: > > > > --quote-- > > 7.1.2 Reduce Instruction SizeOptimization > > > > Reduce the size of instructions when possible. > > > > Rationale > > > > Using smaller instruction sizes improves instruction fetch throughput. > > Specific examples include the following: > > > > *In SIMD code, use the single-precision (PS) form of instructions > > instead of the double-precision (PD) form. For example, for register > > to register moves, MOVAPS achieves the same result as MOVAPD, but uses > > one less byte to encode the instruction and has no prefix byte. Other > > examples in which single-precision forms can be substituted for > > double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS, > > and SHUFPS. > > ... > > --/quote-- > > > > Please note that this optimization applies only to non-AVX forms, as > > demonstrated by: > > > >0: 0f 28 c8movaps %xmm0,%xmm1 > >3: 66 0f 28 c8 movapd %xmm0,%xmm1 > >7: c5 f8 28 d1 vmovaps %xmm1,%xmm2 > >b: c5 f9 28 d1 vmovapd %xmm1,%xmm2 > > > > Also note that MOVDQA is missing in the above optimization. It is > > harmful to substitute MOVDQA with MOVAPS, as it can (and does) > > introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT > > (VALU) FP clusters. > > > > Following the above optimization, it is obvious that > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from > > one pattern to another. Its use should be reviewed and fixed where not > > appropriate. > > > > [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf > > > > Uros. > > Here is the updated patch which moves TARGET_AVX before > TARGET_SSE_TYPELESS_STORES. OK for master if there is > no regression? > > Thanks. + (match_test "TARGET_AVX") + (const_string "") (and (match_test " == 16") Only MODE_SIZE == 16 cases will be left here, since TARGET_AVX is necessary for MODE_SIZE > 16. This test can be removed. OK with the above change. Thanks, Uros. > -- > H.J.
Re: [patch, fortran] PR93473 - ICE on valid with long module + submodule names
Thanks - committed as a83b5cc5828ee34471de415e8893242dd3b0a91b https://gcc.gnu.org/git/gitweb.cgi? p=gcc.git;a=commit;h=a83b5cc5828ee34471de415e8893242dd3b0a91b I'll close the PR. Thanks, Andrew On Tuesday, January 28, 2020 6:32:21 PM PST Tobias Burnus wrote: > LGTM now. > > Thanks, > > Tobias > > On 1/28/20 5:41 PM, Andrew Benson wrote: > > On Tuesday, January 28, 2020 9:27:58 AM PST Tobias Burnus wrote: > >> On 1/28/20 12:41 AM, Andrew Benson wrote: > >>> The problem occurs in set_syms_host_assoc() where the "parent1" and > >>> "parent2" variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This > >>> is insufficient when the parent names are a module+submodule name > >>> concatenated with a ".". The patch above fixes this by increasing their > >>> length to 2*GFC_MAX_SYMBOL_LEN+2. > >>> > >>> A patch to fix this is attached. The patch regression tests cleanly - ok > >>> to > >>> commit? > >> > >> The patch is okay, but can you add a comment – similar to the other > >> patch – which makes clear why one needs twice the normal > >> > >> GFC_MAX_SYMBOL_LEN (+2)? You currently have: > >>> const char dot[2] = "."; > >>> > >>> - char parent1[GFC_MAX_SYMBOL_LEN + 1]; > >>> - char parent2[GFC_MAX_SYMBOL_LEN + 1]; > >>> + char parent1[2 * GFC_MAX_SYMBOL_LEN + 2]; > >>> + char parent2[2 * GFC_MAX_SYMBOL_LEN + 2]; > > > > Yes - I've added a comment here explaining the naming convention. An > > updated patch is attached. > > > > -Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: https://github.com/galacticusorg/galacticus
Re: [patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule
Hi Tobias, Thanks - committed as https://gcc.gnu.org/git/gitweb.cgi? p=gcc.git;a=commit;h=ad690d79cfbb905c5546c9333c5fd089d906505b I'll send a separate email about changes to the release notes. -Andrew On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote: > Hi Andrew, > > On 1/28/20 5:41 PM, Andrew Benson wrote: > >> Can you also update the comment before the #define? It currently has: > > Thanks for the suggestion. An updated patch is attached. > > Thanks. LGTM now. > > >> PS: I wonder whether there are relevant systems which will fail because > >> they do not handle that long symbol names... > > > > Good question. Should I hold off on committing the patch until this can be > > tested further? Or should I just go ahead and commit and deal with any > > such > > problems if they show up? > > I think it is fine – as a user, one can always choose a shorter name if > the system one is interested in doesn't support it. Besides, following > the current scheme, we do not really have a choice without breaking the ABI. > > Also, Richard Biener raised the question (in the PR) of whether this patch > > would be an ABI change. I can see that it probably would be, but don't > > know > > for sure. If it would change the ABI is there anything else that I need to > > include in the patch? > > As just mentioned on the PR, I think it is a small ABI change – before a > very long identified got cut off now it is available in full extent. — I > do not see any simple way to avoid this ABI change and, frankly, I also > do not think any real-world program uses that long identifiers. > > Namely, instead of using 3 times the length of 42 character > ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnop") the standard permits 3 > times the length of 63 characters > (ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890_). > Already 42 is quite long for a module, submodule or procedure name. And > that does work even without the patch. And as only the sum counts, if > one part only uses 10 characters (e.g. "my_module5"), now 2*58 > characters available two others. > > If at the end it really causes problems (source code not available), the > user still can modify the module file and change the procedure name to > the trimmed value and continue. > > Thus, I do not think it is a real problem in practice. We could be nice, > however, and add a note to the release notes (i.e. > https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether > it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git > about how to change WWW files and CC Gerald as web maintainer when > submitting wwwdocs patches.] > > Thanks, > > Tobias > > > patch.diff > > > > diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h > > index 52bc045..69171f3 100644 > > --- a/gcc/fortran/trans.h > > +++ b/gcc/fortran/trans.h > > @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3. If not see > > > > #include "predict.h" /* For enum br_predictor and PRED_*. */ > > > > -/* Mangled symbols take the form __module__name. */ > > -#define GFC_MAX_MANGLED_SYMBOL_LEN (GFC_MAX_SYMBOL_LEN*2+4) > > +/* Mangled symbols take the form __module__name or > > __module.submodule__name. */ +#define GFC_MAX_MANGLED_SYMBOL_LEN > > (GFC_MAX_SYMBOL_LEN*3+5) > > > > /* Struct for holding a block of statements. It should be treated as an > > > > opaque entity and not modified directly. This allows us to change > > the > > > > diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 > > b/gcc/testsuite/gfortran.dg/pr93461.f90 new file mode 100644 > > index 000..3bef326 > > --- /dev/null > > +++ b/gcc/testsuite/gfortran.dg/pr93461.f90 > > @@ -0,0 +1,22 @@ > > +! { dg-do compile } > > +! PR fortran/93461 > > +module aModuleWithAnAllowedName > > + interface > > + module subroutine aShortName() > > + end subroutine aShortName > > + end interface > > +end module aModuleWithAnAllowedName > > + > > +submodule (aModuleWithAnAllowedName) > > aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName +contains > > + subroutine aShortName() > > +call aSubroutineWithAVeryLongNameThatWillCauseAProblem() > > +call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() > > + end subroutine aShortName > > + > > + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem() > > + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem > > + > > + subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso() > > + end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso > > +end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName > > > > ChangeLog > > > > 2020-01-28 Andrew Benson > > > > PR fortran/93461 > > * trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to > > GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name, > > plus the "." between module and submodule names. > > > > * gfortran.dg/pr93461.f90: New test. -- * Andrew Benson: http://
Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
On Tue, Jan 28, 2020 at 10:04 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 6:51 PM H.J. Lu wrote: > > > > On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak wrote: > > > > > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > > > > > > > You could move > > > > > > > > > > (match_test "TARGET_AVX") > > > > > (const_string "TI") > > > > > > > > > > up to bypass the cases below. > > > > > > > > > > > > > I don't think we can do that. There are 2 cases where we prefer > > > > movaps/movups: > > > > > > > > /* Use packed single precision instructions where posisble. I.e. > > > > movups instead of movupd. */ > > > > DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, > > > > "sse_packed_single_insn_optimal", > > > > m_BDVER | m_ZNVER) > > > > > > > > /* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit > > > > stores. */ > > > > DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores", > > > > m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC) > > > > > > > > We should always use movaps/movups for > > > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. > > > > It is wrong to bypass TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with > > > > TARGET_AVX > > > > as m_BDVER | m_ZNVER support AVX. > > > > > > The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD target is > > > only insn size, as advised in e.g. Software Optimization Guide for the > > > AMD Family 15h Processors [1], section 7.1.2, where it is said: > > > > > > --quote-- > > > 7.1.2 Reduce Instruction SizeOptimization > > > > > > Reduce the size of instructions when possible. > > > > > > Rationale > > > > > > Using smaller instruction sizes improves instruction fetch throughput. > > > Specific examples include the following: > > > > > > *In SIMD code, use the single-precision (PS) form of instructions > > > instead of the double-precision (PD) form. For example, for register > > > to register moves, MOVAPS achieves the same result as MOVAPD, but uses > > > one less byte to encode the instruction and has no prefix byte. Other > > > examples in which single-precision forms can be substituted for > > > double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS, > > > and SHUFPS. > > > ... > > > --/quote-- > > > > > > Please note that this optimization applies only to non-AVX forms, as > > > demonstrated by: > > > > > >0: 0f 28 c8movaps %xmm0,%xmm1 > > >3: 66 0f 28 c8 movapd %xmm0,%xmm1 > > >7: c5 f8 28 d1 vmovaps %xmm1,%xmm2 > > >b: c5 f9 28 d1 vmovapd %xmm1,%xmm2 > > > > > > Also note that MOVDQA is missing in the above optimization. It is > > > harmful to substitute MOVDQA with MOVAPS, as it can (and does) > > > introduce +1 cycle forwarding penalty between FLT (FPA/FPM) and INT > > > (VALU) FP clusters. > > > > > > Following the above optimization, it is obvious that > > > TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling was cargo-culted from > > > one pattern to another. Its use should be reviewed and fixed where not > > > appropriate. > > > > > > [1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf > > > > > > Uros. > > > > Here is the updated patch which moves TARGET_AVX before > > TARGET_SSE_TYPELESS_STORES. OK for master if there is > > no regression? > > > > Thanks. > > > + (match_test "TARGET_AVX") > + (const_string "") > (and (match_test " == 16") > > Only MODE_SIZE == 16 cases will be left here, since TARGET_AVX is > necessary for MODE_SIZE > 16. This test can be removed. > > OK with the above change. > This is the patch I am going to check in. Thanks. -- H.J. From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 27 Jan 2020 09:35:11 -0800 Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the case for AVX nor AVX512. This patch prefers TARGET_AVX over TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs. gcc/ PR target/91461 * config/i386/i386.md (*movoi_internal_avx): Remove TARGET_SSE_TYPELESS_STORES check. (*movti_internal): Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES. (*movtf_internal): Likewise. * config/i386/sse.md (mov_internal): Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES. Remove " == 16" check from TARGET_SSE_TYPELESS_STORES. gcc/testsuite/ PR target/91461 * gcc.target/i386/avx256-unaligned-store-2.c: Don't check vmovups. * gcc.target/i386/avx256-unaligned-store-3.c: Likewise. * gcc.target/i386/pieces-memcpy-4.c: Likewise. * gcc.target/i386/pieces-memcpy-5.c: Likewise. * gcc.target/i386/pieces-memcpy-6.c: Likewise. * gcc.target/i386/pieces-strcpy-2.c: Likewise. * gcc.target/i386/pr90980-1.c: Likewise. * gcc.target/i386/pr87317-4.c: Check "\tvmovd\t" instead of "vmovd" to avoid matching "vmovdqu". * gcc.target/i386/pr87317-5.c: Likewise. * gcc.target/i386/pr87317-7.c: Likewise. * gcc.target
Re: [PATCH] diagnostic_metadata: unbreak xgettext (v2)
On Tue, 2020-01-28 at 16:34 +0100, Jakub Jelinek wrote: > On Tue, Jan 28, 2020 at 10:30:58AM -0500, David Malcolm wrote: > > gcc/analyzer/ChangeLog: > > * region-model.cc (poisoned_value_diagnostic::emit): Update for > > renaming of warning_at overload to warning_meta. > > * sm-file.cc (file_leak::emit): Likewise. > > * sm-malloc.cc (double_free::emit): Likewise. > > (possible_null_deref::emit): Likewise. > > (possible_null_arg::emit): Likewise. > > (null_deref::emit): Likewise. > > (null_arg::emit): Likewise. > > (use_after_free::emit): Likewise. > > (malloc_leak::emit): Likewise. > > (free_of_non_heap::emit): Likewise. > > * sm-sensitive.cc (exposure_through_output_file::emit): > > Likewise. > > * sm-signal.cc (signal_unsafe_call::emit): Likewise. > > * sm-taint.cc (tainted_array_index::emit): Likewise. > > > > gcc/ChangeLog: > > * diagnostic-core.h (warning_at): Rename overload to... > > (warning_meta): ...this. > > (emit_diagnostic_valist): Delete decl of overload taking > > diagnostic_metadata. > > * diagnostic.c (emit_diagnostic_valist): Likewise for defn. > > (warning_at): Rename overload taking diagnostic_metadata to... > > (warning_meta): ...this. > > > > gcc/testsuite/ChangeLog: > > * gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for > > renaming of warning_at overload to warning_meta. > > * gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise. > > LGTM, thanks Thanks. B&R succeeded, so pushed to master (6c8e584430bc5dc01b4438f3c38a2a10fcba7685).
Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
On Tue, Jan 28, 2020 at 10:20:36AM -0800, H.J. Lu wrote: > From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" > Date: Mon, 27 Jan 2020 09:35:11 -0800 > Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES > > movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the > case for AVX nor AVX512. This patch prefers TARGET_AVX over > TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs. If you haven't committed yet, please fix the movdaq typo in the description (to movdqa). Jakub
Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES
On Tue, Jan 28, 2020 at 10:58 AM Jakub Jelinek wrote: > > On Tue, Jan 28, 2020 at 10:20:36AM -0800, H.J. Lu wrote: > > From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001 > > From: "H.J. Lu" > > Date: Mon, 27 Jan 2020 09:35:11 -0800 > > Subject: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES > > > > movaps/movups is one byte shorter than movdaq/movdqu. But it isn't the > > case for AVX nor AVX512. This patch prefers TARGET_AVX over > > TARGET_SSE_TYPELESS_STORES and adjust vmovups checks in assembly ouputs. > > If you haven't committed yet, please fix the movdaq typo in the description > (to movdqa). > Will do. Thanks. -- H.J.
ipa: fix handling of multiple speculations (PR93318)
Hi, This patch started as work to resole Richard's comment on quadratic lookups in resolve_speculation. While doing it I however noticed multiple problems in the new speuclative call code which made the patch quite big. In particular: 1) Before applying speculation we consider only targets with at lest probability 1/2. If profile is sane at most two targets can have probability greater or equal to 1/2. So the new multi-target speculation code got enabled only in very special scenario when there ae precisely two target with precise probability 1/2 (which is tested by the single testcase). As a conseuqence the multiple target logic got minimal test coverage and this made us to miss several ICEs. 2) Profile updating in profile merging, tree-inline and indirect call expansion was wrong which led to inconsistent profiles (as already seen on the testcase). 3) Code responsible to turn speculative call to direct call was broken for anything with more than one target. 4) There were multiple cases where call_site_hash went out of sync which eventually leads to an ICE.. 5) Some code expects that all speculative call targets forms a sequence in the callee linked list but there is no code to maintain that invariant nor a verifier. Fixing this it became obvious that the current API of speculative_call_info is not useful because it really builds on fact tht there are precisely three components (direct call, ref and indirect call) in every speculative call sequence. I ended up replacing it with iterator API for direct call (first_speculative_call_target, next_speculative_call_target) and accessors for the other coponents updating comment in cgraph.h. Finally I made the work with call site hash more effetive by updating edge manipulation to keep them in sequence. So first one can be looked up from the hash and then they can be iterated by callee. There are other things that can be improved (for example the speculation should start with most common target first), but I will try to keep that for next stage1. This patch is mostly about getting rid of ICE and profile corruption which is a regression from GCC 9. lto-bootstrapped/regtested x86_64-linux and also tested with Firefox and hack for less conservative indirect call profile merging. Honza gcc/ChangeLog: 2020-01-28 Jan Hubicka PR lto/93318 * cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site hash only when edge is first within the sequence. (cgraph_edge::set_call_stmt): Update handling of speculative calls. (symbol_table::create_edge): Do not set target_prob. (cgraph_edge::remove_caller): Watch for speculative calls when updating the call site hash. (cgraph_edge::make_speculative): Drop target_prob parameter. (cgraph_edge::speculative_call_info): Remove. (cgraph_edge::first_speculative_call_target): New member function. (update_call_stmt_hash_for_removing_direct_edge): New function. (cgraph_edge::resolve_speculation): Rewrite to new API. (cgraph_edge::speculative_call_for_target): New member function. (cgraph_edge::make_direct): Rewrite to new API; fix handling of multiple speculation targets. (cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating of profile. (verify_speculative_call): Verify that targets form an interval. * cgraph.h (cgraph_edge::speculative_call_info): Remove. (cgraph_edge::first_speculative_call_target): New member function. (cgraph_edge::next_speculative_call_target): New member function. (cgraph_edge::speculative_call_target_ref): New member function. (cgraph_edge;:speculative_call_indirect_edge): New member funtion. (cgraph_edge): Remove target_prob. * cgraphclones.c (cgraph_node::set_call_stmt_including_clones): Fix handling of speculative calls. * ipa-devirt.c (ipa_devirt): Fix handling of speculative cals. * ipa-fnsummary.c (analyze_function_body): Likewise. * ipa-inline.c (speculation_useful_p): Use new speculative call API. * ipa-profile.c (dump_histogram): Fix formating. (ipa_profile_generate_summary): Watch for overflows. (ipa_profile): Do not require probablity to be 1/2; update to new API. * ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API. (update_indirect_edges_after_inlining): Update to new API. * ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call profiles. * profile-count.h: (profile_probability::adjusted): New. * tree-inline.c (copy_bb): Update to new speculative call API; fix updating of profile. * value-prof.c (gimple_ic_transform): Rename to ... (dump_ic_profile): ... this one; update dumping. (stream_in_histogram_value): Fix formating. (gimple_value_profile_transf
[COMMITTED] c++: Allow template rvalue-ref conv to bind to lvalue ref.
When I implemented the [over.match.ref] rule that a reference conversion function needs to match l/rvalue of the target reference type it changed our handling of this testcase. It seems to me that our current behavior is what the standard says, but it doesn't seem desirable, and all the other compilers have our old behavior. So let's limit the change to non-templates until there's some clarification from the committee. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/90546 * call.c (build_user_type_conversion_1): Allow a template conversion returning an rvalue reference to bind directly to an lvalue. --- gcc/cp/call.c | 2 ++ gcc/testsuite/g++.dg/cpp0x/rv-conv3.C | 15 +++ 2 files changed, 17 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp0x/rv-conv3.C diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 009cb85ad60..fde29f8f631 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -4115,6 +4115,8 @@ build_user_type_conversion_1 (tree totype, tree expr, int flags, EXPR_LOCATION (expr)); } else if (TYPE_REF_P (totype) && !ics->rvaluedness_matches_p + /* Limit this to non-templates for now (PR90546). */ + && !cand->template_decl && TREE_CODE (TREE_TYPE (totype)) != FUNCTION_TYPE) { /* If we are called to convert to a reference type, we are trying diff --git a/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C b/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C new file mode 100644 index 000..5f727fcc0d5 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/rv-conv3.C @@ -0,0 +1,15 @@ +// PR c++/90546 +// { dg-do link { target c++11 } } + +struct Foo { }; +void test(const Foo&) {} +Foo f; +struct Bar { + template operator T&&(); +}; +template<> Bar::operator const Foo&&() { +return static_cast(f); +} +int main() { + test(Bar{}); +} base-commit: 0968f7da26221bfc9203cd557c8636c0a0f24018 -- 2.18.1
profile: fix profile count dumping
Hi, this patch fixes dump of quality. Comitted as obvious. * profile-count.h (profile_quality_display_names): Fix ordering. diff --git a/gcc/profile-count.c b/gcc/profile-count.c index 0c792297826..c89914ff8a0 100644 --- a/gcc/profile-count.c +++ b/gcc/profile-count.c @@ -78,9 +78,9 @@ const char *profile_quality_display_names[] = "estimated locally", "estimated locally, globally 0", "estimated locally, globally 0 adjusted", - "adjusted", - "auto FDO", "guessed", + "auto FDO", + "adjusted", "precise" };
diagnostics: make error message lowercase.
Hi, I comitted this ias obvious. I also wonder if the next error should be info or should be part of the first error message. * coverage.c (read_counts_file): Make error message lowercase. diff --git a/gcc/coverage.c b/gcc/coverage.c index 5e961b26f66..f29ff640c43 100644 --- a/gcc/coverage.c +++ b/gcc/coverage.c @@ -265,7 +265,7 @@ read_counts_file (void) else if (entry->lineno_checksum != lineno_checksum || entry->cfg_checksum != cfg_checksum) { - error ("Profile data for function %u is corrupted", fn_ident); + error ("profile data for function %u is corrupted", fn_ident); error ("checksum is (%x,%x) instead of (%x,%x)", entry->lineno_checksum, entry->cfg_checksum, lineno_checksum, cfg_checksum);
Verify sanity of indirect call/topn profiles
Hi, I will try to reming this next stage1 since it is not regression fix. I found it useful to have bit of sanity checking of the topn profiles to work out the bugs in merging and updating that was there. Honza gcc/ChangeLog: 2020-01-28 Jan Hubicka * profile.c (compute_value_histograms): Verify profile sanity. diff --git a/gcc/profile.c b/gcc/profile.c index cd754c4c66a..782534e5ab4 100644 --- a/gcc/profile.c +++ b/gcc/profile.c @@ -856,10 +856,10 @@ compute_value_histograms (histogram_values values, unsigned cfg_checksum, gimple_add_histogram_value (cfun, stmt, hist); hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters); for (j = 0; j < hist->n_counters; j++) -if (aact_count) - hist->hvalue.counters[j] = aact_count[j]; -else - hist->hvalue.counters[j] = 0; + if (aact_count) + hist->hvalue.counters[j] = aact_count[j]; + else + hist->hvalue.counters[j] = 0; if (hist->type == HIST_TYPE_TOPN_VALUES || hist->type == HIST_TYPE_INDIR_CALL) @@ -871,6 +871,26 @@ compute_value_histograms (histogram_values values, unsigned cfg_checksum, = RDIV (hist->hvalue.counters[2 * i + 2], GCOV_TOPN_VALUES); sort_hist_values (hist); + + /* Check profile sanity. */ + if (hist->hvalue.counters[2] != -1) + for (int i = 0; i < GCOV_TOPN_VALUES - 1 && ok; i++) + for (int j = i + 1; j < GCOV_TOPN_VALUES && ok; j++) + if ((hist->hvalue.counters[i * 2 + 1] +== hist->hvalue.counters[j * 2 + 1] +&& hist->hvalue.counters[i * 2 + 2] +&& hist->hvalue.counters[j * 2 + 2]) + || hist->hvalue.counters[i * 2 + 2] < 0) + { + if (hist->type == HIST_TYPE_TOPN_VALUES) + error_at (gimple_location (stmt), + "corrupted profile info:" + " invalid topn profile histogram"); + else + error_at (gimple_location (stmt), + "corrupted profile info:" + " indirect call profile histogram"); + } } /* Time profiler counter is not related to any statement,
Patch to fix PR93272
The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93272 The patch was successfully tested and bootstrapped on x86_64. Unfortunately it is hard to create a test case for the patch. So there is no test for this PR. commit 5c8a1211b9873a1b69ef7b2fddae181535bc3b0a (HEAD -> master, origin/master, origin/HEAD) Author: Vladimir N. Makarov Date: Tue Jan 28 15:43:44 2020 -0500 Fix for PR93272 - LRA: EH reg allocated to hold local variable 2020-01-28 Vladimir Makarov PR rtl-optimization/93272 * ira-lives.c (process_out_of_region_eh_regs): New function. (process_bb_node_lives): Call it. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 326250b9b96..8d60dcf0864 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2020-01-28 Vladimir Makarov + + PR rtl-optimization/93272 + * ira-lives.c (process_out_of_region_eh_regs): New function. + (process_bb_node_lives): Call it. + 2020-01-28 Jan Hubicka * coverage.c (read_counts_file): Make error message lowercase. diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c index 155f825ea8d..f776fd2342f 100644 --- a/gcc/ira-lives.c +++ b/gcc/ira-lives.c @@ -1160,6 +1160,50 @@ non_conflicting_reg_copy_p (rtx_insn *insn) return SET_SRC (set); } +#ifdef EH_RETURN_DATA_REGNO + +/* Add EH return hard registers as conflict hard registers to allocnos + living at end of BB. For most allocnos it is already done in + process_bb_node_lives when we processing input edges but it does + not work when and EH edge is edge out of the current region. This + function covers such out of region edges. */ +static void +process_out_of_region_eh_regs (basic_block bb) +{ + edge e; + edge_iterator ei; + unsigned int i; + bitmap_iterator bi; + bool eh_p = false; + + FOR_EACH_EDGE (e, ei, bb->succs) +if ((e->flags & EDGE_EH) + && IRA_BB_NODE (e->dest)->parent != IRA_BB_NODE (bb)->parent) + eh_p = true; + + if (! eh_p) +return; + + EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), FIRST_PSEUDO_REGISTER, i, bi) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[i]; + for (int n = ALLOCNO_NUM_OBJECTS (a) - 1; n >= 0; n--) + { + ira_object_t obj = ALLOCNO_OBJECT (a, n); + for (int k = 0; ; k++) + { + unsigned int regno = EH_RETURN_DATA_REGNO (k); + if (regno == INVALID_REGNUM) + break; + SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); + SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); + } + } +} +} + +#endif + /* Process insns of the basic block given by its LOOP_TREE_NODE to update allocno live ranges, allocno hard register conflicts, intersected calls, and register pressure info for allocnos for the @@ -1213,6 +1257,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) mark_pseudo_regno_live (j); +#ifdef EH_RETURN_DATA_REGNO + process_out_of_region_eh_regs (bb); +#endif + freq = REG_FREQ_FROM_BB (bb); if (freq == 0) freq = 1;
[PATCH] coroutines: Prevent repeated error messages for missing promise.
While looking at Pr93458, I spotted the following non-fatal, but unhelpful diagnostic output. If the user's coroutine return type omits the mandatory promise type then we will currently restate that error each time we see a coroutine keyword, which doesn't provide any new information. This suppresses all but the first instance in each coroutine. tested on x86_64-darwin16 OK for master? thanks Iain gcc/cp/ChangeLog: 2020-01-28 Iain Sandoe * coroutines.cc (find_promise_type): Delete unused forward declaration. (struct coroutine_info): Add a bool for no promise type error. (coro_promise_type_found_p): Only emit the error for a missing promise once in each affected coroutine. gcc/testsuite/ChangeLog: 2020-01-28 Iain Sandoe * g++.dg/coroutines/coro-missing-promise.C: New test. --- gcc/cp/coroutines.cc | 7 +-- .../g++.dg/coroutines/coro-missing-promise.C | 20 +++ 2 files changed, 25 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc index 89ecea91b9a..a56bbc883ca 100644 --- a/gcc/cp/coroutines.cc +++ b/gcc/cp/coroutines.cc @@ -33,7 +33,6 @@ along with GCC; see the file COPYING3. If not see #include "gcc-rich-location.h" #include "hash-map.h" -static tree find_promise_type (tree); static bool coro_promise_type_found_p (tree, location_t); /* GCC C++ coroutines implementation. @@ -93,6 +92,7 @@ struct GTY((for_user)) coroutine_info function into a coroutine. */ /* Flags to avoid repeated errors for per-function issues. */ bool coro_ret_type_error_emitted; + bool coro_promise_error_emitted; }; struct coroutine_info_hasher : ggc_ptr_hash @@ -447,7 +447,10 @@ coro_promise_type_found_p (tree fndecl, location_t loc) /* If we don't find it, punt on the rest. */ if (coro_info->promise_type == NULL_TREE) { - error_at (loc, "unable to find the promise type for this coroutine"); + if (!coro_info->coro_promise_error_emitted) + error_at (loc, "unable to find the promise type for" + " this coroutine"); + coro_info->coro_promise_error_emitted = true; return false; } diff --git a/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C b/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C new file mode 100644 index 000..3fc21a4e2ca --- /dev/null +++ b/gcc/testsuite/g++.dg/coroutines/coro-missing-promise.C @@ -0,0 +1,20 @@ +// { dg-additional-options "-fsyntax-only -w" } + +#include "coro.h" + +// Diagnose completely missing promise. + +// { dg-error {no type named 'promise_type' in 'struct NoPromiseHere'} "" { target *-*-* } 0 } + +struct NoPromiseHere { + coro::coroutine_handle<> handle; + NoPromiseHere () : handle (nullptr) {} + NoPromiseHere (coro::coroutine_handle<> handle) : handle (handle) {} +}; + +NoPromiseHere +bar () +{ + co_yield 22; // { dg-error {unable to find the promise type for this coroutine} } + co_return 0; +} -- 2.24.1
[committed] analyzer: fix ICE when longjmp isn't marked 'noreturn' (PR 93316)
Comments 11-16 within PR analyzer/93316 discuss an ICE in some setjmp tests seen on AIX and powerpc-darwin9. The issue turned out to be an implicit assumption that longjmp is marked "noreturn". There are two places in engine.cc where the code attempted to locate the longjmp GIMPLE_CALL by finding the final stmt within its supernode, in one place casting it via "as_a ", in the other using it as the location_t of the "rewinding from longjmp..." event. When longjmp isn't marked noreturn, its basic block and hence supernode can have additional stmts after the longjmp; in the setjmp-3.c case this was a GIMPLE_RETURN, leading to a ICE when casting this to a GIMPLE_CALL. This patch fixes the two places in question to use the rewind_info_t::get_longjmp_call member function introduced by 342e14ffa30e9163a1a75e0a4fb21b6883d58dbe, fixing the ICE (and ensuring the correct location_t is used for events). Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Reproduced the bug and verified the fix with a stage 1 build on powerpc-ibm-aix7.2.0.0 (gcc111), where there were no remaining FAILs in analyzer.exp after the patch. Committed to master as r10-6305-g5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13. gcc/analyzer/ChangeLog: PR analyzer/93316 * engine.cc (rewind_info_t::update_model): Get the longjmp call stmt via get_longjmp_call () rather than assuming it is the last stmt in the longjmp's supernode. (rewind_info_t::add_events_to_path): Get the location_t for the rewind_from_longjmp_event via get_longjmp_call () rather than from the supernode's get_end_location (). --- gcc/analyzer/engine.cc | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc index 2bc0aff6a6e..9acec704224 100644 --- a/gcc/analyzer/engine.cc +++ b/gcc/analyzer/engine.cc @@ -1333,21 +1333,13 @@ void rewind_info_t::update_model (region_model *model, const exploded_edge &eedge) { - const exploded_node &src_enode = *eedge.m_src; - const program_point &src_point = src_enode.get_point (); - - const gimple *last_stmt -= src_point.get_supernode ()->get_last_stmt (); - gcc_assert (last_stmt); - const gcall *longjmp_call = as_a (last_stmt); - const program_point &longjmp_point = eedge.m_src->get_point (); const program_point &setjmp_point = eedge.m_dest->get_point (); gcc_assert (longjmp_point.get_stack_depth () >= setjmp_point.get_stack_depth ()); - model->on_longjmp (longjmp_call, + model->on_longjmp (get_longjmp_call (), get_setjmp_call (), setjmp_point.get_stack_depth (), NULL); } @@ -1368,7 +1360,7 @@ rewind_info_t::add_events_to_path (checker_path *emission_path, emission_path->add_event (new rewind_from_longjmp_event - (&eedge, src_point.get_supernode ()->get_end_location (), + (&eedge, get_longjmp_call ()->location, src_point.get_fndecl (), src_stack_depth, this)); emission_path->add_event -- 2.21.0
[COMMITTED] c++: Fix guard variable and attribute weak.
My patch for PR 91476 worked for decls that are implicitly comdat/weak due to C++ linkage rules, but broke variables explicitly marked weak. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/93477 PR c++/91476 * decl2.c (copy_linkage): Do copy DECL_ONE_ONLY and DECL_WEAK. --- gcc/cp/decl2.c| 8 ++-- gcc/testsuite/g++.dg/abi/guard4.C | 11 +++ 2 files changed, 17 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/abi/guard4.C diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index 1ecf0b937d5..98d8e6a6b53 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -3228,8 +3228,12 @@ copy_linkage (tree guard, tree decl) { CP_DECL_THREAD_LOCAL_P (guard) = CP_DECL_THREAD_LOCAL_P (decl); set_decl_tls_model (guard, DECL_TLS_MODEL (decl)); - /* We can't rely on DECL_WEAK (decl) or DECL_ONE_ONLY (decl) here, as -they may not be set until import_export_decl at EOF. */ + if (DECL_ONE_ONLY (decl)) + make_decl_one_only (guard, cxx_comdat_group (guard)); + if (TREE_PUBLIC (decl)) + DECL_WEAK (guard) = DECL_WEAK (decl); + /* Also check vague_linkage_p, as DECL_WEAK and DECL_ONE_ONLY might not +be set until import_export_decl at EOF. */ if (vague_linkage_p (decl)) comdat_linkage (guard); DECL_VISIBILITY (guard) = DECL_VISIBILITY (decl); diff --git a/gcc/testsuite/g++.dg/abi/guard4.C b/gcc/testsuite/g++.dg/abi/guard4.C new file mode 100644 index 000..537c90524f3 --- /dev/null +++ b/gcc/testsuite/g++.dg/abi/guard4.C @@ -0,0 +1,11 @@ +// PR c++/93477 +// { dg-require-weak } + +namespace x { + struct s { +s() {} +static int a; + }; + // { dg-final { scan-assembler {.weak[^\n]*_ZGVN1x1bE} } } + struct s __attribute__((weak)) b = s(); +} base-commit: 5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13 -- 2.18.1
[COMMITTED] c++: Fix return deduction of lambda in discarded stmt.
A return statement in a discarded statement is not used for return type deduction, but we still want to do deduction for a return statement in a lambda in a discarded statement. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/93442 * parser.c (cp_parser_lambda_expression): Clear in_discarded_stmt. --- gcc/cp/parser.c | 6 ++ .../g++.dg/cpp1z/constexpr-if-lambda1.C | 16 2 files changed, 22 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 72037ee7b46..b8327823777 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -10530,6 +10530,10 @@ cp_parser_lambda_expression (cp_parser* parser) parser->implicit_template_scope = 0; parser->auto_is_implicit_function_template_parm_p = false; +/* The body of a lambda in a discarded statement is not discarded. */ +bool discarded = in_discarded_stmt; +in_discarded_stmt = 0; + /* By virtue of defining a local class, a lambda expression has access to the private variables of enclosing classes. */ @@ -10560,6 +10564,8 @@ cp_parser_lambda_expression (cp_parser* parser) finish_struct (type, /*attributes=*/NULL_TREE); +in_discarded_stmt = discarded; + parser->num_template_parameter_lists = saved_num_template_parameter_lists; parser->in_statement = in_statement; parser->in_switch_statement_p = in_switch_statement_p; diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C new file mode 100644 index 000..64c4cd27fe6 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda1.C @@ -0,0 +1,16 @@ +// PR c++/93442 +// { dg-do compile { target c++17 } } + +struct bar { +int foo(){return 0;} +}; + +int foobar() { +if constexpr(true) { +return 0; +} else { +return [](){ +return bar{}; +}().foo(); +} +} base-commit: 5aebfb71763c7c8d0bb96adcd0a5f94de96a2a13 -- 2.18.1
[committed] add test for PR 93437
Done in r10-6309-g4dd27b527c503aa50909fe1eb7d308266b1e103a. Martin
Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa
On Tue, 2019-10-08 at 18:04 +0300, Alexander Monakov wrote: > On Thu, 3 Oct 2019, Jeff Law wrote: > > > You may want to review the 2018 discussion: > > > > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html > > > > THe 2018 discussion was primarily concerned with fixing the CFG > > inaccuracies which would in turn fix 61118. But would not fix 21161. > > Well, PR 61118 was incorrectly triaged to begin with. > > Let me show starting from a simplified testcase: > > void bar(jmp_buf); > void foo(int n) > { > jmp_buf jb; > for (; n; n--) > if (setjmp(jb)) > bar(jb); > } > > Here we warn that "parameter 'n' might be clobbered..." despite that > "obviously" there is no way 'n' might be modified between setjmp and > bar... right? Not necessarily. You have to look at what objects are live. N is clearly going to be live throughout the entire loop and thus it will be subject to setjmp/longjmp warnings. The inaccuracies in our CFG at the setjmp call cause objects which are not actually live across the call to appear to be live. See PR 21161. > > Now suppose 'bar' looks like [ ... ] It doesn't really matter what bar looks like. > > > Still, I can imagine one can argue that the warning in the PR is bogus, as > the loop assigns invariant values to problematic variables: I'm not sure which PR you're referring to (21161, 65041 or some other?) > > void foo(struct ctx *arg) > { > while (arg->cond) { > void *var1 = &arg->field; > void (*var2)(void*) = globalfn; > jmp_buf jb; > if (!setjmp(jb)) > var2(var1); > } > etc(arg); > } > > and since "obviously" &arg->field is invariant, there's no way setjmp will > overwrite 'var1' with a different value, right?.. Except: > > If the while loop is not entered, a longjmp from inside 'etc' would restore > default (uninitialized) values; note the compiler doesn't pay attention to the > lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return > abnormally (this is correct in case of other returns_twice functions such as > vfork). Whether or not we should warn for restoring an uninitialized state I think is somethign we haven't really considered. Depending on how many assignments to the pseudo there are and the exact control flow we may or may not warn. > > > So the bottom line here that setjmps in loops are extra tricky, warnings for > them that superficially appear to be false positives may indicate a real > issue, > and we can do a better job documenting that and recommending safer patterns. I don't think setjmps in loops add any real additional complexity. > > > Now regarding claims that we emit "incorrect" CFG that is causing extra phi > nodes, and that "fixing CFG" would magically eliminate those and help with > false positives. > > While in principle it is perhaps nicer to have IR that more closely models > the abnormal flow, I don't see how it would reduce phi nodes; we'd go from It certainly does move PHI nodes and it can reduce the number of false positives. It's not as effective as it could be because of lameness on the RTL side (again see PR21161). Jeff
Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa
On Thu, 2019-10-17 at 16:14 +0300, Alexander Monakov wrote: > On Tue, 8 Oct 2019, Alexander Monakov wrote: > > [massive snip] > > > So in my opinion our CFG is good enough, the real issues with -Wclobbered > > false > > positives are not due to phi nodes but other effects. > > > > If you agree: what would be the next steps? > > Hello, > > may I ping this discussion? I apologize for letting my cranky mood that day > leak > too badly into the message. No worries at all. Just too much going on. Happy to try and kick things moving. Particularly in the context of 21161 and 61118 which are regressions. jeff
Re: [PATCH] testsuite: XFAIL gcc.dg/torture/pr93133.c for powerpc*-*-* [PR93393]
On Tue, Jan 28, 2020 at 10:43:24AM +, Richard Sandiford wrote: > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. > OK to install? Yes, thank you. Segher > 2020-01-28 Richard Sandiford > > gcc/testsuite/ > PR testsuite/93393 > * gcc.dg/torture/pr93133.c: XFAIL for powerpc*-*-*. > --- > gcc/testsuite/gcc.dg/torture/pr93133.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/gcc.dg/torture/pr93133.c > b/gcc/testsuite/gcc.dg/torture/pr93133.c > index 21eae1eb3dd..19af6b8ee6d 100644 > --- a/gcc/testsuite/gcc.dg/torture/pr93133.c > +++ b/gcc/testsuite/gcc.dg/torture/pr93133.c > @@ -1,4 +1,4 @@ > -/* { dg-do run } */ > +/* { dg-do run { xfail powerpc*-*-* } } */ > /* { dg-add-options ieee } */ > /* { dg-require-effective-target fenv_exceptions } */ > /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */
[PATCH] c++: Fix class NTTP with template arguments [PR92948]
This PR points out an ICE with an alias template and class NTTP, but I found that there are more issues. Trouble arise when we use a (non-type) template parameter as an argument to the template arg list of a template that accepts a class NTTP and a conversion to a class is involved, e.g. struct A { A(int) { } }; template struct B { }; template void fn () { B b; } Normally for such a conversion we create a TARGET_EXPR with an AGGR_INIT_EXPR that calls a __ct_comp with some arguments, but not in this case: when converting X to type A in convert_nontype_argument we are in a template and the template parameter 'X' is value-dependent, and AGGR_INIT_EXPR don't work in templates. So it just creates a TARGET_EXPR that contains "A::A(*this, X)", but with that overload resolution fails: error: no matching function for call to 'A::A(A*, int)' That happens because finish_call_expr for a BASELINK creates a dummy object, so we have 'this' twice. I thought for the value-dependent case we could use IMPLICIT_CONV_EXPR, as in the patch below. Note that I only do this when we convert to a different type than the type of the expr. The point is to avoid the call to a converting ctor. The second issue was an ICE in tsubst_copy: when there's a conversion like the above involved then /* Wrapper to make a C++20 template parameter object const. */ op = tsubst_copy (op, args, complain, in_decl); might not produce a _ZT... VAR_DECL with const type, so the assert gcc_assert (CP_TYPE_CONST_P (TREE_TYPE (op))); fails. Allowing IMPLICIT_CONV_EXPR there probably makes sense. And since convert_nontype_argument uses value_dependent_expression_p a lot, I used a dedicated bool to speed things up. Bootstrapped/regtested on x86_64-linux, ok for trunk and maybe 9? 2020-01-28 Marek Polacek PR c++/92948 - Fix class NTTP with template arguments. * pt.c (convert_nontype_argument): Use IMPLICIT_CONV_EXPR when converting a value-dependent expression to a class type. (tsubst_copy) : Allow IMPLICIT_CONV_EXPR as the result of the tsubst_copy call. * g++.dg/cpp2a/nontype-class28.C: New test. * g++.dg/cpp2a/nontype-class29.C: New test. --- gcc/cp/pt.c | 36 +++ gcc/testsuite/g++.dg/cpp2a/nontype-class28.C | 37 gcc/testsuite/g++.dg/cpp2a/nontype-class29.C | 26 ++ 3 files changed, 85 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class28.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class29.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index e889c800aca..fccf4e95453 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -7074,7 +7074,8 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain) if (non_dep) expr = instantiate_non_dependent_expr_internal (expr, complain); - if (value_dependent_expression_p (expr)) + const bool val_dep_p = value_dependent_expression_p (expr); + if (val_dep_p) expr = canonicalize_expr_argument (expr, complain); /* 14.3.2/5: The null pointer{,-to-member} conversion is applied @@ -7100,10 +7101,17 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain) a conversion function with a value-dependent argument, which could invoke taking the address of a temporary representing the result of the conversion. */ - if (COMPOUND_LITERAL_P (expr) - && CONSTRUCTOR_IS_DEPENDENT (expr) - && MAYBE_CLASS_TYPE_P (expr_type) - && TYPE_HAS_CONVERSION (expr_type)) + if (!same_type_ignoring_top_level_qualifiers_p (type, expr_type) + && ((COMPOUND_LITERAL_P (expr) + && CONSTRUCTOR_IS_DEPENDENT (expr) + && MAYBE_CLASS_TYPE_P (expr_type) + && TYPE_HAS_CONVERSION (expr_type)) + /* Similarly, converting e.g. an integer to a class +involves a constructor call. convert_like would +create a TARGET_EXPR, but in a template we can't +use AGGR_INIT_EXPR, and the TARGET_EXPR would lead +to a bogus error. */ + || (val_dep_p && MAYBE_CLASS_TYPE_P (type { expr = build1 (IMPLICIT_CONV_EXPR, type, expr); IMPLICIT_CONV_EXPR_NONTYPE_ARG (expr) = true; @@ -7189,8 +7197,7 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain) /* Notice that there are constant expressions like '4 % 0' which do not fold into integer constants. */ - if (TREE_CODE (expr) != INTEGER_CST - && !value_dependent_expression_p (expr)) + if (TREE_CODE (expr) != INTEGER_CST && !val_dep_p) { if (complain & tf_error) { @@ -7265,7 +7272,7 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain) Here, we do not
[patch, fortran] PR93486 - ICE on valid with nested submodules and long submodule names
I opened PR93486 for this problem: The following causes an ICE with revision ad690d79cfbb905c5546c9333c5fd089d906505b: module ivs interface l module procedure l_ end interface l contains function l_() end function l_ end module ivs module aModeratleyLongModuleName use ivs interface module subroutine cmo() end subroutine cmo end interface end module aModeratleyLongModuleName submodule (aModeratleyLongModuleName) aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill contains module procedure cmo end procedure cmo end submodule aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill submodule (aModeratleyLongModuleName:aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill) sb end submodule sb submodule (aModeratleyLongModuleName:sb) sc end submodule sc $ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/data001/abenson/Galacticus/Tools_Devel_Install/bin/../ libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-git/configure --prefix=/home/abenson/Galacticus/ Tools_Devel --enable-languages=c,c++,fortran --disable-multilib Thread model: posix Supported LTO compression algorithms: zlib gcc version 10.0.1 20200128 (experimental) (GCC) $ gfortran -c test.F90 -o test.o f951: internal compiler error: Segmentation fault 0xe1bc9f crash_signal ../../gcc-git/gcc/toplev.c:328 0x7f98119c71ef ??? /data001/abenson/Galacticus/Tools/glibc-2.12.1/signal/../sysdeps/unix/ sysv/linux/x86_64/sigaction.c:0 0x84b3f0 gfc_use_modules() ../../gcc-git/gcc/fortran/module.c:7315 0x85acc8 use_modules ../../gcc-git/gcc/fortran/parse.c:114 0x866daa parse_module ../../gcc-git/gcc/fortran/parse.c:6111 0x86733c gfc_parse_file() ../../gcc-git/gcc/fortran/parse.c:6428 0x8b780f gfc_be_parse_file ../../gcc-git/gcc/fortran/f95-lang.c:210 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. The ICE goes away if the module and/or submodule names are made shorter. The problem is caused when loading (generic or operator) interfaces from a module file. The module name can include the submodule name, so a size of 2*GFC_MAX_SYMBOL_LEN+2 is required to read this in. The attached patch fixes this problem and regression tests cleanly *if* the patch for PR87103 is applied (otherwise that problem gets triggered by the new test case). PR87103 has been a long-standing issue - there's a patch that works (either my original ugly fix, or the better fix proposed by Bernhard at https:// gcc.gnu.org/ml/fortran/2018-09/msg00044.html ). It would be very nice to get one of these patches committed. -Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: https://github.com/galacticusorg/galacticus diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c index 4487f65..b6a4e87 100644 --- a/gcc/fortran/module.c +++ b/gcc/fortran/module.c @@ -4568,7 +4568,9 @@ static void load_operator_interfaces (void) { const char *p; - char name[GFC_MAX_SYMBOL_LEN + 1], module[GFC_MAX_SYMBOL_LEN + 1]; + /* "module" must be large enough for the case of submodules in which the name + has the form module.submodule */ + char name[GFC_MAX_SYMBOL_LEN + 1], module[2 * GFC_MAX_SYMBOL_LEN + 2]; gfc_user_op *uop; pointer_info *pi = NULL; int n, i; @@ -4624,7 +4626,9 @@ static void load_generic_interfaces (void) { const char *p; - char name[GFC_MAX_SYMBOL_LEN + 1], module[GFC_MAX_SYMBOL_LEN + 1]; + /* "module" must be large enough for the case of submodules in which the name + has the form module.submodule */ + char name[GFC_MAX_SYMBOL_LEN + 1], module[2 * GFC_MAX_SYMBOL_LEN + 2]; gfc_symbol *sym; gfc_interface *generic = NULL, *gen = NULL; int n, i, renamed; diff --git a/gcc/testsuite/gfortran.dg/pr93486.f90 b/gcc/testsuite/gfortran.dg/pr93486.f90 new file mode 100644 index 000..5037d40 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr93486.f90 @@ -0,0 +1,30 @@ +! { dg-do compile } +! PR fortran/93486 +module ivs + interface l + module procedure l_ + end interface l +contains + function l_() + end function l_ +end module ivs + +module aModeratleyLongModuleName + use ivs + interface + module subroutine cmo() + end subroutine cmo + end interface +end module aModeratleyLongModuleName + +submodule (aModeratleyLongModuleName) aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill +contains + module procedure cmo + end procedure cmo +end submodule aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill + +submodule (aModeratleyLongModuleName:aNameForASubmoduleThatIsVeryLongButWhichIsLegalStill) sb +end submodule sb + +submodule (aModeratleyLongModuleName:sb) sc +end submodule sc 2020-01-28 Andrew Benson PR fortran/93486 * module.c: Increase size o
Re: [patch, fortran, wwwdocs] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule
Hi Tobias, On Tuesday, January 28, 2020 6:49:54 PM PST Tobias Burnus wrote: > Thus, I do not think it is a real problem in practice. We could be nice, > however, and add a note to the release notes (i.e. > https://gcc.gnu.org/gcc-10/changes.html); I not completely sure whether > it is worthwhile but why not. [See https://gcc.gnu.org/about.html#git > about how to change WWW files and CC Gerald as web maintainer when > submitting wwwdocs patches.] I've attached a draft patch to update the release notes about this ABI breakage. I don't know if I've explained it sufficiently clearly though? -Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: https://github.com/galacticusorg/galacticus diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index dcce6b8..5a959a1 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -446,6 +446,13 @@ a work-in-progress. objects with allocatable components. In this case, the optional arguments STAT= and ERRMSG= are currently ignored. + +The handling of module and submodule names has been reworked to allow the +full 63-character length mandated by the standard. Previously symbol names +were truncated if the combined length of module, submodule, and function +name exceeded 126 characters. This change therefore breaks the ABI, but only +for cases where this 126 character limit was exceeded. +
Re: [cris-decc0 8/9] cris: Move trivially from cc0 to reg:CC model, removing most optimizations.
> From: Segher Boessenkool > Date: Mon, 27 Jan 2020 23:52:21 +0100 > Hi! > > On Wed, Jan 22, 2020 at 07:11:27AM +0100, Hans-Peter Nilsson wrote: > > I intend to put back as many as I find use for, of those > > anonymous patterns in a controlled manner, with self-contained > > test-cases proving their usability, rather than symmetry with > > other instructions and similar addressing modes, which guided > > the original introduction. I've entered prX to track code > > performance regressions related to this transition, with focus > > on target-side causes and fixes; besides the function prologue > > special-case, there were some checking presence of the bit-test > > (btstq) instruction. > > That's PR93372 (not X :-) ). (Wow, someone read the log!) Doh! Thanks, will fix at next rebase. (FWIW, I just added a mail-send-hook to at least stop (and ask) me from sending email with that kind of placeholder in the body. :) > Do you have any estimate how much removing cc0 this way costs in > performance (or code size, or any other metric)? Not yet, by any quantifiable metrics, but I plan to do that once I'm happy with compare-elim.c doing its job for the most glaring cases. Still, looking at differences to the cc0 version for libgcc, it might even be an improvement due to better register allocation (fewer saved registers and register moves and the like). To wit, that effect has a good chance of dominating over negative fallout from missed compare-elimination opportunities and other simplifications, just judging from initial observations. Film at 11. brgds, H-P
Re: [committed] Add include hack to fix missing SCNuMAX defines in inttypes.h on hpux11.[01]*
On 1/25/20 9:34 AM, John David Anglin wrote: > +/* > + * Fix missing SCNuMAX defines in inttypes.h > + */ > +fix = { > +hackname = hpux_c99_inttypes4; > +mach = "hppa*-hp-hpux11.[01]*"; > +files = inttypes.h; > +sed = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx64/a\\\n" > + "#define SCNuMAX \t SCNu64\n"; > +sed = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx32/a\\\n" > + "#define SCNuMAX \t SCNu32\n"; > +test_text = "#define SCNxMAX SCNx64\n" > + "#define SCNxMAX SCNx32\n"; > +}; > + "" in the inserted text? OK. It works...
Re: GCC fixincludes bug
On 1/23/20 8:07 AM, Pétur Orri Ragnarsson wrote: > Hi. > > I ran into a problem with fixincludes when building GCC. Your email is > in the README file. > > > When building a crosscompiling GCC 8.3.0 for powerpc the following "fix" > is applied: > > $ diff /opt/pluto-targets/powerpc-rootfs/usr/include/open62541.h > /opt/plutotoolchain/lib/gcc/powerpc-marel-linux-gnu/8.3.0/include-fixed/open62541.h > 0a1,9 > > /* DO NOT EDIT THIS FILE. > > > > It has been auto-edited by fixincludes from: > > > > "/opt/pluto-targets/powerpc-rootfs/usr/include/open62541.h" > > > > This had to be done to correct non-standard usages in the > > original, manufacturer supplied header file. */ > > > 115c124 > < # ifndef __COUNTER__ /* PPC GCC fix */ > --- > > # ifndef __COUNTER__ /* __PPC__ GCC fix */ > > This fix is of course useless since it's editing a string inside a > comment. Unfortunately it's also harmful since it causes this header > file to be included in the compiler distribution in a way that it gets > used before the system header. Then I went and upgraded the open62541 > library which made incompatible changes to its API (from v0.3 to v1.0 if > it matters) causing builds to fail due to the wrong header being used. > > I can apply some hack to fix this locally in our packaging system so > it's not an urgent error, and it would also go away by building the > compiler with the new header installed. But I feel like those are > workarounds that are not very future-proof. > > Is this something you think can/should be fixed? Do you need more info > from me to figure out where and why this substitution happens? It'll be a little while before I can re-pull all of GCC and look at it. There's a substitution rule somewhere that effectively: sed 's/PPC/__PPC__/' somewhere. Add a constraint that prevents the application of that fix when it appears after a "/*" then re-generate the C code that does all the fixing. Cheers - Bruce > (I've sent this mail directly to you only instead of the mailing list > since I don't know if I'm including enough information.) Probably better to include gcc-patches as I have become less reliable on email since I was retired.
[RFA/RFC] Improve DSE/aliasing around __builtin_object_size (PR89689, P2 regression)
Here's two approaches to fix the regression in 89689. Note this only fixes the regression in the originally reported testcase. THe deeper issue Martin raises in C#1 is not addressed. Thus if we go forward with either patch ISTM that we would drop the regression markers, but keep the BZ open. So the key blocks in question as we enter DSE1: > > ;; basic block 2, loop depth 0, maybe hot > ;;prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) > ;;pred: ENTRY (FALLTHRU,EXECUTABLE) > h = l; > h$data_33 = l.data; > if (h$data_33 != &__sb_slop) > goto ; [70.00%] > else > goto ; [30.00%] > ;;succ: 3 [70.0% (guessed)] (TRUE_VALUE,EXECUTABLE) > ;;4 [30.0% (guessed)] (FALSE_VALUE,EXECUTABLE) > > ;; basic block 3, loop depth 0, maybe hot > ;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) > ;;pred: 2 [70.0% (guessed)] (TRUE_VALUE,EXECUTABLE) > *h$data_33 = 0; > ;;succ: 4 [always] (FALLTHRU,EXECUTABLE) > > ;; basic block 4, loop depth 0, maybe hot > ;;prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) > ;;pred: 3 [always] (FALLTHRU,EXECUTABLE) > ;;2 [30.0% (guessed)] (FALSE_VALUE,EXECUTABLE) > _23 = __builtin_object_size (h$data_33, 0); > _25 = __builtin___memcpy_chk (h$data_33, "abcd", 4, _23); > __builtin_puts (h$data_33); > _6 = h$data_33 == &__sb_slop; > _7 = (int) _6; > printf ("%d\n", _7); > _9 = h$data_33 == &buf; > _10 = (int) _9; > printf ("%d\n", _10); > if (h$data_33 != &__sb_slop) > goto ; [70.00%] > else > goto ; [30.00%] > So vrp1 is going to want to thread the jump at the end of bb4 which requires duplicating bb4. One version of bb4 will only be reachable via the false edge from bb2. That in turn exposes a cprop opportunity to replace h$data_33 with &__sbloop and the type of &__sbloop is a char[1] array thus triggering the false positive warning. We can avoid all that by realizing the store in bb3 is dead. That in turn makes the conditional at the end of bb2 pointless as bb3 is empty and thus both arms ultimately transfer control to bb4 without any other side effects meaning we can eliminate that conditional which in turn eliminates the need for jump threading. Again, this only fixes the original testcase and if there were other statements in bb3 it wouldn't work and it won't work for the test in c#1. But the proposed changes are a general improvement. DSE misses this case because ref_maybe_used_by_stmt_p returns true for the virtual operand of the __builtin_object_size call. Opps! There's two easy ways to fix this, I've bootstrapped and regression tested both. First, the most conservative fix and perhaps the most appropriate for gcc-10 -- special casing __builtin_object_size in tree-ssa-dse.c. The second approach is more general and marks __builtin_object_size as const rather than just pure. Jakub and I kicked this around in IRC a bit. We've always marked __builtin_object_size as pure. It was never const. THere is some concern that code motion might cause _b_o_s to get different results, which would be particularly concerning for _b_o_s types #1 and #3 which do use some contextual information. However, ISTM that motion is still going to be limited by the SSA graph, though somewhat less because we wouldn't have a dependency on the virtual operands. So it *should* be safe, but there's some degree of hesitation to make this change at this point in gcc-10's lifecycle. Thoughts? If we go forward obviously I'd add a testcase and ChangeLog entries. Jeff diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c index 374143e5785..e88342f1b68 100644 --- a/gcc/tree-ssa-dse.c +++ b/gcc/tree-ssa-dse.c @@ -789,8 +789,13 @@ dse_classify_store (ao_ref *ref, gimple *stmt, phi_def = use_stmt; } } - /* If the statement is a use the store is not dead. */ - else if (ref_maybe_used_by_stmt_p (use_stmt, ref)) + /* If the statement is a use the store is not dead. Do not + consider calls to __builtin_object_size as a use. For gcc-11 + we plan mark __builtin_object_size as const. */ + else if ((!is_gimple_call (use_stmt) + || !gimple_call_builtin_p (as_a (use_stmt), + BUILT_IN_OBJECT_SIZE)) + && ref_maybe_used_by_stmt_p (use_stmt, ref)) { /* Handle common cases where we can easily build an ao_ref structure for USE_STMT and in doing so we find that the commit f4e5d3f4755b6a5846ac20b53008b90131a8bb7c Author: Jeff Law Date: Tue Jan 28 18:16:09 2020 -0500 Mark _b_o_s as constant diff --git a/gcc/builtins.def b/gcc/builtins.def index 5ab842c34c2..fa8b0641ab1 100644 --- a/gcc/builtins.def +++ b/gcc/builtins.def @@ -972,7 +972,7 @@ DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq") DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq") /* Object size checking builtins. */ -DEF_GCC_BUILTIN (BUILT_IN_OBJECT_