Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)
On Tue, Apr 16, 2019 at 08:40:29PM -0600, Martin Sebor wrote:
> --- gcc/tree.h (revision 270402)
> +++ gcc/tree.h (working copy)
> @@ -3735,9 +3735,9 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>    if (NUM_POLY_INT_COEFFS == 2)
>      {
>        poly_uint64 res = 0;
> -      res.coeffs[0] = 1 << (precision & 0xff);
> +      res.coeffs[0] = (unsigned HOST_WIDE_INT)1 << (precision & 0xff);
>        if (precision & 0x100)
> -        res.coeffs[1] = 1 << (precision & 0xff);
> +        res.coeffs[1] = (unsigned HOST_WIDE_INT)1 << (precision & 0xff);

Instead of (unsigned HOST_WIDE_INT)1 one should use HOST_WIDE_INT_1U macro.

Jakub
Re: [PATCH] backport r257541, r259936, r260294, r260623, r261098, r261333, r268585.
Hi Segher,

On 2019/4/16 PM6:54, Segher Boessenkool wrote:
> Hi Xiong,
>
> Sorry I took so long to review this.
>
> On Thu, Apr 04, 2019 at 02:49:29AM -0500, luo...@linux.ibm.com wrote:
>> These patches are followed changes for r25 on testcases
>> vsx-vector-6*.c. backport them to update file names and fix regressions
>> for GCC7 on power9.
>
> (See e.g. https://gcc.gnu.org/ml/gcc-testresults/2019-04/msg01868.html for
> the failures this patch fixes; the patch is for GCC 7).
>
>> gcc/ChangeLog:
>>
>> 2019-04-03 Xiong Hu Luo
>>
>> backport from trunk r260623.
>>
>> 2018-05-23 Segher Boessenkool
>>
>> * doc/sourcebuild.texi (Endianness): New subsubsection.
>
> We write the changelog like
>
> 2019-04-16 Xiong Hu Luo
>
> Backport from trunk
> 2018-05-23 Segher Boessenkool
>
> * doc/sourcebuild.texi (Endianness): New subsubsection.
>
> (no revision number, capital on Backport, no empty line after it).
>
>> 2019-04-03 Xiong Hu Luo
>>
>> backport from trunk r257541.
>>
>> 2018-02-07 Will Schmidt
>>
>> * gcc.target/powerpc/vsx-vector-6-le.c: Update CPU target.
>> * gcc.target/powerpc/vsx-vector-6-le.p9.c: New.
>
> Only one space after : please.
>
>> 2018-05-04 Carl Love
>
> Two spaces between date and name.
>
>> * gcc.target/powerpc/vsx-vector-6-le.c: Add le qualifiers as needed for
>> the various instruction counts. Rename file to vsx-vector-6.p8.c.
>
> There's a tab after "to" here, should be a space.
>
>
> Other than those nits, okay for the GCC 7 branch, thanks!

I will fix all the ChangeLog nits by copy-paste, thanks.

>
> ("be" and "le" are essentially PowerPC-specific selectors on the 7 branch,
> otherwise you'd need a release manager's approval as well).

Do you mean moving the "be" and "le" code from gcc/testsuite/lib/target-supports.exp to gcc/testsuite/gcc.target/powerpc/powerpc.exp here? I tried this and it works. Would this require a ChangeLog update as below? Or should I rewrite all the ChangeLog entries with my own signed-off-by?

2018-05-23 Segher Boessenkool

* lib/target-supports.exp (check_effective_target_be): New.
(check_effective_target_le): New.

=>

2018-05-23 Segher Boessenkool

* gcc.target/powerpc/powerpc.exp (check_effective_target_be): New.
(check_effective_target_le): New.

I would also need to update "doc/sourcebuild.texi" from "+@subsubsection Endianness" to "+@subsubsection Endianness For powerpc".

Thanks
Xiong Hu

>
>
> Segher
>
Re: [PATCH PR90078]Capping comp_cost computation in ivopts
On Wed, Apr 17, 2019 at 02:13:12PM +0800, bin.cheng wrote:
> Hi,
> As discussed in PR90078, this patch checks possible infinite_cost overflow in
> ivopts.
> Also as discussed, overflow happens mostly because of cost scaling wrt
> bb_freq/loop_freq.
> For the moment, we only implement capping in comp_cost operators, while in next
> stage1, we may instead implement capping in get_scaled_computation_cost_at with
> more supporting benchmark data.
>
> BTW, I think switching costs around comparison between infinite_cost is unnecessary
> since there will be no overflow in integer after capping with infinite_cost.
>
> Bootstrap and test on x86_64, is it OK?
>
> Thanks,
> bin
>
> 2019-04-17 Bin Cheng
>
> PR tree-optimization/90078
> * tree-ssa-loop-ivopts.c (comp_cost::operator +,-,+=,-=,/=,*=): Add
> checks for infinite_cost overflow.
>
> 2019-04-17 Bin Cheng
>
> PR tree-optimization/90078
> * gcc/testsuite/g++.dg/tree-ssa/pr90078.C: New test.

--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -243,6 +243,9 @@ operator+ (comp_cost cost1, comp_cost cost2)
   if (cost1.infinite_cost_p () || cost2.infinite_cost_p ())
     return infinite_cost;

+  if (cost1.cost + cost2.cost >= infinite_cost.cost)
+    return infinite_cost;

Given #define INFTY 1000, what is the reason to keep the previous condition as well? I mean, if cost1.cost == INFTY or cost2.cost == INFTY, cost1.cost + cost2.cost >= INFTY too. Unless costs can go negative.

@@ -256,6 +259,8 @@ operator- (comp_cost cost1, comp_cost cost2)
     return infinite_cost;

   gcc_assert (!cost2.infinite_cost_p ());
+  if (cost1.cost - cost2.cost >= infinite_cost.cost)
+    return infinite_cost;

Unless costs can be negative, when you first bail out for cost1.cost == INFTY, then cost1.cost - cost2.cost won't be INFTY (but could get negative). So shouldn't there be a guard against that instead? Or, if costs can be negative, shouldn't there also be guards that it doesn't grow too negative (say smaller than -INFTY)?

Jakub
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
Hi,

thanks a lot for the extensive discussion :-)

How should we now proceed, first for gcc 9, and then for backporting?
Use Richard's patch with the corresponding Fortran FE change?

Regards

Thomas
Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2)
On Tue, 16 Apr 2019, Jakub Jelinek wrote: > On Tue, Apr 16, 2019 at 06:21:25PM +0200, Eric Botcazou wrote: > > > The runtime check assures that at runtime, the upper 32 bits of pseudo 104 > > > must be always 0 (in this case, in some other case could be sign bit > > > copies). > > > > OK, as Richard pointed out, that's not sufficient if we allow... > > > > > The question is if it would be valid say for forward propagation to first > > > propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104) > > > 0), then hoisting it before the jump_insn 16, have the subreg optimized > > > away and miscompile later on. > > > > ...this to happen. So we could clear SUBREG_PROMOTED_VAR_P as soon as the > > SUBREG is rewritten, but this looks quite fragile. The safest route is > > probably not to use SUBREG_PROMOTED_VAR_P in this conditional context. > > > > > That means either that the hoisting pass is buggy, or that > > > SUBREG_PROMOTED_* > > > is only safe at the function boundary (function arguments and return > > > value) > > > and not elsewhere. > > > > I think that Richard's characterization is correct: > > > > "Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate > > zero-extend info (can't you use a REG_EQUIV note somehow?) but it has > > to be information that is valid everywhere in the function unless > > data dependences force its motion (thus a conditional doesn't do)." > > > > i.e. this also works for a local variable that is always accessed with the > > SUBREG_PROMOTED_VAR_P semantics. > > Ok, here is a patch that just removes all of that SUBREG_PROMOTED_SET then, > as even for the opN_small_p we can't actually guarantee that for the whole > function, only for where the pseudo with the SSA_NAME for which we get the > range appears. On the bright side, the generated code at least for the > particular testcase has somewhat different RA decisions, but isn't > significantly worse. 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Richard. > 2019-04-16 Jakub Jelinek > > PR middle-end/90095 > * internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P > on lowpart SUBREGs. > > * gcc.dg/pr90095-1.c: New test. > * gcc.dg/pr90095-2.c: New test. > > --- gcc/internal-fn.c.jj 2019-04-15 19:45:22.38646 +0200 > +++ gcc/internal-fn.c 2019-04-16 15:18:56.614708804 +0200 > @@ -1753,22 +1753,9 @@ expand_mul_overflow (location_t loc, tre > /* If both op0 and op1 are sign (!uns) or zero (uns) extended from >hmode to mode, the multiplication will never overflow. We can >do just one hmode x hmode => mode widening multiplication. */ > - rtx lopart0s = lopart0, lopart1s = lopart1; > - if (GET_CODE (lopart0) == SUBREG) > - { > - lopart0s = shallow_copy_rtx (lopart0); > - SUBREG_PROMOTED_VAR_P (lopart0s) = 1; > - SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED); > - } > - if (GET_CODE (lopart1) == SUBREG) > - { > - lopart1s = shallow_copy_rtx (lopart1); > - SUBREG_PROMOTED_VAR_P (lopart1s) = 1; > - SUBREG_PROMOTED_SET (lopart1s, uns ? 
SRP_UNSIGNED : SRP_SIGNED); > - } > tree halfstype = build_nonstandard_integer_type (hprec, uns); > - ops.op0 = make_tree (halfstype, lopart0s); > - ops.op1 = make_tree (halfstype, lopart1s); > + ops.op0 = make_tree (halfstype, lopart0); > + ops.op1 = make_tree (halfstype, lopart1); > ops.code = WIDEN_MULT_EXPR; > ops.type = type; > rtx thisres > --- gcc/testsuite/gcc.dg/pr90095-1.c.jj 2019-04-16 13:45:22.614772955 > +0200 > +++ gcc/testsuite/gcc.dg/pr90095-1.c 2019-04-16 13:45:22.614772955 +0200 > @@ -0,0 +1,18 @@ > +/* PR middle-end/90095 */ > +/* { dg-do run } */ > +/* { dg-options "-Os -fno-tree-bit-ccp" } */ > + > +unsigned long long a; > +unsigned int b; > + > +int > +main () > +{ > + unsigned int c = 255, d = c |= b; > + if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8) > +return 0; > + d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, > &a); > + if (d != 0) > +__builtin_abort (); > + return 0; > +} > --- gcc/testsuite/gcc.dg/pr90095-2.c.jj 2019-04-16 15:20:14.728414325 > +0200 > +++ gcc/testsuite/gcc.dg/pr90095-2.c 2019-04-16 15:20:29.597167928 +0200 > @@ -0,0 +1,5 @@ > +/* PR middle-end/90095 */ > +/* { dg-do run } */ > +/* { dg-options "-Os -fno-tree-bit-ccp -fno-split-wide-types" } */ > + > +#include "pr90095-1.c" > > > Jakub > -- Richard Biener SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
Re: [PATCH] Don't ignore leading whitespace in ARM target attribute/pragma (PR target/89093)
On 4/16/19 6:50 PM, Jakub Jelinek wrote: On Fri, Apr 12, 2019 at 05:10:48PM +0100, Ramana Radhakrishnan wrote: > No, that's not right. we should get rid of this. Here is a patch for that. Bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk? Ok. I don't think anyone relies on this behaviour. Thanks, Kyrill 2019-04-16 Jakub Jelinek PR target/89093 * config/arm/arm.c (arm_valid_target_attribute_rec): Don't skip whitespace at the start of target attribute string. * gcc.target/arm/pr89093-2.c: New test. --- gcc/config/arm/arm.c.jj 2019-04-13 17:20:07.353977370 +0200 +++ gcc/config/arm/arm.c 2019-04-15 19:50:31.386414421 +0200 @@ -30871,8 +30871,6 @@ arm_valid_target_attribute_rec (tree arg while ((q = strtok (argstr, ",")) != NULL) { - while (ISSPACE (*q)) ++q; - argstr = NULL; if (!strcmp (q, "thumb")) opts->x_target_flags |= MASK_THUMB; --- gcc/testsuite/gcc.target/arm/pr89093-2.c.jj 2019-04-15 19:53:23.740608673 +0200 +++ gcc/testsuite/gcc.target/arm/pr89093-2.c 2019-04-15 19:52:29.841486100 +0200 @@ -0,0 +1,9 @@ +/* PR target/89093 */ +/* { dg-do compile } */ + +__attribute__((target (" arm"))) void f1 (void) {} /* { dg-error "unknown target attribute or pragma ' arm'" } */ +__attribute__((target (" thumb"))) void f2 (void) {} /* { dg-error "unknown target attribute or pragma ' thumb'" } */ +__attribute__((target ("arm, thumb"))) void f3 (void) {} /* { dg-error "unknown target attribute or pragma ' thumb'" } */ +__attribute__((target ("thumb, arm"))) void f4 (void) {} /* { dg-error "unknown target attribute or pragma ' arm'" } */ +#pragma GCC target (" arm") /* { dg-error "unknown target attribute or pragma ' arm'" } */ +void f5 (void) {} Jakub
Re: [PATCH] Don't ignore leading whitespace in AArch64 target attribute/pragma (PR target/89093)
HI Jakub, On 4/16/19 7:32 PM, Jakub Jelinek wrote: On Tue, Apr 16, 2019 at 07:50:35PM +0200, Jakub Jelinek wrote: > On Fri, Apr 12, 2019 at 05:10:48PM +0100, Ramana Radhakrishnan wrote: > > No, that's not right. we should get rid of this. > > Here is a patch for that. > > Bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk? And here is the same thing for aarch64. Bootstrapped/regtested on aarch64-linux, ok for trunk? FWIW this looks ok to me implementation-wise (since I wrote that code a few years ago). I think it is better not to accept any spaces in there, than accepting it only at the beginning and after , but not e.g. at the end of before , like the trunk currently does, furthermore, e.g. x86 or ppc don't allow spaces there. Thinking about it a bit more, I think it's a good idea to disallow leading and trailing whitespaces. But there could be a case for allowing whitespaces between separate target attributes. Personally, I would find it more readable to have a space after a comma. Similarly, spaces are allowed in the general attribute syntax, for example in our intrinsics header we have: __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) That said, distinguishing between the two classes of whitespace is probably more complexity than it's worth and if other targets don't allow it then I won't let it block this patch. Thanks, Kyrill 2019-04-16 Jakub Jelinek PR target/89093 * config/aarch64/aarch64.c (aarch64_process_one_target_attr): Don't skip whitespace at the start of target attribute string. * gcc.target/aarch64/pr89093.c: New test. * gcc.target/aarch64/pr63304_1.c: Remove space from target string. --- gcc/config/aarch64/aarch64.c.jj 2019-04-11 10:26:22.907293129 +0200 +++ gcc/config/aarch64/aarch64.c 2019-04-15 19:59:55.784226278 +0200 @@ -12536,10 +12536,6 @@ aarch64_process_one_target_attr (char *a char *str_to_check = (char *) alloca (len + 1); strcpy (str_to_check, arg_str); - /* Skip leading whitespace. 
*/ - while (*str_to_check == ' ' || *str_to_check == '\t') - str_to_check++; - /* We have something like __attribute__ ((target ("+fp+nosimd"))). It is easier to detect and handle it explicitly here rather than going through the machinery for the rest of the target attributes in this --- gcc/testsuite/gcc.target/aarch64/pr89093.c.jj 2019-04-15 20:02:25.456788897 +0200 +++ gcc/testsuite/gcc.target/aarch64/pr89093.c 2019-04-15 20:02:04.433131260 +0200 @@ -0,0 +1,7 @@ +/* PR target/89093 */ +/* { dg-do compile } */ + +__attribute__((target (" no-strict-align"))) void f1 (void) {} /* { dg-error "is not valid" } */ +__attribute__((target (" general-regs-only"))) void f2 (void) {} /* { dg-error "is not valid" } */ +#pragma GCC target (" general-regs-only") /* { dg-error "is not valid" } */ +void f3 (void) {} --- gcc/testsuite/gcc.target/aarch64/pr63304_1.c.jj 2017-09-13 16:22:19.795513580 +0200 +++ gcc/testsuite/gcc.target/aarch64/pr63304_1.c 2019-04-15 20:27:17.724847578 +0200 @@ -1,7 +1,7 @@ /* { dg-do assemble } */ /* { dg-options "-O1 --save-temps" } */ #pragma GCC push_options -#pragma GCC target ("+nothing+simd, cmodel=small") +#pragma GCC target ("+nothing+simd,cmodel=small") int cal (double a) Jakub
Re: [PATCH] Don't ignore leading whitespace in AArch64 target attribute/pragma (PR target/89093)
On Wed, Apr 17, 2019 at 08:59:08AM +0100, Kyrill Tkachov wrote:
> Similarly, spaces are allowed in the general attribute syntax, for example
> in our intrinsics header we have:
>
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))

Well, that is how the C/C++ lexing works. We also allow
__attribute__ ((__always_inline__ , __gnu_inline__ , __artificial__ ))
etc.
The whitespace skipping in the target string handling allowed
target (" abc, def")
but didn't allow
target ("abc , def")
or
target ("abc ,def")
etc. IMHO either we shouldn't allow any whitespace anywhere, or allow it everywhere (leading, trailing, before or after comma), but then for consistency all targets should do that. If one wants to do some whitespace, there is always an option to do
target ("abc," "def")
or
target ("abc,"
        "def")
or
#define C(x) #x
target (C(abc) "," C(def))
or whatever else one wants to use.

Jakub
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On Tue, Apr 16, 2019 at 11:41 PM H.J. Lu wrote: > > On Tue, Apr 16, 2019 at 8:36 AM Martin Liška wrote: > > > > On 4/16/19 4:50 PM, H.J. Lu wrote: > > > On Tue, Apr 16, 2019 at 1:28 AM Martin Liška wrote: > > >> > > >> On 4/15/19 5:09 PM, H.J. Lu wrote: > > >>> On Mon, Apr 15, 2019 at 12:26 AM Martin Liška wrote: > > > > On 4/12/19 4:12 PM, H.J. Lu wrote: > > > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška wrote: > > >> > > >> On 4/11/19 6:30 PM, H.J. Lu wrote: > > >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška wrote: > > > > Hi. > > > > The patch is adding missing AVX512 ISAs for target and target_clone > > attributes. > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression > > tests. > > > > Ready to be installed? > > Thanks, > > Martin > > > > gcc/ChangeLog: > > > > 2019-04-10 Martin Liska > > > > PR target/89929 > > * config/i386/i386.c (get_builtin_code_for_version): Add > > support for missing AVX512 ISAs. > > > > gcc/testsuite/ChangeLog: > > > > 2019-04-10 Martin Liska > > > > PR target/89929 > > * g++.target/i386/mv28.C: New test. > > * gcc.target/i386/mvc14.c: New test. > > --- > > gcc/config/i386/i386.c| 34 > > ++- > > gcc/testsuite/g++.target/i386/mv28.C | 30 +++ > > gcc/testsuite/gcc.target/i386/mvc14.c | 16 + > > 3 files changed, 79 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/g++.target/i386/mv28.C > > create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c > > > > > > >>> > > >> > > >> Hi. > > >> > > >>> Since any ISAs beyond AVX512F may be enabled individually, we > > >>> can't simply assign priorities to them. For GFNI, we can have > > >>> > > >>> 1. GFNI > > >>> 2. GFNI + AVX > > >>> 3. GFNI + AVX512F > > >>> 4. GFNI + AVX512F + AVX512VL > > >> > > >> Makes sense to me! I'm considering syntax extension where one would > > >> be > > >> able to come up with a priority. Eg. 
> > >> > > >> __attribute__((target("gfni,avx512bw", priority((3) > > >> > > >> Without that the ISA combinations are probably not comparable in a > > >> reasonable way. > > >> > > >>> > > >>> For this code, GFNI + AVX512BW is ignored: > > >>> > > >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii > > >>> __attribute__((target("gfni"))) > > >>> int foo(int i) { > > >>> return 1; > > >>> } > > >>> __attribute__((target("gfni,avx512bw"))) > > >>> int foo(int i) { > > >>> return 4; > > >>> } > > >>> __attribute__((target("default"))) > > >>> int foo(int i) { > > >>> return 3; > > >>> } > > >>> int bar () > > >>> { > > >>> return foo(2); > > >>> } > > >> > > >> For 'target' attribute it works for me: > > >> > > >> 1) $ cat z.c && ./xg++ -B. z.c -c > > >> #include > > >> volatile __m512i x1, x2; > > >> volatile __mmask64 m64; > > >> > > >> __attribute__((target("gfni"))) > > >> int foo(int i) { > > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >> return 1; > > >> } > > >> __attribute__((target("gfni,avx512bw"))) > > >> int foo(int i) { > > >> return 4; > > >> } > > >> __attribute__((target("default"))) > > >> int foo(int i) { > > >> return 3; > > >> } > > >> int bar () > > >> { > > >> return foo(2); > > >> } > > >> In file included from ./include/immintrin.h:117, > > >> from ./include/x86intrin.h:32, > > >> from z.c:1: > > >> z.c: In function ‘int foo(int)’: > > >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa > > >> option -m32 -mgfni -mavx512f > > >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >> | ^~~~ > > >> z.c:7:10: note: the ABI for passing parameters with 64-byte > > >> alignment has changed in GCC 4.6 > > >> > > >> 2) $ cat z.c && ./xg++ -B. 
z.c -c > > >> #include > > >> volatile __m512i x1, x2; > > >> volatile __mmask64 m64; > > >> > > >> __attribute__((target("gfni"))) > > >> int foo(int i) { > > >> return 1; > > >> } > > >> __attribute__((target("gfni,avx512bw"))) > > >> int foo(int i) { > > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >> return 4; > > >> } > > >> __attribute__((target("default"))) > > >>
[PR90048] Fortran OpenACC 'private' clause rejected for predetermined private loop iteration variable (was: [patch,gomp4] make fortran loop variables implicitly private in openacc)
Hi! On Mon, 11 Aug 2014 16:55:28 -0700, Cesar Philippidis wrote: > According to section 2.6.1 in the openacc spec, fortran loop variables > should be implicitly private like in openmp. More correctly, they are "predetermined private" (which cannot be overridden), not "implicit private" (which could be overridden with a different explicit clause). > This patch does just so. But it also introduced PR90048 "Fortran OpenACC 'private' clause rejected for predetermined private loop iteration variable". Instead of the patch "don't error on implicitly private induction variables in gfortran" proposed by Cesar, and then challenged by Jakub, I have now committed a different patch (more similar to the existing handling for OpenMP) to trunk in r270406 "[PR90048] Fortran OpenACC 'private' clause rejected for predetermined private loop iteration variable", see attached. I have a cleanup patch (for next GCC development stage 1), which will simply merge the special-case 'gfc_resolve_oacc_blocks' into the generic 'gfc_resolve_omp_parallel_blocks', see attached. > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/goacc/private-1.f95 > @@ -0,0 +1,39 @@ > +! { dg-do compile } > +! { dg-additional-options "-fdump-tree-omplower" } > + > +! test for implicit private clauses in do loops > + > +program test > + implicit none > + integer :: i, j, k > + logical :: l > + > + !$acc parallel > + !$acc loop > + do i = 1, 100 > + end do > + !$acc end parallel > + > + !$acc parallel > + !$acc loop > + do i = 1, 100 > + do j = 1, 100 > + end do > + end do > + !$acc end parallel > + > + !$acc parallel > + !$acc loop > + do i = 1, 100 > + do j = 1, 100 > +do k = 1, 100 > +end do > + end do > + end do > + !$acc end parallel > +end program test > +! { dg-prune-output "unimplemented" } > +! { dg-final { scan-tree-dump-times "pragma acc parallel" 3 "omplower" } } > +! { dg-final { scan-tree-dump-times "private\\(i\\)" 3 "omplower" } } > +! 
{ dg-final { scan-tree-dump-times "private\\(j\\)" 2 "omplower" } } > +! { dg-final { scan-tree-dump-times "private\\(k\\)" 1 "omplower" } } I turned that one and 'gfortran.dg/goacc/private-2.f95' into more elaborate testcases, committed to trunk in r270405 "[PR90067, PR90114] Document Fortran OpenACC predetermined private status quo", see attached. Grüße Thomas >From b8d03885017763f914a48b19b6cb383239430b97 Mon Sep 17 00:00:00 2001 From: tschwinge Date: Wed, 17 Apr 2019 08:34:20 + Subject: [PATCH] [PR90048] Fortran OpenACC 'private' clause rejected for predetermined private loop iteration variable gcc/fortran/ PR fortran/90048 * openmp.c (gfc_resolve_do_iterator): Handle sharing_clauses for OpenACC, too. (gfc_resolve_oacc_blocks): Populate sharing_clauses with private clauses. gcc/testsuite/ PR fortran/90048 * gfortran.dg/goacc/private-explicit-kernels-1.f95: New file. * gfortran.dg/goacc/private-explicit-parallel-1.f95: Likewise. * gfortran.dg/goacc/private-explicit-routine-1.f95: Likewise. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@270406 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/fortran/ChangeLog | 8 + gcc/fortran/openmp.c | 20 +- gcc/testsuite/ChangeLog | 5 + .../goacc/private-explicit-kernels-1.f95 | 248 ++ .../goacc/private-explicit-parallel-1.f95 | 247 + .../goacc/private-explicit-routine-1.f95 | 146 +++ 6 files changed, 671 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95 create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-parallel-1.f95 create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-routine-1.f95 diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index e27743cac280..1ff03e1e85b5 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,11 @@ +2019-04-17 Thomas Schwinge + + PR fortran/90048 + * openmp.c (gfc_resolve_do_iterator): Handle sharing_clauses for + OpenACC, too. 
+ (gfc_resolve_oacc_blocks): Populate sharing_clauses with private + clauses. + 2019-04-14 Paul Thomas PR fortran/89843 diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index 9fc236760a1c..1c7bce6c3000 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -5510,8 +5510,7 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause) if (!omp_current_ctx->is_openmp && !oacc_is_loop (omp_current_ctx->code)) return; - if (omp_current_ctx->is_openmp - && omp_current_ctx->sharing_clauses->contains (sym)) + if (omp_current_ctx->sharing_clauses->contains (sym)) return; if (! omp_current_ctx->private_iterators->add (sym) && add_clause) @@ -5971,19 +5970,34 @@ void gfc_resolve_oacc_blocks (gfc_code *code, gfc_namespace *ns) { fortran_omp_context ctx; + gfc_omp_clauses *omp_clauses = code->ext.omp_clauses; + gf
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
On Wed, Apr 17, 2019 at 9:19 AM Thomas König wrote:
>
> Hi,
>
> thanks a lot for the extensive discussion :-)
>
> How should we now proceed, first for gcc 9, and then for backporting?
> Use Richard's patch with the corresponding Fortran FE change?

Btw, for the testcase the Fortran FE could also simply opt to not make def_init TREE_READONLY. Or even better, for all-zero initialization omit the explicit initialization data and instead mark it specially in the vtable (just use a NULL initializer denoting zero-initialization?). Even .bss costs (runtime) memory.

But yes, my patch would be a way to solve the middle-end issue of promoting a variable TREE_READONLY, preventing .bss use. And the FE could then "abuse" this feature.

Note the middle-end already special-cases variables with an explicit section, so the Fortran FE can already use that feature to put the initializer into .bss explicitly (set_decl_section_name (decl, ".bss")), conditional on availability (not 100% sure how to test that...). Your testcase probably will fail on targets w/o .bss section support.

Richard.

> Regards
>
> Thomas
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On 4/17/19 10:14 AM, Hongtao Liu wrote:
> Any other comments, I'll merge this to trunk?

Hi.

I don't understand you. The patch in its original version will not be installed to trunk, and I'll rework it to not support AVX512* (except AVX512F) in the target_clones attribute.

Martin
Re: [PR 85762, 87008, 85459] Relax MEM_REF check in contains_vce_or_bfcref_p
Hello, On Sun, Mar 10 2019, Martin Jambor wrote: > Hi, > > after we have accidentally dropped the mailing list from our discussion > (my apologies for not spotting that in time), Richi has approved the > following patch which I have bootstrapped and tested on x86_64-linux > (all languages) and on i686-linux, aarch64-linux and ppc64-linux (C, C++ > and Fortran) and so I am about to commit it to trunk. > > It XFAILS three guality tests which pass at -O0, which means there are > three additional XPASSes - there already are 5 pre-existing XPASSes in > that testcase and 29 outright failures. I will come back to this next > in April and see whether I can make the tests pass by decoupling the > roles now played by cannot_scalarize_away_bitmap (or at least massage > the testcase to go make the XPASSes go away). But I won't have time to > do it next two weeks and this patch is important enough to have it in > trunk now. I intend to backport it to gcc 8 in April too. > > Thanks, > > Martin > > > 2019-03-08 Martin Jambor > > PR tree-optimization/85762 > PR tree-optimization/87008 > PR tree-optimization/85459 > * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool > it points to if there is a type changing MEM_REF. Adjust all callers. > (build_accesses_from_assign): Disable total scalarization if > contains_vce_or_bfcref_p returns true through the new parameter, for > both rhs and lhs. > > testsuite/ > * g++.dg/tree-ssa/pr87008.C: New test. > * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere. this patch has been on trunk for over a month and at least so far nobody complained. I have applied it to gcc-8-branch and did a bootstrap and testing on an x86_64-linux machine and there were no problems either. Therefore I would propose to backport it - the other option being leaving the gcc 8 regression(s) unfixed. What do you think? 
Martin 2019-04-16 Martin Jambor Backport from mainline 2019-03-10 Martin Jambor PR tree-optimization/85762 PR tree-optimization/87008 PR tree-optimization/85459 * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool it points to if there is a type changing MEM_REF. Adjust all callers. (build_accesses_from_assign): Disable total scalarization if contains_vce_or_bfcref_p returns true through the new parameter, for both rhs and lhs. testsuite/ * g++.dg/tree-ssa/pr87008.C: New test. * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere. --- gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 gcc/testsuite/gcc.dg/guality/pr54970.c | 6 ++--- gcc/tree-sra.c | 36 ++--- 3 files changed, 47 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C new file mode 100644 index 000..eef521f9ad5 --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +extern void dontcallthis(); + +struct A { long a, b; }; +struct B : A {}; +templatevoid cp(T&a,T const&b){a=b;} +long f(B x){ + B y; cp(y,x); + B z; cp(z,x); + if (y.a - z.a) +dontcallthis (); + return 0; +} + +/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c b/gcc/testsuite/gcc.dg/guality/pr54970.c index 1819d023e21..f12a9aac1d2 100644 --- a/gcc/testsuite/gcc.dg/guality/pr54970.c +++ b/gcc/testsuite/gcc.dg/guality/pr54970.c @@ -8,17 +8,17 @@ int main () { - int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" } } */ + int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" { xfail { *-*-* } } } } */ int *p = a + 2; /* { dg-final { gdb-test 15 "a\[1\]" "2" } } */ int *q = a + 1; /* { dg-final { gdb-test 15 "a\[2\]" "3" } } */ /* { dg-final { gdb-test 15 "*p" "3" } } */ asm volatile (NOP); /* { dg-final { 
gdb-test 15 "*q" "2" } } */ - *p += 10;/* { dg-final { gdb-test 20 "a\[0\]" "1" } } */ + *p += 10;/* { dg-final { gdb-test 20 "a\[0\]" "1" { xfail { *-*-* } } } } */ /* { dg-final { gdb-test 20 "a\[1\]" "2" } } */ /* { dg-final { gdb-test 20 "a\[2\]" "13" } } */ /* { dg-final { gdb-test 20 "*p" "13" } } */ asm volatile (NOP); /* { dg-final { gdb-test 20 "*q" "2" } } */ - *q += 10;/* { dg-final { gdb-test 25 "a\[0\]" "1" } } */ + *q += 10;/* { dg-final { gdb-test 25 "a\[0\]" "1" { xfail { *-*-* } } } } */ /* { dg-final {
[PATCH] rs6000: Improve the load/store-with-update patterns (PR17108)
Many of these patterns only worked in 32-bit mode, and some only worked in 64-bit mode. This patch makes these use Pmode, fixing the PR. On the other hand, the stack updates have to use the same mode for the stack pointer as for the value stored, so let's simplify that a bit. Many of these patterns pass the wrong mode to avoiding_indexed_address_p (it should be the mode of the datum accessed, not the mode of the pointer). Finally, I merge some patterns into one (using iterators). Tested on powerpc64-linux {-m32,-m64}. Committing. Segher 2019-04-17 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_split_multireg_move): Adjust pattern name. (rs6000_emit_allocate_stack_1): Simplify condition. Adjust pattern name. * config/rs6000/rs6000.md (bits): Add entries for SF and DF. (*movdi_update1): Use Pmode. (movdi__update): Fix argument to avoiding_indexed_address_p. (movdi__update_stack): Rename to ... (movdi_update_stack): ... this. Fix comment. Change condition. Don't use Pmode. (*movsi_update1): Use Pmode. (*movsi_update2): Use Pmode. (movsi_update): Rename to ... (movsi__update): ... this. Use Pmode. (movsi_update_stack): Fix condition. (*movhi_update1): Use Pmode. Fix argument to avoiding_indexed_address_p. (*movhi_update2): Ditto. (*movhi_update3): Ditto. (*movhi_update4): Ditto. (*movqi_update1): Ditto. (*movqi_update2): Ditto. (*movqi_update3): Ditto. (*movsf_update1, *movdf_update1): Merge, rename to... (*mov_update1): This. Use Pmode. Fix argument to avoiding_indexed_address_p. Add "size" attribute. (*movsf_update2, *movdf_update2): Merge, rename to... (*mov_update2): This. Ditto. (*movsf_update3): Use Pmode. Fix argument to avoiding_indexed_address_p. (*movsf_update4): Ditto. (allocate_stack): Simplify condition. Adjust pattern names. 
--- gcc/config/rs6000/rs6000.c | 12 +- gcc/config/rs6000/rs6000.md | 274 2 files changed, 128 insertions(+), 158 deletions(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 9105253..ae2249b 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -24010,7 +24010,7 @@ rs6000_split_multireg_move (rtx dst, rtx src) emit_insn (TARGET_32BIT ? (TARGET_POWERPC64 ? gen_movdi_si_update (breg, breg, delta_rtx, nsrc) - : gen_movsi_update (breg, breg, delta_rtx, nsrc)) + : gen_movsi_si_update (breg, breg, delta_rtx, nsrc)) : gen_movdi_di_update (breg, breg, delta_rtx, nsrc)); used_update = true; } @@ -25486,16 +25486,16 @@ rs6000_emit_allocate_stack_1 (HOST_WIDE_INT size_int, rtx orig_sp) size_rtx = tmp_reg; } - if (Pmode == SImode) + if (TARGET_32BIT) insn = emit_insn (gen_movsi_update_stack (stack_pointer_rtx, stack_pointer_rtx, size_rtx, orig_sp)); else -insn = emit_insn (gen_movdi_di_update_stack (stack_pointer_rtx, -stack_pointer_rtx, -size_rtx, -orig_sp)); +insn = emit_insn (gen_movdi_update_stack (stack_pointer_rtx, + stack_pointer_rtx, + size_rtx, + orig_sp)); rtx par = PATTERN (insn); gcc_assert (GET_CODE (par) == PARALLEL); rtx set = XVECEXP (par, 0, 0); diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index b8dd859..6feaa10 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -563,7 +563,8 @@ (define_mode_attr wd [(QI"b") (TI"q")]) ;; How many bits in this mode? 
-(define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")]) +(define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64") + (SF "32") (DF "64")]) ; DImode bits (define_mode_attr dbits [(QI "56") (HI "48") (SI "32")]) @@ -9083,13 +9084,13 @@ (define_expand "movmemsi" (define_insn "*movdi_update1" [(set (match_operand:DI 3 "gpc_reg_operand" "=r,r") - (mem:DI (plus:DI (match_operand:DI 1 "gpc_reg_operand" "0,0") -(match_operand:DI 2 "reg_or_aligned_short_operand" "r,I" - (set (match_operand:DI 0 "gpc_reg_operand" "=b,b") - (plus:DI (match_dup 1) (match_dup 2)))] + (mem:DI (plus:P (match_operand:P 1 "gpc_reg_
Re: [PR 85762, 87008, 85459] Relax MEM_REF check in contains_vce_or_bfcref_p
On Wed, 17 Apr 2019, Martin Jambor wrote: > Hello, > > On Sun, Mar 10 2019, Martin Jambor wrote: > > Hi, > > > > after we have accidentally dropped the mailing list from our discussion > > (my apologies for not spotting that in time), Richi has approved the > > following patch which I have bootstrapped and tested on x86_64-linux > > (all languages) and on i686-linux, aarch64-linux and ppc64-linux (C, C++ > > and Fortran) and so I am about to commit it to trunk. > > > > It XFAILS three guality tests which pass at -O0, which means there are > > three additional XPASSes - there already are 5 pre-existing XPASSes in > > that testcase and 29 outright failures. I will come back to this next > > in April and see whether I can make the tests pass by decoupling the > > roles now played by cannot_scalarize_away_bitmap (or at least massage > > the testcase to go make the XPASSes go away). But I won't have time to > > do it next two weeks and this patch is important enough to have it in > > trunk now. I intend to backport it to gcc 8 in April too. > > > > Thanks, > > > > Martin > > > > > > 2019-03-08 Martin Jambor > > > > PR tree-optimization/85762 > > PR tree-optimization/87008 > > PR tree-optimization/85459 > > * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool > > it points to if there is a type changing MEM_REF. Adjust all callers. > > (build_accesses_from_assign): Disable total scalarization if > > contains_vce_or_bfcref_p returns true through the new parameter, for > > both rhs and lhs. > > > > testsuite/ > > * g++.dg/tree-ssa/pr87008.C: New test. > > * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere. > > this patch has been on trunk for over a month and at least so far nobody > complained. I have applied it to gcc-8-branch and did a bootstrap and > testing on an x86_64-linux machine and there were no problems either. > > Therefore I would propose to backport it - the other option being leaving > the gcc 8 regression(s) unfixed. 
What do you think? Let's go for the backport. Richard. > Martin > > > 2019-04-16 Martin Jambor > > Backport from mainline > 2019-03-10 Martin Jambor > > PR tree-optimization/85762 > PR tree-optimization/87008 > PR tree-optimization/85459 > * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool > it points to if there is a type changing MEM_REF. Adjust all callers. > (build_accesses_from_assign): Disable total scalarization if > contains_vce_or_bfcref_p returns true through the new parameter, for > both rhs and lhs. > > testsuite/ > * g++.dg/tree-ssa/pr87008.C: New test. > * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere. > --- > gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 > gcc/testsuite/gcc.dg/guality/pr54970.c | 6 ++--- > gcc/tree-sra.c | 36 ++--- > 3 files changed, 47 insertions(+), 12 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C > > diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C > b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C > new file mode 100644 > index 000..eef521f9ad5 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C > @@ -0,0 +1,17 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +extern void dontcallthis(); > + > +struct A { long a, b; }; > +struct B : A {}; > +templatevoid cp(T&a,T const&b){a=b;} > +long f(B x){ > + B y; cp(y,x); > + B z; cp(z,x); > + if (y.a - z.a) > +dontcallthis (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */ > diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c > b/gcc/testsuite/gcc.dg/guality/pr54970.c > index 1819d023e21..f12a9aac1d2 100644 > --- a/gcc/testsuite/gcc.dg/guality/pr54970.c > +++ b/gcc/testsuite/gcc.dg/guality/pr54970.c > @@ -8,17 +8,17 @@ > int > main () > { > - int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" } } */ > + int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" { > xfail { *-*-* } } } } */ >int *p = a + 2;/* { 
dg-final { gdb-test 15 "a\[1\]" "2" } } */ >int *q = a + 1;/* { dg-final { gdb-test 15 "a\[2\]" "3" } } */ > /* { dg-final { gdb-test 15 "*p" "3" } } */ >asm volatile (NOP);/* { dg-final { gdb-test 15 "*q" "2" } > } */ > - *p += 10; /* { dg-final { gdb-test 20 "a\[0\]" "1" } } */ > + *p += 10; /* { dg-final { gdb-test 20 "a\[0\]" "1" { > xfail { *-*-* } } } } */ > /* { dg-final { gdb-test 20 "a\[1\]" "2" } } */ > /* { dg-final { gdb-test 20 "a\[2\]" "13" } } */ > /* { dg-final { gdb-test 20 "*p" "13" } } */ >asm volatile (NOP)
Re: [PATCH] (RFA tree-tailcall) PR c++/82081 - tail call optimization breaks noexcept
On Tue, Apr 16, 2019 at 1:24 AM Richard Biener wrote: > On Mon, Apr 15, 2019 at 7:09 PM Andrew Pinski wrote: > > On Sun, Apr 14, 2019 at 11:50 PM Richard Biener > > wrote: > > > > > > On Sat, Apr 13, 2019 at 12:34 AM Jeff Law wrote: > > > > > > > > On 4/12/19 3:24 PM, Jason Merrill wrote: > > > > > If a noexcept function calls a function that might throw, doing the > > > > > tail > > > > > call optimization means that an exception thrown in the called > > > > > function > > > > > will propagate out, breaking the noexcept specification. So we need > > > > > to > > > > > prevent the optimization in that case. > > > > > > > > > > Tested x86_64-pc-linux-gnu. OK for trunk or hold for GCC 10? This > > > > > isn't a > > > > > regression, but it is a straightforward fix for a wrong-code bug. > > > > > > > > > > * tree-tailcall.c (find_tail_calls): Don't turn a call from a > > > > > nothrow function to a might-throw function into a tail call. > > > > I'd go on the trunk. It's a wrong-code issue, what we're doing is just > > > > plain wrong. One could even make a case for backporting to the > > > > branches. > > > > > > Hmm, how's this different from adding another indirection? That is, > > > I don't understand why the tailcall is the issue here, shouldn't unwind > > > still stop at the noexcept caller? Thus, isn't this wrong CFI instead? > > > > noexcept caller is no longer on the stack so the unwinder does not see it. > > It is not the tail call from a normal function to a noexcept that is > > an issue but rather inside a noexcept caller to a normal function. > > Hmm, OK, so essentially a tail-call cannot be represented in the CFI > program. Right. Because the "caller" frame no longer exists. > > > Of course I know to little about this. > > > > > > Btw, doesn't your check also prevent tail/sibling calls when > > > the caller wraps it into a try { } catch (...) {}? Or does unwind > > > not work in that case either? 
> > > > > > Btw, I'd like to see a runtime testcase that fails. > > > > There is one in the bug report. Though it would not work for the > > testsuite. It should not be hard to change it to be one that works > > for the testsuite. > > With dg-additional-sources and registering a custom std::terminate > it should work I guess (or by catching SIGABRT). > > The patch and the bug also suggests that an internally > throwing function cannot be tail-called either (can't find a testcase > we'd mark as tail-call here) If you mean a call wrapped in try/catch, that is correct. The tail-call optimization breaks all exception handlers, so the patch prevents it if the call can throw and is in an active exception region. Jason
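Jason's last point can be illustrated with a small, self-contained sketch. The names `may_throw`, `guarded` and `demo_guarded` are made up for illustration and do not come from the patch or the PR. The call in `guarded` is syntactically in tail position, but it sits inside an active exception region: if the caller's frame were replaced by a sibling call, the handler could no longer run.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical callee: throws for negative input, otherwise doubles it.
int may_throw(int x) {
    if (x < 0)
        throw std::runtime_error("negative input");
    return x * 2;
}

// The call below is in tail position, yet the frame of guarded() has to
// stay alive while may_throw() runs so that the catch handler can run.
int guarded(int x) {
    try {
        return may_throw(x);
    } catch (...) {
        return -1;  // only reachable if this frame still exists
    }
}

// Usage: both the normal and the exceptional path go through guarded().
void demo_guarded() {
    assert(guarded(3) == 6);
    assert(guarded(-1) == -1);
}
```

The noexcept case from the original report is analogous under this reading: the caller carries an implicit region that has to be found during unwinding, so its frame cannot be dropped either.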
[PATCH] auto-inc-dec: Set alignment properly
When auto-inc-dec creates a new mem to compute the cost of doing some
transform, it forgets to copy over the alignment of the original mem.
This gives wrong costs; for example, on rs6000 a floating point load or
store is hugely expensive if unaligned.

This patch fixes it.  This doesn't fix any test case I'm aware of, but
it is a very simple patch.  Is it okay for trunk?


Segher


2019-04-17  Segher Boessenkool

	* auto-inc-dec.c (attempt_change): Set the alignment of the
	temporary memory to that of the original.

---
 gcc/auto-inc-dec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index 43400cc..bdb6efa 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -471,6 +471,7 @@ attempt_change (rtx new_addr, rtx inc_reg)
   int regno;
   rtx mem = *mem_insn.mem_loc;
   machine_mode mode = GET_MODE (mem);
+  int align = MEM_ALIGN (mem);
   rtx new_mem;
   int old_cost = 0;
   int new_cost = 0;
@@ -478,6 +479,7 @@ attempt_change (rtx new_addr, rtx inc_reg)
   PUT_MODE (mem_tmp, mode);
   XEXP (mem_tmp, 0) = new_addr;
+  set_mem_align (mem_tmp, align);
   old_cost = (set_src_cost (mem, mode, speed)
 	      + set_rtx_cost (PATTERN (inc_insn.insn), speed));
-- 
1.8.3.1
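To make the effect described above concrete, here is a toy model; it is not GCC code. The `Mem` struct, the cost numbers, and the byte-alignment default for the temporary are all assumptions chosen for illustration. It only shows how a temporary that does not inherit the original alignment can be costed as an unaligned access.

```cpp
#include <cassert>

// Hypothetical stand-in for a memory reference: access size and known
// alignment, both in bits.
struct Mem {
    unsigned size_bits;
    unsigned align_bits;
};

// Made-up cost model: an access aligned to at least its own size costs 1,
// anything else costs 10.
int access_cost(const Mem& m) {
    return m.align_bits >= m.size_bits ? 1 : 10;
}

// Build the temporary used for costing.  Without copy_align the temporary
// keeps an assumed default of byte alignment.
Mem make_cost_temp(const Mem& orig, bool copy_align) {
    Mem tmp{orig.size_bits, 8};
    if (copy_align)
        tmp.align_bits = orig.align_bits;
    return tmp;
}

// Usage: a 64-bit aligned access is only costed correctly when the
// alignment is carried over to the temporary.
void demo_align_cost() {
    Mem orig{64, 64};
    assert(access_cost(make_cost_temp(orig, false)) == 10);
    assert(access_cost(make_cost_temp(orig, true)) == access_cost(orig));
}
```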
Re: [Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c
On 10/04/2019 23:03, Steve Ellcey wrote: > > Here is another patch to fix one of the failures > listed in PR rtl-optimization/87763. This change > fixes gcc.target/aarch64/lsl_asr_sbfiz.c by adding > an alternative version of *ashiftsi_extv_bfiz that > has a subreg in it. > > Tested with bootstrap and regression test run. > > OK for checkin? > > Steve Ellcey > > > 2018-04-10 Steve Ellcey > > PR rtl-optimization/87763 > * config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt): > New Instruction. > > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index e0df975..04dc06f 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -5634,6 +5634,22 @@ >[(set_attr "type" "bfx")] > ) > > +(define_insn "*ashiftsi_extv_bfiz_alt" > + [(set (match_operand:SI 0 "register_operand" "=r") > + (ashift:SI > + (subreg:SI > + (sign_extract:DI > + (subreg:DI (match_operand:SI 1 "register_operand" "r") 0) > + (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n") > + (const_int 0)) > + 0) > + (match_operand 3 "aarch64_simd_shift_imm_si" "n")))] > + "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), > + 1, GET_MODE_BITSIZE (SImode) - 1)" > + "sbfiz\\t%w0, %w1, %3, %2" > + [(set_attr "type" "bfx")] > +) > + > ;; When the bit position and width of the equivalent extraction add up to 32 > ;; we can use a W-reg LSL instruction taking advantage of the implicit > ;; zero-extension of the X-reg. > I don't think this is right for big-endian, where the subreg offset is not zero. Perhaps you should look at using subreg_lowpart_operator. Due to that, I think this also needs some test cases. R.
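The big-endian concern can be sketched with a simplified model of where the low-order part of a wider value lives. This is not GCC code: the function name is hypothetical, and it assumes byte and word endianness agree and that the outer mode is no wider than the inner one. Under those assumptions, the SImode low part of a DImode value is at byte offset 0 on little-endian but at byte offset 4 on big-endian, which is why a pattern hard-coding offset 0 only matches the low part on little-endian targets.

```cpp
#include <cassert>

// Simplified model: byte offset of the low-order outer_bytes of a value
// that is inner_bytes wide.  Assumes byte and word endianness agree.
unsigned lowpart_byte_offset(unsigned inner_bytes, unsigned outer_bytes,
                             bool big_endian) {
    assert(outer_bytes <= inner_bytes);
    return big_endian ? inner_bytes - outer_bytes : 0;
}

// Usage: SImode (4 bytes) low part of a DImode (8 bytes) value.
void demo_lowpart() {
    assert(lowpart_byte_offset(8, 4, false) == 0);  // little-endian
    assert(lowpart_byte_offset(8, 4, true) == 4);   // big-endian
}
```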
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On Wed, Apr 17, 2019 at 4:48 PM Martin Liška wrote:
>
> On 4/17/19 10:14 AM, Hongtao Liu wrote:
> > Any other comments, I'll merge this to trunk?
>
> Hi.
>
> I don't understand you. The patch in its original version will not be
> installed to trunk and I'll rework it to not support AVX512* (except
> AVX512F) in target_clone attribute.
>
> Martin

Sorry, I've sent the mail to the wrong address, please ignore it.

--
BR,
Hongtao
Re: Enable BF16 support (Please ignore my former email)
On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu wrote: > > On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak wrote: > > > > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu wrote: > > > > > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak wrote: > > > > > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao > > > > wrote: > > > > > > > > > > Hi : > > > > > This patch is about to enable support for bfloat16 which will be > > > > > in Future Cooper Lake, Please refer to > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference > > > > > for more details about BF16. > > > > > > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, > > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural > > > > > Network Instructions supporting: > > > > > > > > > > - VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed > > > > > BF16 Data. > > > > > - VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data. > > > > > - VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed > > > > > Single Precision. > > > > > > > > > > Since only BF16 intrinsics are supported, we treat it as HI for > > > > > simplicity. > > > > > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using HImode > > > > instead of HFmode. Is there a compelling reason not to introduce > > > > corresponding bf16_format supporting infrastructure and declare these > > > > intrinsics using half-binary (HBmode ?) mode instead? > > > > > > > > Uros. > > > > > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for. > > > > True. 
> > > > > The IEEE 754 standard specifies a binary16 as having the following format: > > > Sign bit: 1 bit > > > Exponent width: 5 bits > > > Significand precision: 11 bits (10 explicitly stored) > > > > > > Bfloat16 has the following format: > > > Sign bit: 1 bit > > > Exponent width: 8 bits > > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24 > > > bits in a classical single-precision floating-point format > > > > This is why I proposed to introduce HBmode (and corresponding > > bfloat16_format) to distingush between ieee HFmode and BFmode. > > > > Unless there is BF16 language level support, HBmode has no advantage > over HImode. We can add HBmode when we gain BF16 language support. > > -- > H.J. Any other comments, I'll merge this to trunk? -- BR, Hongtao
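The format comparison quoted above (8 exponent bits, 7 stored significand bits) means a bfloat16 value has the same layout as the upper 16 bits of a single-precision float. The sketch below shows that relationship only. The function names are made up, it assumes `float` is IEEE 754 binary32, and it truncates rather than rounds, so it is not a model of what the AVX512BF16 conversion instructions compute.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Keep sign, exponent and the top 7 significand bits; drop the rest.
// Truncation only: no rounding and no special handling of NaNs.
std::uint16_t float_to_bf16_trunc(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return static_cast<std::uint16_t>(bits >> 16);
}

// Widening back is exact: the dropped low 16 bits become zero.
float bf16_to_float(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Usage: values with short significands survive the round trip.
void demo_bf16() {
    assert(float_to_bf16_trunc(1.0f) == 0x3F80);
    assert(float_to_bf16_trunc(-2.0f) == 0xC000);
    assert(bf16_to_float(0x3F80) == 1.0f);
}
```

Storing the 16-bit pattern in a plain integer, as this sketch does, is also roughly what treating the type as HImode amounts to: the bits are carried around, but nothing knows they encode a floating-point value.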
Re: [PATCH][RFC] Improve get_qualified_type linear list walk
On Tue, 16 Apr 2019, Michael Matz wrote:

> Hi,
>
> On Tue, 16 Apr 2019, Richard Biener wrote:
>
> > Comments?
>
> I was quickly testing also with some early-outs but didn't get conclusive
> performance results (but really only superficial testing) so I'm not
> proposing it, like so:
>
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index 7045284..33f56f9 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -1508,6 +1508,10 @@ same_type_ignoring_top_level_qualifiers_p (tree
>    if (type1 == error_mark_node || type2 == error_mark_node)
>      return false;
>
> +  if (type1 == type2)
> +    return true;

This one reduces the number of get_qualified_type calls by about 10%.
Probably worth doing.  Another smallish improvement is using
strip_top_quals which does nothing for ARRAY_TYPE.

Btw, the new get_qualified_type shows (with the above patch applied)

  if (TYPE_QUALS (type) == type_quals)
    return type;
// 0.3% hit

  tree mv = TYPE_MAIN_VARIANT (type);
  if (check_qualified_type (mv, type, type_quals))
    return mv;
// 43.8% hit

for the C++ FE the LRU cache effectively moves the unqualified
variants first in the variant list.  Since we always first
build the unqualified variants before the qualified ones
the unqualified ones tend to be at the end of the list.  That's
clearly bad for the C++ pattern of repeatedly looking up the
unqualified type variant from a type.  Of course a direct
shortcut would be much cheaper here (but it obviously isn't
the main variant due to TYPE_NAME differences).

So do you think the change to get_qualified_type is OK?  Or
do we absolutely want to avoid changing the variant list from
a function like this?

Thanks,
Richard.
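The LRU behaviour discussed above is essentially a move-to-front heuristic on a linear list. A minimal sketch follows, using `std::list` instead of GCC's variant chain; `Variant` and `find_move_to_front` are hypothetical names, and the real lookup compares more than qualifier bits.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for a type variant: only its qualifier bits.
struct Variant {
    int quals;
};

// Linear search for a variant with the given qualifiers.  On a hit, the
// node is moved to the front so that repeated lookups of the same
// variant terminate early.  Returns nullptr when nothing matches.
const Variant* find_move_to_front(std::list<Variant>& variants, int quals) {
    for (auto it = variants.begin(); it != variants.end(); ++it) {
        if (it->quals == quals) {
            // splice relinks the node; no element is copied or destroyed.
            variants.splice(variants.begin(), variants, it);
            return &variants.front();
        }
    }
    return nullptr;
}

// Usage: the unqualified variant (quals == 0) starts at the end of the
// list and ends up first after one lookup.
void demo_move_to_front() {
    std::list<Variant> variants = {Variant{1}, Variant{2}, Variant{0}};
    const Variant* hit = find_move_to_front(variants, 0);
    assert(hit != nullptr && hit->quals == 0);
    assert(variants.front().quals == 0);
    assert(variants.size() == 3);
}
```

The lookup mutates the list order, which is exactly the side effect the question at the end of the message is about.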
Re: Enable BF16 support (Please ignore my former email)
On Wed, Apr 17, 2019 at 12:29 PM Hongtao Liu wrote: > > On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu wrote: > > > > On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak wrote: > > > > > > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu wrote: > > > > > > > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak wrote: > > > > > > > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao > > > > > wrote: > > > > > > > > > > > > Hi : > > > > > > This patch is about to enable support for bfloat16 which will > > > > > > be in Future Cooper Lake, Please refer to > > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference > > > > > > for more details about BF16. > > > > > > > > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, > > > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural > > > > > > Network Instructions supporting: > > > > > > > > > > > > - VCVTNE2PS2BF16: Convert Two Packed Single Data to One > > > > > > Packed BF16 Data. > > > > > > - VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 > > > > > > Data. > > > > > > - VDPBF16PS: Dot Product of BF16 Pairs Accumulated into > > > > > > Packed Single Precision. > > > > > > > > > > > > Since only BF16 intrinsics are supported, we treat it as HI for > > > > > > simplicity. > > > > > > > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using HImode > > > > > instead of HFmode. Is there a compelling reason not to introduce > > > > > corresponding bf16_format supporting infrastructure and declare these > > > > > intrinsics using half-binary (HBmode ?) mode instead? > > > > > > > > > > Uros. > > > > > > > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for. > > > > > > True. 
> > > > > > > The IEEE 754 standard specifies a binary16 as having the following > > > > format: > > > > Sign bit: 1 bit > > > > Exponent width: 5 bits > > > > Significand precision: 11 bits (10 explicitly stored) > > > > > > > > Bfloat16 has the following format: > > > > Sign bit: 1 bit > > > > Exponent width: 8 bits > > > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24 > > > > bits in a classical single-precision floating-point format > > > > > > This is why I proposed to introduce HBmode (and corresponding > > > bfloat16_format) to distingush between ieee HFmode and BFmode. > > > > > > > Unless there is BF16 language level support, HBmode has no advantage > > over HImode. We can add HBmode when we gain BF16 language support. > > > > -- > > H.J. > > Any other comments, I'll merge this to trunk? It is not a regression, so please no. Uros.
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
* Richard Biener:

> On Wed, Apr 17, 2019 at 9:19 AM Thomas König wrote:
>>
>> Hi,
>>
>> thanks a lot for the extensive discussion :-)
>>
>> How should we now proceed, first for gcc 9, and then for backporting?
>> Use Richard's patch with the corresponding Fortran FE change?
>
> Btw, for the testcase the fortran FE could also simply opt to not
> make def_init TREE_READONLY.  Or even better, for all-zero
> initialization omit the explicit initialization data and instead
> mark it specially in the vtable (just use a NULL initializer
> denoting zero-initialization?).  Even .bss costs (runtime) memory.

Not just that, .bss adds to the commit charge, while .rodata would not.
So it's not clear that using .bss for zero constants is always a win.

Thanks,
Florian
[PATCH] [ARC][COMMITTED] Fix diagnostic messages.
Apply upper/dot rule on diagnostic messages. gcc/ -xx-xx Claudiu Zissulescu * config/arc/arc.c (arc_init): Format diagnostic string. (arc_override_options): Likewise. (check_if_valid_regno_const): Likewise. (arc_reorg): Likewise. --- gcc/ChangeLog| 7 +++ gcc/config/arc/arc.c | 22 -- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 9480e693c08..3820fae8ee7 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2019-04-17 Claudiu Zissulescu + + * config/arc/arc.c (arc_init): Format diagnostic string. + (arc_override_options): Likewise. + (check_if_valid_regno_const): Likewise. + (arc_reorg): Likewise. + 2019-04-17 Segher Boessenkool PR target/17108 diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 65eef30747a..1a04f9ef793 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -950,13 +950,13 @@ arc_init (void) /* FPX-4. No FPX extensions mixed with FPU extensions. */ if ((TARGET_DPFP_FAST_SET || TARGET_DPFP_COMPACT_SET || TARGET_SPFP) && TARGET_HARD_FLOAT) -error ("No FPX/FPU mixing allowed"); +error ("no FPX/FPU mixing allowed"); /* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. */ if (flag_pic && TARGET_ARC600_FAMILY) { warning (0, - "PIC is not supported for %s. Generating non-PIC code only..", + "PIC is not supported for %s. 
Generating non-PIC code only", arc_cpu_string); flag_pic = 0; } @@ -1222,26 +1222,26 @@ arc_override_options (void) do { \ if ((!(arc_selected_cpu->arch_info->flags & CODE)) \ && (VAR == VAL))\ - error ("Option %s=%s is not available for %s CPU.", \ + error ("option %s=%s is not available for %s CPU", \ DOC0, DOC1, arc_selected_cpu->name); \ if ((arc_selected_cpu->arch_info->dflags & CODE) \ && (VAR != DEFAULT_##VAR) \ && (VAR != VAL))\ - warning (0, "Option %s is ignored, the default value %s" \ - " is considered for %s CPU.", DOC0, DOC1,\ + warning (0, "option %s is ignored, the default value %s" \ + " is considered for %s CPU", DOC0, DOC1, \ arc_selected_cpu->name); \ } while (0); #define ARC_OPT(NAME, CODE, MASK, DOC) \ do { \ if ((!(arc_selected_cpu->arch_info->flags & CODE)) \ && (target_flags & MASK)) \ - error ("Option %s is not available for %s CPU", \ + error ("option %s is not available for %s CPU", \ DOC, arc_selected_cpu->name); \ if ((arc_selected_cpu->arch_info->dflags & CODE) \ && (target_flags_explicit & MASK) \ && (!(target_flags & MASK)))\ - warning (0, "Unset option %s is ignored, it is always" \ - " enabled for %s CPU.", DOC, \ + warning (0, "unset option %s is ignored, it is always" \ + " enabled for %s CPU", DOC, \ arc_selected_cpu->name); \ } while (0); @@ -7268,7 +7268,8 @@ check_if_valid_regno_const (rtx *operands, int opno) case CONST_INT : return true; default: - error ("register number must be a compile-time constant. Try giving higher optimization levels"); + error ("register number must be a compile-time constant. " + "Try giving higher optimization levels"); break; } return false; @@ -8261,7 +8262,8 @@ arc_reorg (void) cfun->machine->ccfsm_current_insn = NULL_RTX; if (!INSN_ADDRESSES_SET_P()) - fatal_error (input_location, "Insn addresses not set after shorten_branches"); + fatal_error (input_location, + "insn addresses not set after shorten_branches"); for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) { -- 2.20.1
Re: [PATCH PR90078]Capping comp_cost computation in ivopts
On Wed, Apr 17, 2019 at 3:10 PM Jakub Jelinek wrote: > > On Wed, Apr 17, 2019 at 02:13:12PM +0800, bin.cheng wrote: > > Hi, > > As discussed in PR90078, this patch checks possible infinite_cost overflow > > in ivopts. > > Also as discussed, overflow happens mostly because of cost scaling wrto > > bb_freq/loop_freq. > > For the moment, we only implement capping in comp_cost operators, while in > > next > > stage1, we may instead implement capping in get_scaled_computation_cost_at > > with > > more supporting benchmark data. > > > > BTW, I think switching costs around comparison between infinite_cost is > > unnecessary > > since there will be no overflow in integer after capping with infinite_cost. > > > > Bootstrap and test on x86_64, is it OK? > > > > Thanks, > > bin > > > > 2019-04-17 Bin Cheng > > > > PR tree-optimization/92078 > > * tree-ssa-loop-ivopts.c (comp_cost::operator +,-,+=,-+,/=,*=): Add > > checks for infinite_cost overflow. > > > > 2018-04-17 Bin Cheng > > > > PR tree-optimization/92078 > > * gcc/testsuite/g++.dg/tree-ssa/pr90078.C: New test. > > --- a/gcc/tree-ssa-loop-ivopts.c > +++ b/gcc/tree-ssa-loop-ivopts.c > @@ -243,6 +243,9 @@ operator+ (comp_cost cost1, comp_cost cost2) >if (cost1.infinite_cost_p () || cost2.infinite_cost_p ()) > return infinite_cost; > > + if (cost1.cost + cost2.cost >= infinite_cost.cost) > +return infinite_cost; > > As > #define INFTY 1000 > what is the reason to keep the previous condition as well? > I mean, if cost1.cost == INFTY or cost2.cost == INFTY, > cost1.cost + cost2.cost >= INFTY too. > Unless costs can go negative. It's a bit complicated, but in general, costs can go negative. 
> > @@ -256,6 +259,8 @@ operator- (comp_cost cost1, comp_cost cost2) > return infinite_cost; > >gcc_assert (!cost2.infinite_cost_p ()); > + if (cost1.cost - cost2.cost >= infinite_cost.cost) > +return infinite_cost; > > Unless costs can be negative, when you first bail out > for cost1.cost == INFTY, then cost1.cost - cost2.cost won't > be INFTY (but could get negative). So shouldn't there be a guard against > that instead? Or, if costs can be negative, shouldn't there be also > guards that it doesn't grow too negative (say smaller than -INFTY)? Negative cost is kind of a result of booking cost cancellation in a different place. For example, it mostly comes from modeling auto-increment/decrement addressing modes. To be specific, when an IV's increment instruction can be merged into the addressing mode, we cancel the cost of the IV increment operation in the cand-use cost. Very likely 4 will be subtracted. In general, we don't expect a negative cost to grow too large, so there is no -INFTY logic in ivopts at all. So this is the least invasive fix for the moment; I would consider capping bb_freq/loop_freq in the future, which should rule out the overflow possibility in the first place. Thanks, bin > > Jakub
Re: Enable BF16 support (Please ignore my former email)
On Wed, Apr 17, 2019 at 1:03 PM Uros Bizjak wrote: > > On Wed, Apr 17, 2019 at 12:29 PM Hongtao Liu wrote: > > > > On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu wrote: > > > > > > On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak wrote: > > > > > > > > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu wrote: > > > > > > > > > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak wrote: > > > > > > > > > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao > > > > > > wrote: > > > > > > > > > > > > > > Hi : > > > > > > > This patch is about to enable support for bfloat16 which will > > > > > > > be in Future Cooper Lake, Please refer to > > > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference > > > > > > > for more details about BF16. > > > > > > > > > > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, > > > > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural > > > > > > > Network Instructions supporting: > > > > > > > > > > > > > > - VCVTNE2PS2BF16: Convert Two Packed Single Data to One > > > > > > > Packed BF16 Data. > > > > > > > - VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 > > > > > > > Data. > > > > > > > - VDPBF16PS: Dot Product of BF16 Pairs Accumulated into > > > > > > > Packed Single Precision. > > > > > > > > > > > > > > Since only BF16 intrinsics are supported, we treat it as HI for > > > > > > > simplicity. > > > > > > > > > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using > > > > > > HImode > > > > > > instead of HFmode. Is there a compelling reason not to introduce > > > > > > corresponding bf16_format supporting infrastructure and declare > > > > > > these > > > > > > intrinsics using half-binary (HBmode ?) mode instead? > > > > > > > > > > > > Uros. > > > > > > > > > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for. > > > > > > > > True. 
> > > > > > > > > The IEEE 754 standard specifies a binary16 as having the following > > > > > format: > > > > > Sign bit: 1 bit > > > > > Exponent width: 5 bits > > > > > Significand precision: 11 bits (10 explicitly stored) > > > > > > > > > > Bfloat16 has the following format: > > > > > Sign bit: 1 bit > > > > > Exponent width: 8 bits > > > > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24 > > > > > bits in a classical single-precision floating-point format > > > > > > > > This is why I proposed to introduce HBmode (and corresponding > > > > bfloat16_format) to distingush between ieee HFmode and BFmode. > > > > > > > > > > Unless there is BF16 language level support, HBmode has no advantage > > > over HImode. We can add HBmode when we gain BF16 language support. > > > > > > -- > > > H.J. > > > > Any other comments, I'll merge this to trunk? > > It is not a regression, so please no. Ehm, "regression fix" ... Uros.
Re: [PATCH] [ARC][COMMITTED] Fix diagnostic messages.
On Wed, Apr 17, 2019 at 02:09:33PM +0300, Claudiu Zissulescu wrote: >/* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. > */ >if (flag_pic && TARGET_ARC600_FAMILY) > { >warning (0, > -"PIC is not supported for %s. Generating non-PIC code only..", > +"PIC is not supported for %s. Generating non-PIC code only", > arc_cpu_string); I believe this is undesirable too. Either use something like "PIC is not supported for %s; generating non-PIC code only" or split that into two messages if (warning (0, "PIC is not supported for %s", arc_cpu_string)) inform (input_location, "generating non-PIC code only"); > @@ -1222,26 +1222,26 @@ arc_override_options (void) >do { \ > if ((!(arc_selected_cpu->arch_info->flags & CODE)) \ > && (VAR == VAL))\ > - error ("Option %s=%s is not available for %s CPU.",\ > + error ("option %s=%s is not available for %s CPU", \ >DOC0, DOC1, arc_selected_cpu->name); \ I think another complaint in the PR was that it is unclear what those DOC0/DOC1/DOC strings stand for, if they are keywords on what one writes on the command line or similar (then it should be quoted, %qs or %<%s=%s%>), if it is something different, then maybe it is not the right thing to construct a translatable sentence from that error/warning gmsgid string and one or more words that are inserted somewhere into the sentence. At least for the ARC_OPT the latter seems to be the case, given e.g.: ARC_OPT (FL_LL64, (1ULL << 5), MASK_LL64, "double load/store") ARC_OPT (FL_BS, (1ULL << 6), MASK_BARREL_SHIFTER,"barrel shifter") Is barrel shifter a keyword, or just random words added into the sentence? If the latter, then the translators might want to translate that too, but in that case together with the surroundings too. 
ARC_OPT (FL_SPFP, (1ULL << 12), MASK_SPFP_COMPACT_SET, "single precission FPX") ARC_OPT (FL_DPFP, (1ULL << 13), MASK_DPFP_COMPACT_SET, "double precission FPX") has spelling errors, s/precission/precision/g > if ((arc_selected_cpu->arch_info->dflags & CODE) \ > && (VAR != DEFAULT_##VAR) \ > && (VAR != VAL))\ > - warning (0, "Option %s is ignored, the default value %s" \ > -" is considered for %s CPU.", DOC0, DOC1,\ > + warning (0, "option %s is ignored, the default value %s" \ > +" is considered for %s CPU", DOC0, DOC1, \ > arc_selected_cpu->name); \ > } while (0); > #define ARC_OPT(NAME, CODE, MASK, DOC) \ >do { \ > if ((!(arc_selected_cpu->arch_info->flags & CODE)) \ > && (target_flags & MASK)) \ > - error ("Option %s is not available for %s CPU",\ > + error ("option %s is not available for %s CPU",\ >DOC, arc_selected_cpu->name); \ > if ((arc_selected_cpu->arch_info->dflags & CODE) \ > && (target_flags_explicit & MASK) \ > && (!(target_flags & MASK)))\ > - warning (0, "Unset option %s is ignored, it is always" \ > -" enabled for %s CPU.", DOC, \ > + warning (0, "unset option %s is ignored, it is always" \ > +" enabled for %s CPU", DOC, \ > arc_selected_cpu->name); \ >} while (0); > > @@ -7268,7 +7268,8 @@ check_if_valid_regno_const (rtx *operands, int opno) > case CONST_INT : >return true; > default: > - error ("register number must be a compile-time constant. Try giving > higher optimization levels"); > + error ("register number must be a compile-time constant. " > +"Try giving higher optimization levels"); Similarly to the above case. Jakub
Re: [PATCH PR90078]Capping comp_cost computation in ivopts
On Wed, Apr 17, 2019 at 07:14:05PM +0800, Bin.Cheng wrote:
> > As
> > #define INFTY 1000
> > what is the reason to keep the previous condition as well?
> > I mean, if cost1.cost == INFTY or cost2.cost == INFTY,
> > cost1.cost + cost2.cost >= INFTY too.
> > Unless costs can go negative.
> It's a bit complicated, but in general, costs can go negative.

Ok, no objections from me then (but as I don't know anything about it, not
an ack either; you are ivopts maintainer, so you don't need one).

	Jakub
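To illustrate the point about negative costs, here is a minimal sketch. It is not the ivopts code: `kInfty` and `add_costs` are hypothetical names, and the sentinel value is simply the one quoted above. It only shows why the explicit operand checks are not redundant once a cost may be negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the "infinite" cost sentinel quoted above.
constexpr std::int64_t kInfty = 1000;

// Add two costs, saturating at kInfty.  If costs could never be negative,
// checking the sum alone would be enough.  With negative costs, an
// infinite operand plus a negative one sums to less than kInfty, so the
// operands have to be checked as well.
constexpr std::int64_t add_costs(std::int64_t a, std::int64_t b)
{
  if (a >= kInfty || b >= kInfty)
    return kInfty;
  const std::int64_t sum = a + b;
  return sum >= kInfty ? kInfty : sum;
}

// Infinite plus negative stays infinite.
static_assert(add_costs(kInfty, -5) == kInfty);
// Two finite costs whose sum overshoots are capped.
static_assert(add_costs(600, 600) == kInfty);
// An ordinary sum is left alone.
static_assert(add_costs(10, -3) == 7);
```

Dropping the first `if` would make `add_costs(kInfty, -5)` return 995, i.e. a finite cost, which is the case the original condition guards against.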
Re: [PATCH] Fix up RTL DCE find_call_stack_args (PR rtl-optimization/89965)
Hi,

On Tue, 16 Apr 2019, Jeff Law wrote:

> So going back to Jakub's patch... I think the discussion points to
> avoiding the REG_EQUIV notes for outgoing argument slots.

In the long run definitely, but maybe his current solution is more amenable
to stage 4, no idea.

Ciao,
Michael.
Re: [PATCH][RFC] Improve get_qualified_type linear list walk
Hi,

On Wed, 17 Apr 2019, Richard Biener wrote:

> for the C++ FE the LRU cache effectively moves the unqualified
> variants first in the variant list.  Since we always first
> build the unqualified variants before the qualified ones
> the unqualified ones tend to be at the end of the list.  That's
> clearly bad for the C++ pattern of repeatedly looking up the
> unqualified type variant from a type.  Of course a direct
> shortcut would be much cheaper here (but it obviously isn't
> the main variant due to TYPE_NAME differences).
>
> So do you think the change to get_qualified_type is OK?  Or
> do we absolutely want to avoid changing the variant list from
> a function like this?

I think changing the variant list in this accessor should be okay.  For it
not to be okay some callers would have to remember a particular subset of
that list and also care about the order of that subset.  That would be
fragile no matter what.

I had the additional idea to only move the non-qualified variant to the
front, i.e. not really LRU.  By that we would slowly establish the
invariant that unqualified variants are early in the list; or
alternatively add a combination of
build_variant_type_copy+set_type_quals which would establish that
invariant directly.  But unlike a real LRU cache it's harder to see if
this brings similar benefits as the scheme is then lopsided towards the
specific case of looking up unqualified variants.

Ciao,
Michael.
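The move-to-front idea discussed above can be sketched on a plain singly linked list. This is a much simplified illustration, not GCC's variant list: `TypeNode` and `find_variant_mtf` are hypothetical names, and unlike the real list there is no main variant pinned at the head.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a type node.  Variants of one type
// are chained through `next`; `quals` is 0 for the unqualified variant.
struct TypeNode {
  unsigned quals;
  TypeNode* next;
};

// Look up the variant with the given qualifiers in the list headed by
// *head.  If it is found and is not already first, unlink it and reinsert
// it at the front, so that repeated lookups of the same variant stop early.
inline TypeNode* find_variant_mtf(TypeNode** head, unsigned quals)
{
  TypeNode* prev = nullptr;
  for (TypeNode* t = *head; t != nullptr; prev = t, t = t->next)
    if (t->quals == quals)
      {
        if (prev != nullptr)
          {
            prev->next = t->next;  // unlink from the current position
            t->next = *head;       // reinsert at the front
            *head = t;
          }
        return t;
      }
  return nullptr;
}
```

As noted in the reply, a caller that holds on to a position in the list and depends on the order would be affected by such a lookup; the sketch only reorders nodes and never drops or duplicates one. Restricting the move to `quals == 0` would give the "only move the unqualified variant" flavour mentioned above.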
Re: [PATCH][RFC] Improve get_qualified_type linear list walk
Hi,

On Tue, 16 Apr 2019, Jakub Jelinek wrote:

> > +  if (type1 == type2)
> > +    return true;
> > +  if (TYPE_MAIN_VARIANT (type1) != TYPE_MAIN_VARIANT (type2))
> > +    return false;
>
> Is this second one correct though?  Doesn't comptypes return for various
> cases true even if the TYPE_MAIN_VARIANT is different?

Right, that was a thinko.  As I said, I rushed this somewhat :)

Ciao,
Michael.
[PATCH] S/390: Fix PR89952 incorrect CFI
This patch fixes a case where inconsistent CFI is generated.

After restoring the hard frame pointer (r11) from an FPR we have to
set the CFA register.  In order to be able to set it back to the stack
pointer (r15) we have to make sure that r15 has been restored already.

The patch also adds a scheduler dependency to prevent the instruction
scheduler from swapping the r11 and r15 restore again.

gcc/ChangeLog:

2019-04-17  Andreas Krebbel

	PR target/89952
	* config/s390/s390.c (s390_restore_gprs_from_fprs): Restore GPRs
	from FPRs in reverse order.  Generate REG_CFA_DEF_CFA note also
	for restored hard frame pointer.
	(s390_sched_dependencies_evaluation): Implement new target hook.
	(TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK): New macro definition.

gcc/testsuite/ChangeLog:

2019-04-17  Andreas Krebbel

	PR target/89952
	* gcc.target/s390/pr89952.c: New test.
---
 gcc/config/s390/s390.c                  | 62 +++--
 gcc/testsuite/gcc.target/s390/pr89952.c | 12 +++
 2 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr89952.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index ad8eacd..fc4571d 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10685,7 +10685,11 @@ s390_restore_gprs_from_fprs (void)
   if (!TARGET_Z10 || !TARGET_HARD_FLOAT || !crtl->is_leaf)
     return;
 
-  for (i = 6; i < 16; i++)
+  /* Restore the GPRs starting with the stack pointer.  That way the
+     stack pointer already has its original value when it comes to
+     restoring the hard frame pointer.  So we can set the cfa reg back
+     to the stack pointer.  */
+  for (i = STACK_POINTER_REGNUM; i >= 6; i--)
     {
       rtx_insn *insn;
 
@@ -10701,7 +10705,13 @@ s390_restore_gprs_from_fprs (void)
 
       df_set_regs_ever_live (i, true);
       add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, i));
-      if (i == STACK_POINTER_REGNUM)
+
+      /* If either the stack pointer or the frame pointer get restored
+         set the CFA value to its value at function start.  Doing this
+         for the frame pointer results in .cfi_def_cfa_register 15
+         what is ok since if the stack pointer got modified it has
+         been restored already.  */
+      if (i == STACK_POINTER_REGNUM || i == HARD_FRAME_POINTER_REGNUM)
         add_reg_note (insn, REG_CFA_DEF_CFA,
                       plus_constant (Pmode, stack_pointer_rtx,
                                      STACK_POINTER_OFFSET));
@@ -16294,6 +16304,49 @@ s390_case_values_threshold (void)
   return default_case_values_threshold ();
 }
 
+/* Evaluate the insns between HEAD and TAIL and do back-end to install
+   back-end specific dependencies.
+
+   Establish an ANTI dependency between r11 and r15 restores from FPRs
+   to prevent the instructions scheduler from reordering them since
+   this would break CFI.  No further handling in the sched_reorder
+   hook is required since the r11 and r15 restore will never appear in
+   the same ready list with that change.  */
+void
+s390_sched_dependencies_evaluation (rtx_insn *head, rtx_insn *tail)
+{
+  if (!frame_pointer_needed || !epilogue_completed)
+    return;
+
+  while (head != tail && DEBUG_INSN_P (head))
+    head = NEXT_INSN (head);
+
+  rtx_insn *r15_restore = NULL, *r11_restore = NULL;
+
+  for (rtx_insn *insn = tail; insn != head; insn = PREV_INSN (insn))
+    {
+      rtx set = single_set (insn);
+      if (!INSN_P (insn)
+          || !RTX_FRAME_RELATED_P (insn)
+          || set == NULL_RTX
+          || !REG_P (SET_DEST (set))
+          || !FP_REG_P (SET_SRC (set)))
+        continue;
+
+      if (REGNO (SET_DEST (set)) == HARD_FRAME_POINTER_REGNUM)
+        r11_restore = insn;
+
+      if (REGNO (SET_DEST (set)) == STACK_POINTER_REGNUM)
+        r15_restore = insn;
+    }
+
+  if (r11_restore == NULL || r15_restore == NULL)
+    return;
+  add_dependence (r11_restore, r15_restore, REG_DEP_ANTI);
+}
+
+
+
 /* Initialize GCC target structure.  */
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -16585,6 +16638,11 @@ s390_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD s390_case_values_threshold
 
+#undef TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK
+#define TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK \
+  s390_sched_dependencies_evaluation
+
+
 /* Use only short displacement, since long displacement is not available for
    the floating point instructions.  */
 #undef TARGET_MAX_ANCHOR_OFFSET
diff --git a/gcc/testsuite/gcc.target/s390/pr89952.c b/gcc/testsuite/gcc.target/s390/pr89952.c
new file mode 100644
index 000..9f48e08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr89952.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=zEC12 -fno-omit-frame-pointer -Os" } */
+
+
+extern void j(int);
+
+void
+d(int e, long f, int g, int h, int i) {
+  if (h == 5 && i >= 4 && i <= 7)
+    h =
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
On Apr 17 2019, Florian Weimer wrote:

> Not just that, .bss adds to the commit charge,

Only one page at most.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
* Andreas Schwab:

> On Apr 17 2019, Florian Weimer wrote:
>
>> Not just that, .bss adds to the commit charge,
>
> Only one page at most.

That would be a bug.  All of it is anonymous memory which needs backing
from RAM or swap, in case the process writes to it.

Thanks,
Florian
collect2 patch to https in URL
Hello

Change the "collect2 -help" output to have https URL:

Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html

2019-04-14  Jonny Grant

	* collect2.c: Change gcc.gnu.org URL to HTTPS

Thank you
Jonny

Index: gcc/collect2.c
===================================================================
--- gcc/collect2.c	(revision 270408)
+++ gcc/collect2.c	(working copy)
@@ -1640,7 +1640,7 @@
       printf (" --help Display this information\n");
       printf (" -v, --version Display this program's version number\n");
       printf ("\n");
-      printf ("Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      printf ("Overview: https://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
       printf ("Report bugs: %s\n", bug_report_url);
       printf ("\n");
     }
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
On Apr 17 2019, Florian Weimer wrote:

> * Andreas Schwab:
>
>> On Apr 17 2019, Florian Weimer wrote:
>>
>>> Not just that, .bss adds to the commit charge,
>>
>> Only one page at most.
>
> That would be a bug.

You cannot avoid it for the page shared with .data, unless you force
.bss to be page aligned.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
* Andreas Schwab:

> On Apr 17 2019, Florian Weimer wrote:
>
>> * Andreas Schwab:
>>
>>> On Apr 17 2019, Florian Weimer wrote:
>>>
>>>> Not just that, .bss adds to the commit charge,
>>>
>>> Only one page at most.
>>
>> That would be a bug.
>
> You cannot avoid it for the page shared with .data, unless you force
> .bss to be page aligned.

Would you please elaborate?

With "commit charge", I mean address space accounted towards the commit
limit, when Linux is running in vm.overcommit_memory=2 mode.

Thanks,
Florian
[PATCH] Add constexpr to std::optional::value_or(U&&)&&
In C++1z drafts up to N4606 the constexpr keyword was missing from the detailed description of this function, despite being shown in the class synopsis. That was fixed editorially for N4618, but our implementation was not corrected to match. * include/std/optional (optional::value_or(U&&) &&): Add missing constexpr specifier. * testsuite/20_util/optional/constexpr/observers/4.cc: Check value_or for disengaged optionals and rvalue optionals. * testsuite/20_util/optional/observers/4.cc: Likewise. Tested powerpc64le-linux, committed to trunk. I will backport this to gcc-8-branch too. commit ce471593e4ce944807efad1d0fa7ed5d0a53da1e Author: Jonathan Wakely Date: Wed Apr 17 13:57:14 2019 +0100 Add constexpr to std::optional::value_or(U&&)&& In C++1z drafts up to N4606 the constexpr keyword was missing from the detailed description of this function, despite being shown in the class synopsis. That was fixed editorially for N4618, but our implementation was not corrected to match. * include/std/optional (optional::value_or(U&&) &&): Add missing constexpr specifier. * testsuite/20_util/optional/constexpr/observers/4.cc: Check value_or for disengaged optionals and rvalue optionals. * testsuite/20_util/optional/observers/4.cc: Likewise. 
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional index d243930fed4..503d859bee6 100644 --- a/libstdc++-v3/include/std/optional +++ b/libstdc++-v3/include/std/optional @@ -959,7 +959,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } template - _Tp + constexpr _Tp value_or(_Up&& __u) && { static_assert(is_move_constructible_v<_Tp>); diff --git a/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc b/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc index 1f7f0e8b6a2..a085f53f8fa 100644 --- a/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc +++ b/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc @@ -25,10 +25,42 @@ struct value_type int i; }; -int main() +void test01() { constexpr std::optional o { value_type { 51 } }; constexpr value_type fallback { 3 }; - static_assert( o.value_or(fallback).i == 51, "" ); - static_assert( o.value_or(fallback).i == (*o).i, "" ); + static_assert( o.value_or(fallback).i == 51 ); + static_assert( o.value_or(fallback).i == (*o).i ); +} + +void test02() +{ + constexpr std::optional o; + constexpr value_type fallback { 3 }; + static_assert( o.value_or(fallback).i == 3 ); +} + +template + constexpr std::optional + make_rvalue(T t) + { return std::optional{t}; } + +void test03() +{ + constexpr value_type fallback { 3 }; + static_assert( make_rvalue(value_type{51}).value_or(fallback).i == 51 ); +} + +void test04() +{ + constexpr value_type fallback { 3 }; + static_assert( make_rvalue(std::nullopt).value_or(fallback).i == 3 ); +} + +int main() +{ + test01(); + test02(); + test03(); + test04(); } diff --git a/libstdc++-v3/testsuite/20_util/optional/observers/4.cc b/libstdc++-v3/testsuite/20_util/optional/observers/4.cc index c24e4e6856e..5d608cdeaf7 100644 --- a/libstdc++-v3/testsuite/20_util/optional/observers/4.cc +++ b/libstdc++-v3/testsuite/20_util/optional/observers/4.cc @@ -26,10 +26,42 @@ struct value_type int i; }; -int main() +void test01() { 
std::optional o { value_type { 51 } }; value_type fallback { 3 }; VERIFY( o.value_or(fallback).i == 51 ); VERIFY( o.value_or(fallback).i == (*o).i ); } + +void test02() +{ + std::optional o; + value_type fallback { 3 }; + VERIFY( o.value_or(fallback).i == 3 ); +} + +void test03() +{ + std::optional o { value_type { 51 } }; + value_type fallback { 3 }; + VERIFY( std::move(o).value_or(fallback).i == 51 ); + VERIFY( o.has_value() ); + VERIFY( std::move(o).value_or(fallback).i == (*o).i ); +} + +void test04() +{ + std::optional o; + value_type fallback { 3 }; + VERIFY( std::move(o).value_or(fallback).i == 3 ); + VERIFY( !o.has_value() ); +} + +int main() +{ + test01(); + test02(); + test03(); + test04(); +}
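For readers who want to try the effect of the fix in isolation, here is a small sketch along the lines of the new tests, assuming `-std=c++17` and a library that already carries the `constexpr` specifier. The template arguments (`std::optional<T>`, `std::optional<value_type>`) are spelled out here as an assumption about what the tests intend; `make_rvalue` mirrors the helper of the same name in the patch.

```cpp
#include <cassert>
#include <optional>

struct value_type { int i; };

// Returns a prvalue optional, so that value_or below selects the
// &&-qualified overload, which is the one the patch makes constexpr.
template<typename T>
  constexpr std::optional<T>
  make_rvalue(T t)
  { return std::optional<T>{t}; }

constexpr value_type fallback { 3 };

// Engaged rvalue optional: the contained value is returned.
static_assert( make_rvalue(value_type{51}).value_or(fallback).i == 51 );

// Disengaged rvalue optional: the fallback is returned.
static_assert( std::optional<value_type>{}.value_or(fallback).i == 3 );
```

With a library that lacks the specifier on the rvalue overload, both assertions should be rejected as non-constant expressions, while the lvalue form `o.value_or(fallback)` on a `constexpr` optional is unaffected.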
Re: [Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c
On 4/17/19 4:19 AM, Richard Earnshaw (lists) wrote:
> On 10/04/2019 23:03, Steve Ellcey wrote:
>>
>> Here is another patch to fix one of the failures
>> listed in PR rtl-optimization/87763. This change
>> fixes gcc.target/aarch64/lsl_asr_sbfiz.c by adding
>> an alternative version of *ashiftsi_extv_bfiz that
>> has a subreg in it.
>>
>> Tested with bootstrap and regression test run.
>>
>> OK for checkin?
>>
>> Steve Ellcey
>>
>>
>> 2018-04-10  Steve Ellcey
>>
>>	PR rtl-optimization/87763
>>	* config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt):
>>	New Instruction.
>>
>>
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index e0df975..04dc06f 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -5634,6 +5634,22 @@
>>    [(set_attr "type" "bfx")]
>>  )
>>
>> +(define_insn "*ashiftsi_extv_bfiz_alt"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +        (ashift:SI
>> +          (subreg:SI
>> +            (sign_extract:DI
>> +              (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +              (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n")
>> +              (const_int 0))
>> +            0)
>> +          (match_operand 3 "aarch64_simd_shift_imm_si" "n")))]
>> +  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
>> +             1, GET_MODE_BITSIZE (SImode) - 1)"
>> +  "sbfiz\\t%w0, %w1, %3, %2"
>> +  [(set_attr "type" "bfx")]
>> +)
>> +
>>  ;; When the bit position and width of the equivalent extraction add up to 32
>>  ;; we can use a W-reg LSL instruction taking advantage of the implicit
>>  ;; zero-extension of the X-reg.
>>
>
> I don't think this is right for big-endian, where the subreg offset is
> not zero.  Perhaps you should look at using subreg_lowpart_operator.

As general guidance, anytime I see a subreg in the .md files I suspect
we've likely gone the wrong direction at some point.  That doesn't mean
we can't use subregs, nor does it mean it's wrong in this instance, but
it certainly makes me look at the changes more carefully to see if we can
do something earlier or later so that we're not matching subreg
expressions in the md files.

I agree that in this specific case, it's likely incorrect.
subreg_lowpart_* would likely help, either as a predicate or as an
operator.

jeff
Re: [PATCH][C++] Improve compile-time by ordering expensive checks last
On Tue, 16 Apr 2019, Richard Biener wrote:

>
> Two cases from a -fsyntax-only tramp3d callgrind profile.

Amended by two others.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2019-04-17  Richard Biener

	cp/
	* call.c (null_ptr_cst_p): Order checks according to expensiveness.
	(conversion_null_warnings): Likewise.
	* typeck.c (same_type_ignoring_top_level_qualifiers_p): Return
	early if type1 == type2.

Index: gcc/cp/call.c
===================================================================
--- gcc/cp/call.c	(revision 270407)
+++ gcc/cp/call.c	(working copy)
@@ -541,11 +541,11 @@ null_ptr_cst_p (tree t)
       STRIP_ANY_LOCATION_WRAPPER (t);
 
       /* Core issue 903 says only literal 0 is a null pointer constant.  */
-      if (TREE_CODE (type) == INTEGER_TYPE
-          && !char_type_p (type)
-          && TREE_CODE (t) == INTEGER_CST
+      if (TREE_CODE (t) == INTEGER_CST
+          && !TREE_OVERFLOW (t)
+          && TREE_CODE (type) == INTEGER_TYPE
           && integer_zerop (t)
-          && !TREE_OVERFLOW (t))
+          && !char_type_p (type))
         return true;
     }
   else if (CP_INTEGRAL_TYPE_P (type))
@@ -6844,8 +6844,9 @@ static void
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
   /* Issue warnings about peculiar, but valid, uses of NULL.  */
-  if (null_node_p (expr) && TREE_CODE (totype) != BOOLEAN_TYPE
-      && ARITHMETIC_TYPE_P (totype))
+  if (TREE_CODE (totype) != BOOLEAN_TYPE
+      && ARITHMETIC_TYPE_P (totype)
+      && null_node_p (expr))
     {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
       if (fn)
@@ -6882,8 +6883,8 @@ conversion_null_warnings (tree totype, t
     }
   /* Handle zero as null pointer warnings for cases other
      than EQ_EXPR and NE_EXPR */
-  else if (null_ptr_cst_p (expr) &&
-           (TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype)))
+  else if ((TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype))
+           && null_ptr_cst_p (expr))
     {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
       maybe_warn_zero_as_null_pointer_constant (expr, loc);
Index: gcc/cp/typeck.c
===================================================================
--- gcc/cp/typeck.c	(revision 270407)
+++ gcc/cp/typeck.c	(working copy)
@@ -1508,6 +1508,8 @@ same_type_ignoring_top_level_qualifiers_
 {
   if (type1 == error_mark_node || type2 == error_mark_node)
     return false;
+  if (type1 == type2)
+    return true;
 
   type1 = cp_build_qualified_type (type1, TYPE_UNQUALIFIED);
   type2 = cp_build_qualified_type (type2, TYPE_UNQUALIFIED);
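The reordering above relies on `&&` evaluating left to right and stopping at the first false operand. A minimal sketch of the effect, with hypothetical predicates standing in for the front-end checks (none of these names come from GCC):

```cpp
#include <cassert>

// Counts how often the hypothetical expensive predicate actually runs.
inline int expensive_calls = 0;

// Stand-in for a cheap test such as a TREE_CODE comparison.
inline bool cheap_check(int x) { return (x & 1) == 0; }

// Stand-in for a costly, side-effect-free test; the counter exists only
// to make the number of evaluations observable.
inline bool expensive_check(int x)
{
  ++expensive_calls;
  return x % 3 == 0;
}

// Same result either way, because both predicates are pure; only the
// number of expensive evaluations differs.
inline bool expensive_first(int x) { return expensive_check(x) && cheap_check(x); }
inline bool cheap_first(int x) { return cheap_check(x) && expensive_check(x); }
```

Over the inputs 0 to 9, `cheap_first` runs the expensive predicate only for the five even values, while `expensive_first` runs it all ten times. The transformation is only safe when the reordered operands have no side effects and none of them guards the validity of a later one.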
Re: [PATCH] Fixup IRA debug dump output
On 4/16/19 12:47 PM, Peter Bergner wrote:
> The patch below fixes the issue not continuing if the allocno's conflict
> array is null and instead guarding the current conflict prints by that
> test.  If the conflict array is null, we instead now print out simple
> empty conflict info.  This now gives us what we'd expect to see:
>
> ;; a5(r116,l0) conflicts:
> ;; total conflict hard regs:
> ;; conflict hard regs:
>
>
> cp0:a0(r111)<->a4(r117)@330:move

Actually, if we keep the continue, it makes the patch smaller and more
readable.  How about this instead which gives the same output as the
previous patch?

Peter

	* ira-conflicts.c (print_allocno_conflicts): Always print something,
	even for allocnos with no conflicts.
	(print_conflicts): Print an extra newline.

Index: gcc/ira-conflicts.c
===================================================================
--- gcc/ira-conflicts.c	(revision 270331)
+++ gcc/ira-conflicts.c	(working copy)
@@ -633,7 +631,12 @@ print_allocno_conflicts (FILE * file, bo
       ira_object_conflict_iterator oci;
 
       if (OBJECT_CONFLICT_ARRAY (obj) == NULL)
-        continue;
+        {
+          fprintf (file, "\n;; total conflict hard regs:\n");
+          fprintf (file, ";; conflict hard regs:\n\n");
+          continue;
+        }
+
       if (n > 1)
         fprintf (file, "\n;; subobject %d:", i);
       FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
@@ -683,6 +686,7 @@ print_conflicts (FILE *file, bool reg_p)
 
   FOR_EACH_ALLOCNO (a, ai)
     print_allocno_conflicts (file, reg_p, a);
+  putc ('\n', file);
 }
 
 /* Print information about allocno or only regno (if REG_P) conflicts
[PATCH] Fix up dg-extract-results.sh
Hi!

On Tue, Apr 16, 2019 at 09:26:46PM +0200, Christophe Lyon wrote:
> > Actually, I managed to reproduce in a Fedora 31 chroot, in which I don't
> > have /usr/bin/python installed (I think in Fedora 30+ there is
> > /usr/bin/python2 and /usr/bin/python3 but not /usr/bin/python, at least not
> > in the default buildroot).

So, I've grabbed 11 *.log and 11 *.sum files from testsuite/gcc*/, injected
a couple of
-PASS: gcc.c-torture/execute/20001009-2.c -O1 execution test
+WARNING: program timed out.
+FAIL: gcc.c-torture/execute/20001009-2.c -O1 execution test
changes into them (both to *.log and *.sum) and tested, each time with
dg-extract-results.sh -L *.log > logN
and
dg-extract-results.sh *.sum > sumN
The tested versions were:
1) gcc-8
2) current trunk
3) current trunk with this patch
(all of them with /usr/bin/python available, so effectively using the python
version) and
4) gcc-8
5) current trunk
6) current trunk with this patch
contrib/dg-extract-results.sh copied to /tmp/dg-extract-results.sh so that
it doesn't find the python version and thus uses awk.

I found a couple of issues in the patch I've sent earlier, so this contains
fixes.

973bf331c5f223a08a4724289635fe43  log1
973bf331c5f223a08a4724289635fe43  log2
973bf331c5f223a08a4724289635fe43  log3
b7fde321188f9d60265c2801fb8e81e9  log4
26e7dc514ab063b99d2929759826814b  log5
b7fde321188f9d60265c2801fb8e81e9  log6
d6a24e581653e284d9db118cca48f72c  sum1
ca25461808ea1f9b061409fe096f286f  sum2
ca25461808ea1f9b061409fe096f286f  sum3
33e194e093632290a5d5bd16cb15ca10  sum4
f82f4a60a095655d7359700b7bf688e1  sum5
f82f4a60a095655d7359700b7bf688e1  sum6

Thus, there is no change in -L log mode generation between any of those
3 python versions (note, the patch just changes a comment typo in the *.py,
so 2 and 3 are expected to be identical with -L), and apart from the broken
trunk handling of -L, gcc 8 as well as trunk with this patch generate the
same logfile too (though, not identical to python).

As for the sum mode, gcc 8 generated with both python and sh/awk different
results from trunk and trunk with patch, with the WARNING: program timed out.
lines sorted together at one spot, while trunk and trunk with patch emit
identical result (though, again, python generates different ordering from
sh/awk).

So, I believe with this patch the results are exactly as I expect them,
the *.sum WARNING: thing is improved as Christophe wanted, while *.log
files, which are totally broken on current trunk when not using python, are
fixed again.

The incremental fixes from the previous patch are using the correct operator
for $0 matching and also, as we use a timeout_cnt value of 0 with the meaning
that no timeout message needs to be handled, but in theory
WARNING: program timed out.
could appear also with cnt == 0, I've made that var contain otherwise
cnt + 1 and subtract 1 again when printing it.

Ok for trunk?

2019-04-17  Jakub Jelinek

	* dg-extract-results.sh: Only handle WARNING: program timed out
	lines specially in "$MODE" == "sum".  Restore previous behavior
	for "$MODE" != "sum".  Clear has_timeout and timeout_cnt if in
	a different variant or curfile is empty.
	* dg-extract-results.py: Fix a typo.
--- contrib/dg-extract-results.sh.jj2019-03-05 21:49:34.471573434 +0100 +++ contrib/dg-extract-results.sh 2019-04-17 17:35:53.718285283 +0200 @@ -331,13 +331,15 @@ BEGIN { # Ugly hack for gfortran.dg/dg.exp if ("$TOOL" == "gfortran" && testname ~ /^gfortran.dg\/g77\//) testname="h"testname - if (\$1 == "WARNING:" && \$2 == "program" && \$3 == "timed" && (\$4 == "out" || \$4 == "out.")) { -has_timeout=1 -timeout_cnt=cnt - } else { - # Prepare timeout replacement message in case it's needed -timeout_msg=\$0 -sub(\$1, "WARNING:", timeout_msg) + if ("$MODE" == "sum") { +if (\$0 ~ /^WARNING: program timed out/) { + has_timeout=1 + timeout_cnt=cnt+1 +} else { + # Prepare timeout replacement message in case it's needed + timeout_msg=\$0 + sub(\$1, "WARNING:", timeout_msg) +} } } /^$/ { if ("$MODE" == "sum") next } @@ -345,25 +347,30 @@ BEGIN { if ("$MODE" == "sum") { # Do not print anything if the current line is a timeout if (has_timeout == 0) { -# If the previous line was a timeout, -# insert the full current message without keyword -if (timeout_cnt != 0) { - printf "%s %08d|%s program timed out.\n", testname, timeout_cnt, timeout_msg >> curfile - timeout_cnt = 0 - cnt = cnt + 1 -} -printf "%s %08d|", testname, cnt >> curfile -cnt = cnt + 1 -filewritten[curfile]=1 -need_close=1 -if (timeout_cnt == 0) - print >> curfile + # If the previous line was a timeout, + # insert the full current message without keyword + if (timeout_cnt != 0) { + printf "%s %08d|%s program timed out.\n", testname, timeout_cnt-1,
[PATCH 1/3] Fix tests for std::variant to match original intention
* testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to actually match its name. (MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly. (test_swap()): Fix result for MoveCtorOnly and check MoveCtorAndSwapOnly. Tested powerpc64le-linux. commit 855e2fb029adf77f6189f01b1a8d86dc2cca2464 Author: Jonathan Wakely Date: Wed Apr 17 14:55:39 2019 +0100 Fix tests for std::variant to match original intention * testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to actually match its name. (MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly. (test_swap()): Fix result for MoveCtorOnly and check MoveCtorAndSwapOnly. diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc index 04fef0be13f..5a2d91709a0 100644 --- a/libstdc++-v3/testsuite/20_util/variant/compile.cc +++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc @@ -54,12 +54,15 @@ struct DefaultNoexcept struct MoveCtorOnly { MoveCtorOnly() noexcept = delete; - MoveCtorOnly(const DefaultNoexcept&) noexcept = delete; - MoveCtorOnly(DefaultNoexcept&&) noexcept { } - MoveCtorOnly& operator=(const DefaultNoexcept&) noexcept = delete; - MoveCtorOnly& operator=(DefaultNoexcept&&) noexcept = delete; + MoveCtorOnly(const MoveCtorOnly&) noexcept = delete; + MoveCtorOnly(MoveCtorOnly&&) noexcept { } + MoveCtorOnly& operator=(const MoveCtorOnly&) noexcept = delete; + MoveCtorOnly& operator=(MoveCtorOnly&&) noexcept = delete; }; +struct MoveCtorAndSwapOnly : MoveCtorOnly { }; +void swap(MoveCtorAndSwapOnly&, MoveCtorAndSwapOnly&) { } + struct nonliteral { nonliteral() { } @@ -259,7 +262,8 @@ static_assert( !std::is_swappable_v> ); void test_swap() { static_assert(is_swappable_v>, ""); - static_assert(is_swappable_v>, ""); + static_assert(!is_swappable_v>, ""); + static_assert(is_swappable_v>, ""); static_assert(!is_swappable_v>, ""); }
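The intent behind the corrected `MoveCtorOnly` can be checked on the types themselves, without involving `std::variant`. The sketch below copies the two type definitions from the patch and adds a few `static_assert`s that are my own illustration rather than part of the testsuite; it assumes `-std=c++17`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// As in the patch: only the move constructor is usable.
struct MoveCtorOnly
{
  MoveCtorOnly() noexcept = delete;
  MoveCtorOnly(const MoveCtorOnly&) noexcept = delete;
  MoveCtorOnly(MoveCtorOnly&&) noexcept { }
  MoveCtorOnly& operator=(const MoveCtorOnly&) noexcept = delete;
  MoveCtorOnly& operator=(MoveCtorOnly&&) noexcept = delete;
};

// As in the patch: same type, plus a swap overload found by ADL.
struct MoveCtorAndSwapOnly : MoveCtorOnly { };
void swap(MoveCtorAndSwapOnly&, MoveCtorAndSwapOnly&) { }

// The type now matches its name.
static_assert(std::is_move_constructible_v<MoveCtorOnly>);
static_assert(!std::is_copy_constructible_v<MoveCtorOnly>);
static_assert(!std::is_move_assignable_v<MoveCtorOnly>);

// Generic std::swap needs move assignment, so MoveCtorOnly is not swappable.
static_assert(!std::is_swappable_v<MoveCtorOnly>);

// The derived type is swappable only through its own overload.
static_assert(std::is_swappable_v<MoveCtorAndSwapOnly>);
```

Before the fix the special members took `DefaultNoexcept` parameters, so they were ordinary converting constructors and assignment operators, and the type's real copy and move operations were never declared as the name suggests.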
[PATCH 2/3] Remove unnecessary string literals from static_assert in C++17 tests
Remove unnecessary string literals from static_assert in C++17 tests The string literal is optional in C++17 and all these are empty so add no value. Tested powerpc64le-linux. commit 028676a32fa51c0116e3c117a36550dd04cd39fe Author: Jonathan Wakely Date: Wed Apr 17 14:57:41 2019 +0100 Remove unnecessary string literals from static_assert in C++17 tests The string literal is optional in C++17 and all these are empty so add no value. * testsuite/20_util/variant/compile.cc: Remove empty string literals from static_assert declarations. diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc index 5a2d91709a0..b67c98adf4a 100644 --- a/libstdc++-v3/testsuite/20_util/variant/compile.cc +++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc @@ -77,59 +77,59 @@ struct nonliteral void default_ctor() { - static_assert(is_default_constructible_v>, ""); - static_assert(is_default_constructible_v>, ""); - static_assert(!is_default_constructible_v>, ""); - static_assert(is_default_constructible_v>, ""); + static_assert(is_default_constructible_v>); + static_assert(is_default_constructible_v>); + static_assert(!is_default_constructible_v>); + static_assert(is_default_constructible_v>); - static_assert(noexcept(variant()), ""); - static_assert(!noexcept(variant()), ""); - static_assert(noexcept(variant()), ""); + static_assert(noexcept(variant())); + static_assert(!noexcept(variant())); + static_assert(noexcept(variant())); } void copy_ctor() { - static_assert(is_copy_constructible_v>, ""); - static_assert(!is_copy_constructible_v>, ""); - static_assert(is_trivially_copy_constructible_v>, ""); - static_assert(!is_trivially_copy_constructible_v>, ""); + static_assert(is_copy_constructible_v>); + static_assert(!is_copy_constructible_v>); + static_assert(is_trivially_copy_constructible_v>); + static_assert(!is_trivially_copy_constructible_v>); { variant a; -static_assert(noexcept(variant(a)), ""); 
+static_assert(noexcept(variant(a))); } { variant a; -static_assert(!noexcept(variant(a)), ""); +static_assert(!noexcept(variant(a))); } { variant a; -static_assert(!noexcept(variant(a)), ""); +static_assert(!noexcept(variant(a))); } { variant a; -static_assert(noexcept(variant(a)), ""); +static_assert(noexcept(variant(a))); } } void move_ctor() { - static_assert(is_move_constructible_v>, ""); - static_assert(!is_move_constructible_v>, ""); - static_assert(is_trivially_move_constructible_v>, ""); - static_assert(!is_trivially_move_constructible_v>, ""); - static_assert(!noexcept(variant(declval>())), ""); - static_assert(noexcept(variant(declval>())), ""); + static_assert(is_move_constructible_v>); + static_assert(!is_move_constructible_v>); + static_assert(is_trivially_move_constructible_v>); + static_assert(!is_trivially_move_constructible_v>); + static_assert(!noexcept(variant(declval>(; + static_assert(noexcept(variant(declval>(; } void arbitrary_ctor() { - static_assert(!is_constructible_v, const char*>, ""); - static_assert(is_constructible_v, const char*>, ""); - static_assert(noexcept(variant(int{})), ""); - static_assert(noexcept(variant(int{})), ""); - static_assert(!noexcept(variant(Empty{})), ""); - static_assert(noexcept(variant(DefaultNoexcept{})), ""); + static_assert(!is_constructible_v, const char*>); + static_assert(is_constructible_v, const char*>); + static_assert(noexcept(variant(int{}))); + static_assert(noexcept(variant(int{}))); + static_assert(!noexcept(variant(Empty{}))); + static_assert(noexcept(variant(DefaultNoexcept{}))); } void in_place_index_ctor() @@ -142,105 +142,105 @@ void in_place_type_ctor() { variant a(in_place_type, "a"); variant b(in_place_type, {'a'}); - static_assert(!is_constructible_v, in_place_type_t, const char*>, ""); + static_assert(!is_constructible_v, in_place_type_t, const char*>); } void dtor() { - static_assert(is_destructible_v>, ""); - static_assert(is_destructible_v>, ""); + static_assert(is_destructible_v>); 
+ static_assert(is_destructible_v>); } void copy_assign() { - static_assert(is_copy_assignable_v>, ""); - static_assert(!is_copy_assignable_v>, ""); - static_assert(is_trivially_copy_assignable_v>, ""); - static_assert(!is_trivially_copy_assignable_v>, ""); + static_assert(is_copy_assignable_v>); + static_assert(!is_copy_assignable_v>); + static_assert(is_trivially_copy_assignable_v>); + static_assert(!is_trivially_copy_assignable_v>); { variant a; -static_assert(!noexcept(a = a), ""); +static_assert(!noexcept(a = a)); } { variant a; -static_assert(noexcept(a = a), ""); +static_assert(noexcept(a = a)); } } void move_assign() { - static_assert(is_move_assignable_v>, ""); - static_assert(!is_move_assignable_v>, ""); - static_as
[PATCH 3/3] Fix condition for std::variant to be copy constructible
The standard says the std::variant copy constructor is defined as deleted unless all alternative types are copy constructible, but we were making it also depend on move constructible. Fix the condition and enhance the tests to check the semantics with pathological copy-only types (i.e. supporting copying but having deleted moves). The enhanced tests revealed a regression in copy assignment for non-trivial alternative types, where the assignment would not be performed because the condition in the _Copy_assign_base visitor is false: is_same_v, remove_reference_t>. Tested powerpc64le-linux. I plan to commit all three of these patches later today, unless somebody sees a problem with them. commit a5a517df4933ffd0e6a08c42280c7d2ee0699904 Author: Jonathan Wakely Date: Wed Apr 17 16:17:25 2019 +0100 Fix condition for std::variant to be copy constructible The standard says the std::variant copy constructor is defined as deleted unless all alternative types are copy constructible, but we were making it also depend on move constructible. Fix the condition and enhance the tests to check the semantics with pathological copy-only types (i.e. supporting copying but having deleted moves). The enhanced tests revealed a regression in copy assignment for non-trivial alternative types, where the assignment would not be performed because the condition in the _Copy_assign_base visitor is false: is_same_v, remove_reference_t>. * include/std/variant (__detail::__variant::_Traits::_S_copy_assign): Do not depend on whether all alternative types are move constructible. (__detail::__variant::_Copy_assign_base::operator=): Remove cv-quals from the operand when deciding whether to perform the assignment. * testsuite/20_util/variant/compile.cc (DeletedMoves): Define type with deleted move constructor and deleted move assignment operator. (default_ctor, copy_ctor, move_ctor, copy_assign, move_assign): Check behaviour of variants with DeletedMoves as an alternative. 
* testsuite/20_util/variant/run.cc (DeletedMoves): Define same type. (move_ctor, move_assign): Check that moving a variant with a DeletedMoves alternative falls back to copying instead of moving. diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant index 22b0c3d5c22..e153363bbf3 100644 --- a/libstdc++-v3/include/std/variant +++ b/libstdc++-v3/include/std/variant @@ -279,7 +279,7 @@ namespace __variant static constexpr bool _S_move_ctor = (is_move_constructible_v<_Types> && ...); static constexpr bool _S_copy_assign = - _S_copy_ctor && _S_move_ctor + _S_copy_ctor && (is_copy_assignable_v<_Types> && ...); static constexpr bool _S_move_assign = _S_move_ctor @@ -613,7 +613,7 @@ namespace __variant __variant::__get<__rhs_index>(*this); if constexpr (is_same_v< remove_reference_t, - remove_reference_t>) + __remove_cvref_t>) __this_mem = __rhs_mem; } } diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc index b67c98adf4a..5cc2a9460a9 100644 --- a/libstdc++-v3/testsuite/20_util/variant/compile.cc +++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc @@ -63,6 +63,15 @@ struct MoveCtorOnly struct MoveCtorAndSwapOnly : MoveCtorOnly { }; void swap(MoveCtorAndSwapOnly&, MoveCtorAndSwapOnly&) { } +struct DeletedMoves +{ + DeletedMoves() = default; + DeletedMoves(const DeletedMoves&) = default; + DeletedMoves(DeletedMoves&&) = delete; + DeletedMoves& operator=(const DeletedMoves&) = default; + DeletedMoves& operator=(DeletedMoves&&) = delete; +}; + struct nonliteral { nonliteral() { } @@ -81,6 +90,7 @@ void default_ctor() static_assert(is_default_constructible_v>); static_assert(!is_default_constructible_v>); static_assert(is_default_constructible_v>); + static_assert(is_default_constructible_v>); static_assert(noexcept(variant())); static_assert(!noexcept(variant())); @@ -93,6 +103,7 @@ void copy_ctor() static_assert(!is_copy_constructible_v>); 
static_assert(is_trivially_copy_constructible_v>); static_assert(!is_trivially_copy_constructible_v>); + static_assert(is_trivially_copy_constructible_v>); { variant a; @@ -116,6 +127,7 @@ void move_ctor() { static_assert(is_move_constructible_v>); static_assert(!is_move_constructible_v>); + static_assert(is_move_constructible_v>); // uses copy ctor static_assert(is_trivially_move_constructible_v>); static_assert(!is_trivially_move_constructible_v>); static_assert(!noexcept(variant(declval>(; @@ -157,6 +169,7 @@ void copy_assign() static_assert(!is_copy_assignable_v>); static_assert(is_trivially_copy_assignable_v>); static_assert(!is_trivially_copy_assignable_v>); + static_assert(is_t
Re: [PATCH 1/3] Fix tests for std::variant to match original intention
On Wed, 17 Apr 2019 at 19:07, Jonathan Wakely wrote: > > * testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to > actually match its name. > (MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly. > (test_swap()): Fix result for MoveCtorOnly and check > MoveCtorAndSwapOnly. > > Tested powerpc64le-linux. Looks good to me.
Re: [PATCH 2/3] Remove unnecessary string literals from static_assert in C++17 tests
On Wed, 17 Apr 2019 at 19:09, Jonathan Wakely wrote: > > Remove unnecessary string literals from static_assert in C++17 tests > > The string literal is optional in C++17 and all these are empty so add > no value. > > > Tested powerpc64le-linux. Looks good to me.
Re: [PATCH 3/3] Fix condition for std::variant to be copy constructible
On Wed, 17 Apr 2019 at 19:12, Jonathan Wakely wrote: > > The standard says the std::variant copy constructor is defined as > deleted unless all alternative types are copy constructible, but we were > making it also depend on move constructible. Fix the condition and > enhance the tests to check the semantics with pathological copy-only > types (i.e. supporting copying but having deleted moves). > > The enhanced tests revealed a regression in copy assignment for > non-trivial alternative types, where the assignment would not be > performed because the condition in the _Copy_assign_base visitor is > false: is_same_v, remove_reference_t>. > > > Tested powerpc64le-linux. > > I plan to commit all three of these patches later today, unless > somebody sees a problem with them. Looks good to me.
Re: [PATCH] Fix up dg-extract-results.sh
On Apr 17, 2019, at 8:59 AM, Jakub Jelinek wrote: > Ok for trunk? Ok.
Re: [PATCH] backport r257541, r259936, r260294, r260623, r261098, r261333, r268585.
Hi! On Wed, Apr 17, 2019 at 03:05:06PM +0800, Xiong Hu Luo wrote: > On 2019/4/16 PM6:54, Segher Boessenkool wrote: > > ("be" and "le" are essentially PowerPC-specific selectors on the 7 branch, > > otherwise you'd need a release manager's approval as well). > > Do you mean move the "be" and "le" code from > gcc/testsuite/lib/target-supports.exp to > gcc/testsuite/gcc.target/powerpc/powerpc.exp here? I mean it is okay as you posted it, and I can approve it even though it is in generic code. :-) Segher
C++ PATCH for c++/90124 - bogus error with incomplete type in decltype
This fixes a recent P1. Here we were giving the "invalid use of incomplete type" error, but "the operand of the decltype specifier is an unevaluated operand" and so the objects it names are not required to have a definition. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2019-04-17 Marek Polacek PR c++/90124 - bogus error with incomplete type in decltype. * typeck.c (build_class_member_access_expr): Check cp_unevaluated_operand. * g++.dg/cpp0x/decltype70.C: New test. diff --git gcc/cp/typeck.c gcc/cp/typeck.c index 03b14024738..7224d9bf9ed 100644 --- gcc/cp/typeck.c +++ gcc/cp/typeck.c @@ -2477,7 +2477,8 @@ build_class_member_access_expr (cp_expr object, tree member, /* We didn't complain above about a currently open class, but now we must: we don't know how to refer to a base member before layout is complete. But still don't complain in a template. */ - if (!dependent_type_p (object_type) + if (!cp_unevaluated_operand + && !dependent_type_p (object_type) && !complete_type_or_maybe_complain (object_type, object, complain)) return error_mark_node; diff --git gcc/testsuite/g++.dg/cpp0x/decltype70.C gcc/testsuite/g++.dg/cpp0x/decltype70.C new file mode 100644 index 000..b26aca90651 --- /dev/null +++ gcc/testsuite/g++.dg/cpp0x/decltype70.C @@ -0,0 +1,10 @@ +// PR c++/90124 +// { dg-do compile { target c++11 } } + +class a { +public: + int b; +}; +class c : a { + auto m_fn1() -> decltype(b); +};
Re: C++ PATCH for c++/90124 - bogus error with incomplete type in decltype
On Wed, Apr 17, 2019 at 10:45 AM Marek Polacek wrote: > > This fixes a recent P1. Here we were giving the "invalid use of incomplete > type" error, but "the operand of the decltype specifier is an unevaluated > operand" > and so the objects it names are not required to have a definition. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > 2019-04-17 Marek Polacek > > PR c++/90124 - bogus error with incomplete type in decltype. > * typeck.c (build_class_member_access_expr): Check > cp_unevaluated_operand. OK, thanks. Jason
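For readers following the thread, here is a minimal sketch of the rule the patch relies on. It is not part of the patch. The names `Base`, `Derived` and `get` are hypothetical; the shape mirrors the decltype70.C test case. The operand of `decltype` is unevaluated, so naming an inherited member is accepted while the derived class is still being defined.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical names; the shape mirrors the decltype70.C test case.
struct Base {
  int b;
};

struct Derived : Base {
  // Derived is still incomplete at this point, but the operand of
  // decltype is unevaluated, so naming the inherited member b is allowed.
  auto get() -> decltype(b);
};

// Out-of-class definition so the sketch also links when get() is called.
inline auto Derived::get() -> decltype(b) { return b; }

// decltype(b) with an unparenthesized member name yields its declared type.
static_assert(std::is_same_v<decltype(std::declval<Derived&>().get()), int>,
              "decltype(b) names the declared type of Base::b");
```

The `static_assert` only checks the deduced return type. Whether a given compiler release accepts the in-class declaration depends on whether it carries this fix.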
Re: [PATCH] Fixup IRA debug dump output
On 4/17/19 9:35 AM, Peter Bergner wrote: > On 4/16/19 12:47 PM, Peter Bergner wrote: >> The patch below fixes the issue not continuing if the allocno's conflict >> array is null and instead guarding the current conflict prints by that >> test. If the conflict array is null, we instead now print out simple >> empty conflict info. This now gives us what we'd expect to see: >> >> ;; a5(r116,l0) conflicts: >> ;; total conflict hard regs: >> ;; conflict hard regs: >> >> >> cp0:a0(r111)<->a4(r117)@330:move > > > Actually, if we keep the continue, it makes the patch smaller and more > readable. How about this instead which gives the same output as the > previous patch? > > Peter > > * ira-conflicts.c (print_allocno_conflicts): Always print something, > even for allocno's with no conflicts. > (print_conflicts): Print an extra newline. OK. And while it's technically not a regression fix, I think this can safely go in now :-) jeff
Re: [PATCH] auto-inc-dec: Set alignment properly
On 4/17/19 4:13 AM, Segher Boessenkool wrote: > When auto-inc-dec creates a new mem to compute the cost of doing some > transform, it forgets to copy over the alignment of the original mem. > This gives wrong costs, for example, for rs6000 a floating point load > or store is hugely expensive if unaligned. This patch fixes it. > > This doesn't fix any test case I'm aware of, but it is a very simple > patch. Is it okay for trunk? > > > Segher > > > 2019-04-17 Segher Boessenkool > > * auto-inc-dec.c (attempt_change): Set the alignment of the > temporary memory to that of the original. Given this is only changing the RTL passed into the costing calculations, I think it can go in now. OK for the trunk. jeff
Re: collect2 patch to https in URL
On 4/17/19 6:45 AM, Jonny Grant wrote: > Hello > > Change the "collect2 -help" output to have https URL: > > Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html > > 2019-04-14 Jonny Grant > * collect2.c: Change gcc.gnu.org URL to HTTPS > > > Thank you > Jonny Thanks. I've installed this on the trunk. jeff
[PATCH] Fix up _mm_maskz_f{,n}m{add,sub}_round_s{s,d} at -O0 (PR target/90125)
Hi! The following patch fixes a bunch of pastos in the -O0 macros in the PR89784 implementation plus testcase coverage that FAILs without the header change and succeeds with that (the tests were previously run at -O2 only where they test the inline functions and not the macros). Because at -O0 the C x * y + z isn't contracted into FMA, there is a small precision difference in two of the tests with the chosen constants, so I've changed them to ones where a precision difference isn't really possible. I think the constants weren't chosen very well, because either we just want some basic testing, for which even the adjusted ones are ok, or we want to specifically check for FMA, in that case we should check some FMA cornercases where without FMA the result is completely different from one with FMA. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? And sorry for screwing it up. 2019-04-17 Hongtao Liu PR target/90125 * config/i386/avx512fintrin.h (_mm_maskz_fmadd_round_sd, _mm_maskz_fmadd_round_ss, _mm_maskz_fmsub_round_sd, _mm_maskz_fmsub_round_ss, _mm_maskz_fnmadd_round_sd, _mm_maskz_fnmadd_round_ss, _mm_maskz_fnmsub_round_sd, _mm_maskz_fnmsub_round_ss): Use _maskz builtin instead of _mask3. 2019-04-17 Jakub Jelinek PR target/90125 * gcc.target/i386/avx512f-vfmsubXXXss-2.c (avx512f_test): Adjust constants to ensure precise result even when not using fma. * gcc.target/i386/avx512f-vfnmaddXXXss-2.c (avx512f_test): Likewise. * gcc.target/i386/avx512f-vfmaddXXXsd-3.c: New test. * gcc.target/i386/avx512f-vfmaddXXXss-3.c: New test. * gcc.target/i386/avx512f-vfmsubXXXsd-3.c: New test. * gcc.target/i386/avx512f-vfmsubXXXss-3.c: New test. * gcc.target/i386/avx512f-vfnmaddXXXsd-3.c: New test. * gcc.target/i386/avx512f-vfnmaddXXXss-3.c: New test. * gcc.target/i386/avx512f-vfnmsubXXXsd-3.c: New test. * gcc.target/i386/avx512f-vfnmsubXXXss-3.c: New test. 
--- gcc/config/i386/avx512fintrin.h.jj 2019-03-22 11:07:00.699948784 +0100 +++ gcc/config/i386/avx512fintrin.h 2019-04-17 11:24:53.683695473 +0200 @@ -12104,10 +12104,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, (__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R) #define _mm_maskz_fmadd_round_sd(U, A, B, C, R)\ -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, C, U, R) +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, C, U, R) #define _mm_maskz_fmadd_round_ss(U, A, B, C, R)\ -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R) +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, C, U, R) #define _mm_mask_fmsub_round_sd(A, U, B, C, R)\ (__m128d) __builtin_ia32_vfmaddsd3_mask (A, B, -(C), U, R) @@ -12122,10 +12122,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, (__m128) __builtin_ia32_vfmsubss3_mask3 (A, B, C, U, R) #define _mm_maskz_fmsub_round_sd(U, A, B, C, R)\ -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, -(C), U, R) +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, -(C), U, R) #define _mm_maskz_fmsub_round_ss(U, A, B, C, R)\ -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, -(C), U, R) +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, -(C), U, R) #define _mm_mask_fnmadd_round_sd(A, U, B, C, R)\ (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), C, U, R) @@ -12140,10 +12140,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, (__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R) #define _mm_maskz_fnmadd_round_sd(U, A, B, C, R)\ -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), C, U, R) +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), C, U, R) #define _mm_maskz_fnmadd_round_ss(U, A, B, C, R)\ -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R) +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), C, U, R) #define _mm_mask_fnmsub_round_sd(A, U, B, C, R)\ (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), -(C), U, R) @@ -12158,10 +12158,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, (__m128) __builtin_ia32_vfmsubss3_mask3 (A, -(B), C, U, R) #define 
_mm_maskz_fnmsub_round_sd(U, A, B, C, R)\ -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), -(C), U, R) +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), -(C), U, R) #define _mm_maskz_fnmsub_round_ss(U, A, B, C, R)\ -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), -(C), U, R) +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), -(C), U, R) #endif #ifdef __OPTIMIZE__ --- gcc/testsuite/gcc.target/i386/avx512f-vfmsubXXXss-2.c.jj2019-03-22 11:07:00.701948752 +0100 +++ gcc/testsuite/gcc.target/i386/avx512f-vfmsubXXXss-2.c 2019-04-17 11:35:57.314481901 +0200 @@ -41,8 +41,8 @@ avx512f_test (void) for (i = 0; i < SIZE; i++) { src1.a[i] = DEFAULT_VALUE; -
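The remark about FMA corner cases can be illustrated with a small scalar sketch. It does not use the AVX-512 intrinsics from the patch, and the helper names and constants below are assumptions chosen for this illustration. The idea is to pick inputs where rounding the product before the addition cancels the result to zero, while a single rounding at the end keeps a non-zero value.

```cpp
#include <cmath>

// Computes x*y + z with the product rounded to double before the add.
// The volatile store keeps the compiler from contracting the expression
// into a single fused multiply-add.
inline double mul_then_add(double x, double y, double z) {
  volatile double product = x * y;
  return product + z;
}

// Computes x*y + z with a single rounding at the end.
inline double fused(double x, double y, double z) {
  return std::fma(x, y, z);
}

// Hypothetical corner case: x = 1 + 2^-27 and z = -(1 + 2^-26).
// Exactly, x*x = 1 + 2^-26 + 2^-54, so x*x + z = 2^-54.
// Rounded to double, x*x becomes 1 + 2^-26, so the unfused sum is 0.
inline double corner_x() { return 1.0 + std::ldexp(1.0, -27); }
inline double corner_z() { return -(1.0 + std::ldexp(1.0, -26)); }
```

With these inputs, `mul_then_add(corner_x(), corner_x(), corner_z())` yields zero and `fused(...)` yields 2^-54, assuming IEEE double arithmetic and no fast-math options. A test built on such constants would tell the two code paths apart, whereas constants chosen for exact results would not.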
Re: [PATCH][RFC] Improve get_qualified_type linear list walk
On 4/16/19 6:55 AM, Richard Biener wrote: > > The following makes the C++ FEs heavy use of build_qualified_type > cheaper. When looking at a tramp3d -fsyntax-only compile you can > see that for 470.000 build_qualified_type calls we end up > with 9.492.205 calls to check_qualified_type (thus we visit around > 20 variant type candidates) ending up finding it in all but > 15.300 cases that end up in build_variant_type_copy. > > That's of course because the FE uses this machinery to do things like > > bool > same_type_ignoring_top_level_qualifiers_p (tree type1, tree type2) > { > if (type1 == error_mark_node || type2 == error_mark_node) > return false; > > type1 = cp_build_qualified_type (type1, TYPE_UNQUALIFIED); > type2 = cp_build_qualified_type (type2, TYPE_UNQUALIFIED); > return same_type_p (type1, type2); > > but so it be. The improvement is to re-organize get_qualified_type > to put found type variants on the head of the variant list. This > improves the number of calls to check_qualified_type to 1.215.030 > thus around 2.5 candidates. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > Comments? OK? > > Richard. > > 2019-04-16 Richard Biener > > * tree.c (get_qualified_type): Put found type variants at the > head of the variant list. Seems quite reasonable to me. I just hope we don't find a case where this is the exact worst case behavior ;-) jeff
Re: [PATCH v2] Fix __patchable_function_entries section flags
On 4/15/19 10:31 AM, Joao Moreira wrote: > > > On 4/12/19 1:19 PM, Jeff Law wrote: >> On 4/11/19 11:18 AM, Joao Moreira wrote: >>> When -fpatchable-function-entry is used, gcc places nops on the >>> prologue of each compiled function and creates a section named >>> __patchable_function_entries which holds relocation entries for the >>> positions in which the nops were placed. As is, gcc creates this >>> section without the proper section flags, causing crashes in the >>> compiled program during its load. >>> >>> Given the above, fix the problem by creating the section with the >>> SECTION_WRITE and SECTION_RELRO flags. >>> >>> The problem was noticed while compiling glibc with >>> -fpatchable-function-entry compiler flag. After applying the patch, >>> this issue was solved. >>> >>> This was also tested on x86-64 arch without visible problems under >>> the gcc standard tests. >>> >>> 2019-04-10 Joao Moreira >>> >>> * targhooks.c (default_print_patchable_function_entry): Emit >>> __patchable_function_entries section with writable flags to allow >>> relocation resolution. >> OK. Do you have write access to the GCC repo? >> > No. I went ahead and installed it on the trunk for you. Are you going to be working on GCC regularly? If so it might make sense to go ahead and get that access set up. Jeff
Re: [PATCH] Fix up _mm_maskz_f{,n}m{add,sub}_round_s{s,d} at -O0 (PR target/90125)
On Wed, Apr 17, 2019 at 8:13 PM Jakub Jelinek wrote: > > Hi! > > The following patch fixes a bunch of pastos in the -O0 macros in the > PR89784 implementation plus testcase coverage that FAILs without the header > change and succeeds with that (the tests were previously run at -O2 only > where they test the inline functions and not the macros). > Because at -O0 the C x * y + z isn't contracted into FMA, there is a small > precision difference in two of the tests with the chosen constants, so I've > changed them to ones where a precision difference isn't really possible. > I think the constants weren't chosen very well, because either we just want > some basic testing, for which even the adjusted ones are ok, or we want > to specifically check for FMA, in that case we should check some FMA > cornercases where without FMA the result is completely different from one > with FMA. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > And sorry for screwing it up. > > 2019-04-17 Hongtao Liu > > PR target/90125 > * config/i386/avx512fintrin.h (_mm_maskz_fmadd_round_sd, > _mm_maskz_fmadd_round_ss, _mm_maskz_fmsub_round_sd, > _mm_maskz_fmsub_round_ss, _mm_maskz_fnmadd_round_sd, > _mm_maskz_fnmadd_round_ss, _mm_maskz_fnmsub_round_sd, > _mm_maskz_fnmsub_round_ss): Use _maskz builtin instead of _mask3. > > 2019-04-17 Jakub Jelinek > > PR target/90125 > * gcc.target/i386/avx512f-vfmsubXXXss-2.c (avx512f_test): Adjust > constants to ensure precise result even when not using fma. > * gcc.target/i386/avx512f-vfnmaddXXXss-2.c (avx512f_test): Likewise. > * gcc.target/i386/avx512f-vfmaddXXXsd-3.c: New test. > * gcc.target/i386/avx512f-vfmaddXXXss-3.c: New test. > * gcc.target/i386/avx512f-vfmsubXXXsd-3.c: New test. > * gcc.target/i386/avx512f-vfmsubXXXss-3.c: New test. > * gcc.target/i386/avx512f-vfnmaddXXXsd-3.c: New test. > * gcc.target/i386/avx512f-vfnmaddXXXss-3.c: New test. > * gcc.target/i386/avx512f-vfnmsubXXXsd-3.c: New test. 
> * gcc.target/i386/avx512f-vfnmsubXXXss-3.c: New test. The patch can be committed under obvious rule. Thanks, Uros. > --- gcc/config/i386/avx512fintrin.h.jj 2019-03-22 11:07:00.699948784 +0100 > +++ gcc/config/i386/avx512fintrin.h 2019-04-17 11:24:53.683695473 +0200 > @@ -12104,10 +12104,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, > (__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R) > > #define _mm_maskz_fmadd_round_sd(U, A, B, C, R)\ > -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, C, U, R) > +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, C, U, R) > > #define _mm_maskz_fmadd_round_ss(U, A, B, C, R)\ > -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R) > +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, C, U, R) > > #define _mm_mask_fmsub_round_sd(A, U, B, C, R)\ > (__m128d) __builtin_ia32_vfmaddsd3_mask (A, B, -(C), U, R) > @@ -12122,10 +12122,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, > (__m128) __builtin_ia32_vfmsubss3_mask3 (A, B, C, U, R) > > #define _mm_maskz_fmsub_round_sd(U, A, B, C, R)\ > -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, -(C), U, R) > +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, -(C), U, R) > > #define _mm_maskz_fmsub_round_ss(U, A, B, C, R)\ > -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, -(C), U, R) > +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, -(C), U, R) > > #define _mm_mask_fnmadd_round_sd(A, U, B, C, R)\ > (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), C, U, R) > @@ -12140,10 +12140,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, > (__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R) > > #define _mm_maskz_fnmadd_round_sd(U, A, B, C, R)\ > -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), C, U, R) > +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), C, U, R) > > #define _mm_maskz_fnmadd_round_ss(U, A, B, C, R)\ > -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R) > +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), C, U, R) > > #define _mm_mask_fnmsub_round_sd(A, U, B, C, R)\ > 
(__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), -(C), U, R) > @@ -12158,10 +12158,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U, > (__m128) __builtin_ia32_vfmsubss3_mask3 (A, -(B), C, U, R) > > #define _mm_maskz_fnmsub_round_sd(U, A, B, C, R)\ > -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), -(C), U, R) > +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), -(C), U, R) > > #define _mm_maskz_fnmsub_round_ss(U, A, B, C, R)\ > -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), -(C), U, R) > +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), -(C), U, R) > #endif > > #ifdef __OPTIMIZE__ > --- gcc/testsuite/gcc.target/i386/av
Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section
On 4/15/19 11:39 AM, Simon Marchi wrote: > On 2019-04-15 12:42 p.m., Simon Marchi wrote: >> Hi, >> >> Here is a patch that adds a mention of the 2019 Cauldron, similar to the >> entries >> for the previous editions. >> >> Thanks, >> >> Simon >> >> >> Index: index.html >> === >> RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v >> retrieving revision 1.1125 >> diff -u -r1.1125 index.html >> --- index.html 29 Mar 2019 12:28:15 - 1.1125 >> +++ index.html 15 Apr 2019 16:39:00 - >> @@ -54,6 +54,10 @@ >> News >> >> >> +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools >> Cauldron 2019 >> +[2019-04-15] >> +Held in Montréal, Canada, September 13-15 2019 >> + >> GCC 8.3 released >> [2019-02-22] >> >> > Actually, it would be better to use the same dates as are written on the wiki > (12-15), > so please consider the patch below instead. > > Also, please note that I don't have push access on GCC, so if somebody could > push the > patch for me, once it's approved, I would appreciate it. Thanks! Thanks. Committed. jeff
Re: [PATCH] [ARC][COMMITTED] Fix diagnostic messages.
On Wed, Apr 17, 2019 at 01:25:05PM +0200, Jakub Jelinek wrote: > On Wed, Apr 17, 2019 at 02:09:33PM +0300, Claudiu Zissulescu wrote: > >/* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. > > */ > >if (flag_pic && TARGET_ARC600_FAMILY) > > { > >warning (0, > > - "PIC is not supported for %s. Generating non-PIC code only..", > > + "PIC is not supported for %s. Generating non-PIC code only", > >arc_cpu_string); > > I believe this is undesirable too. Either use something like > "PIC is not supported for %s; generating non-PIC code only" > or split that into two messages > if (warning (0, "PIC is not supported for %s", arc_cpu_string)) > inform (input_location, "generating non-PIC code only"); And I suppose we should avoid pleonasm like "PIC code" ;). Marek
Re: [PATCH 3/3] Fix condition for std::variant to be copy constructible
On 17/04/19 19:20 +0300, Ville Voutilainen wrote: On Wed, 17 Apr 2019 at 19:12, Jonathan Wakely wrote: The standard says the std::variant copy constructor is defined as deleted unless all alternative types are copy constructible, but we were making it also depend on move constructible. Fix the condition and enhance the tests to check the semantics with pathological copy-only types (i.e. supporting copying but having deleted moves). The enhanced tests revealed a regression in copy assignment for non-trivial alternative types, where the assignment would not be performed because the condition in the _Copy_assign_base visitor is false: is_same_v, remove_reference_t>. Tested powerpc64le-linux. I plan to commit all three of these patches later today, unless somebody sees a problem with them. Looks good to me. Thanks. All three patches committed to trunk.
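A minimal sketch of the behaviour described in this thread, assuming a standard library that already contains the fix. The `DeletedMoves` type follows the one added to the tests; the `value` member, the alias `V` and the helper `copy_assign_demo` are additions made for this illustration only.

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

// Copy-only type in the spirit of the DeletedMoves test type: copying is
// allowed, moving is explicitly deleted. The value member is an addition
// for this sketch so that a copy can be observed.
struct DeletedMoves {
  DeletedMoves() = default;
  DeletedMoves(const DeletedMoves&) = default;
  DeletedMoves(DeletedMoves&&) = delete;
  DeletedMoves& operator=(const DeletedMoves&) = default;
  DeletedMoves& operator=(DeletedMoves&&) = delete;
  int value = 0;
};

using V = std::variant<int, DeletedMoves>;

// The alternative itself can be copied but not moved.
static_assert(std::is_copy_constructible_v<DeletedMoves>);
static_assert(!std::is_move_constructible_v<DeletedMoves>);

// The variant's copy operations must not depend on move constructibility.
static_assert(std::is_copy_constructible_v<V>);
static_assert(std::is_copy_assignable_v<V>);

// Copy-assigns one variant holding DeletedMoves to another and returns
// the value observed in the destination afterwards.
inline int copy_assign_demo() {
  V src{std::in_place_type<DeletedMoves>};
  std::get<DeletedMoves>(src).value = 42;
  V dst{std::in_place_type<DeletedMoves>};
  dst = src;  // both sides hold DeletedMoves, so the alternative is assigned
  return std::get<DeletedMoves>(dst).value;
}
```

`copy_assign_demo()` exercises the copy-assignment path that the enhanced tests found was being skipped. According to the ChangeLog, the run.cc tests additionally check that moving such a variant falls back to copying; that part is not asserted here.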
[PATCH] Use builtin sort instead of shell sort
Some build environments and configuration options may lead to the make variable PLUGIN_HEADERS being too long to be passed as parameters to the shell `echo` command, leading to a "write error" message when making the target install-plugin. The following patch fixes this issue by using the [Make $(sort list)][1] function instead to remove duplicates from the list of headers. There is no functional change, the value assigned to the shell variable is the same. Tested in production on x86 and armv7 cross-compilation toolchains. - The length of the headers variable goes from 8+ chars to 7500+ Tested with make bootstrap and make check on host-x86_64-pc-linux-gnu - make bootstrap successful - make check fails even before the patch is applied WARNING: program timed out. FAIL: libgomp.c/../libgomp.c-c++-common/cancel-parallel-1.c execution test ... make[4]: *** [Makefile:479: check-DEJAGNU] Error 1 2019-04-15 Emeric Dupont * Makefile.in: Use builtin sort instead of shell sort Signed-off-by: Emeric Dupont [1]: https://www.gnu.org/software/make/manual/html_node/Text-Functions.html#index-sorting-words Signed-off-by: Emeric Dupont --- gcc/Makefile.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/Makefile.in b/gcc/Makefile.in index d186d71c91e..3196e774a26 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -3538,7 +3538,7 @@ install-plugin: installdirs lang.install-plugin s-header-vars install-gengtype # We keep the directory structure for files in config or c-family and .def # files. All other files are flattened to a single directory. 
$(mkinstalldirs) $(DESTDIR)$(plugin_includedir) -headers=`echo $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def) | tr ' ' '\012' | sort -u`; \ +headers=$(sort $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def)); \ srcdirstrip=`echo "$(srcdir)" | sed 's/[].[^$$\\*|]/&/g'`; \ for file in $$headers; do \ if [ -f $$file ] ; then \ -- 2.21.0 TriaGnoSys GmbH, Registergericht: München HRB 141647, Vat.: DE 813396184 Geschäftsführer: Núria Riera Díaz, Peter Lewalter This email and any files transmitted with it are confidential & proprietary to Zodiac Inflight Innovations. This information is intended solely for the use of the individual or entity to which it is addressed. Access or transmittal of the information contained in this e-mail, in full or in part, to any other organization or persons is not authorized.
Re: collect2 patch to https in URL
On 17/04/2019 19:11, Jeff Law wrote: On 4/17/19 6:45 AM, Jonny Grant wrote: Hello Change the "collect2 -help" output to have https URL: Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html 2019-04-14 Jonny Grant * collect2.c: Change gcc.gnu.org URL to HTTPS Thank you Jonny Thanks. I've installed this on the trunk. jeff Excellent
Re: [PATCH] Use builtin sort instead of shell sort
On 17.04.2019 21:36, Emeric Dupont wrote: <... Unwanted legalese ...> Sorry, please disregard the unwanted footer added against my will. I am actively trying to have our admins get rid of it where it is not applicable. -- Emeric Dupont Zodiac Inflight Innovations P +49815388678207 Argelsrieder Feld, 22 - 82234 Wessling www.safran-aerosystems.com
Re: [PATCH][RFC] Improve get_qualified_type linear list walk
On Wed, 17 Apr 2019, Jeff Law wrote: * tree.c (get_qualified_type): Put found type variants at the head of the variant list. Seems quite reasonable to me. I just hope we don't find a case where this is the exact worst case behavior ;-) That seems unlikely. Competitive analysis of the list update problem shows that the move-to-front strategy is 2-competitive. Here we also have insertions so the problem is different, but still close. -- Marc Glisse
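The move-to-front strategy mentioned here can be sketched generically as follows. This is not the tree.c code: GCC's variant chain is an intrusive list of tree nodes, whereas the sketch uses `std::forward_list<int>` and a hypothetical helper `find_move_to_front` purely to show the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>

// Searches the list for key. On a hit the node is spliced to the front,
// so a repeated lookup of the same key inspects only one element.
// Returns the number of elements inspected, or 0 if the key is absent.
inline std::size_t find_move_to_front(std::forward_list<int>& items, int key) {
  std::size_t inspected = 0;
  auto prev = items.before_begin();
  for (auto it = items.begin(); it != items.end(); prev = it, ++it) {
    ++inspected;
    if (*it == key) {
      // splice_after moves the element that follows prev, which is *it.
      if (prev != items.before_begin())
        items.splice_after(items.before_begin(), items, prev);
      return inspected;
    }
  }
  return 0;
}
```

Looking up the last element of a four-element list inspects four nodes the first time and one node the second time. That is the effect the patch is after for frequently requested qualified variants.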
Re: [PATCH] Fix up dg-extract-results.sh
On Wed, 17 Apr 2019 at 18:44, Mike Stump wrote: > > On Apr 17, 2019, at 8:59 AM, Jakub Jelinek wrote: > > Ok for trunk? > > Ok. Thanks!
[PATCH] rs6000: Remove a comma in a debug string
It is a bit confusing, it looks as if the compiler tried to print something there. Committing. Segher 2018-04-17 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_register_move_cost): Fix typo. --- gcc/config/rs6000/rs6000.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 3c9b557..1b94e16 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -34893,7 +34893,7 @@ rs6000_register_move_cost (machine_mode mode, { if (dbg_cost_ctrl == 1) fprintf (stderr, -"rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n", +"rs6000_register_move_cost: ret=%d, mode=%s, from=%s, to=%s\n", ret, GET_MODE_NAME (mode), reg_class_names[from], reg_class_names[to]); dbg_cost_ctrl--; -- 1.8.3.1
Re: [PATCH] Fixup IRA debug dump output
On 4/17/19 12:57 PM, Jeff Law wrote: > On 4/17/19 9:35 AM, Peter Bergner wrote: >> * ira-conflicts.c (print_allocno_conflicts): Always print something, >> even for allocno's with no conflicts. >> (print_conflicts): Print an extra newline. > OK. And while it's technically not a regression fix, I think this can > safely go in now :-) Hi Jeff, Ok, I committed the patch which is an improvement over the old code. Thanks! However, debugging PR87871 some more, I still didn't see p116 conflict with r0 like Vlad said it did with the new debug output. Not surprising, since the patch only affected adding missing \n's to the output. So I dumped the OBJECT_TOTAL_CONFLICT_HARD_REGS() output for p116 and sure enough, it does mention r0. I then called print_allocno_conflicts() by hand and it still didn't output r0 as a conflicting hard reg. Stepping through the debugger, I see that the: if (OBJECT_CONFLICT_ARRAY (obj) == NULL) { fprintf (file, "\n;; total conflict hard regs:\n"); fprintf (file, ";; conflict hard regs:\n\n"); continue; } ...is actually incorrect. The "if" test only says we don't have any conflicts with any other allocnos/pseudos. It doesn't tell us whether we have any hard register conflicts or not, so we really shouldn't do a continue here. Instead, we should guard the code that outputs the allocno conflicts and then fall down to the hard reg conflict prints, which should also be suitably guarded. With the patch below, we now see the missing r0 conflict Vlad said was there. ;; a5(r116,l0) conflicts: ;; total conflict hard regs: 0 ;; conflict hard regs: cp0:a0(r111)<->a4(r117)@330:move cp1:a2(r114)<->a3(r112)@41:shuffle ... Note, I still don't understand why p116 conflicts with r0, but that is orthogonal to actually printing out the conflict sets as they exist. Is this ok as well? ...and I'm sorry for not noticing this issue before. Peter * ira-conflicts.c (print_allocno_conflicts): Print the hard register conflicts, even if there are no allocno conflicts.
Index: gcc/ira-conflicts.c === --- gcc/ira-conflicts.c (revision 270420) +++ gcc/ira-conflicts.c (working copy) @@ -632,47 +632,58 @@ print_allocno_conflicts (FILE * file, bo ira_object_t conflict_obj; ira_object_conflict_iterator oci; - if (OBJECT_CONFLICT_ARRAY (obj) == NULL) + if (OBJECT_CONFLICT_ARRAY (obj) != NULL) { - fprintf (file, "\n;; total conflict hard regs:\n"); - fprintf (file, ";; conflict hard regs:\n\n"); - continue; - } - - if (n > 1) - fprintf (file, "\n;; subobject %d:", i); - FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) - { - ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - if (reg_p) - fprintf (file, " r%d,", ALLOCNO_REGNO (conflict_a)); - else + if (n > 1) + fprintf (file, "\n;; subobject %d:", i); + FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { - fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a), - ALLOCNO_REGNO (conflict_a)); - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) - fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj)); - if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL) - fprintf (file, ",b%d", bb->index); + ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); + if (reg_p) + fprintf (file, " r%d,", ALLOCNO_REGNO (conflict_a)); else - fprintf (file, ",l%d", -ALLOCNO_LOOP_TREE_NODE (conflict_a)->loop_num); - putc (')', file); + { + fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a), + ALLOCNO_REGNO (conflict_a)); + if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) + fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj)); + if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL) + fprintf (file, ",b%d", bb->index); + else + fprintf (file, ",l%d", +ALLOCNO_LOOP_TREE_NODE (conflict_a)->loop_num); + putc (')', file); + } } } - COPY_HARD_REG_SET (conflicting_hard_regs, OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); - AND_COMPL_HARD_REG_SET (conflicting_hard_regs, ira_no_alloc_regs); - AND_HARD_REG_SET (conflicting_hard_regs, - reg_class_contents[ALLOCNO_CLASS (a)]); - print_hard_reg_set (file, "\n;; total conflict 
hard regs:", - conflicting_hard_regs); - - COPY_HARD_REG_SET (conflicting_hard_regs, OBJECT_CONFLICT_HARD_REGS (obj)); - AND_COMPL_HARD_REG_SET (conflic
Re: [PATCH] PR libstdc++/90105 make forward_list::sort stable
On 16/04/19 23:16 +0100, Jonathan Wakely wrote: While testing the fix I also discovered that operator== assumes the elements are comparable with operator!= which is not required. PR libstdc++/90105 * include/bits/forward_list.h (operator==): Do not use operator!= to compare elements. (forward_list::sort(Comp)): When elements are equal take the one earlier in the list, so that sort is stable. * testsuite/23_containers/forward_list/operations/90105.cc: New test. * testsuite/23_containers/forward_list/comparable.cc: Test with types that meet the minimum EqualityComparable and LessThanComparable requirements. Remove irrelevant comment. Tested powerpc64le-linux. I'm surprised nobody has noticed either of these bugs before! I think this is safe for stage 4, and for backporting to active branches. Any objections? Committed to trunk.
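The stability rule in the ChangeLog entry above ("when elements are equal take the one earlier in the list") can be sketched as a plain merge step. This is not the libstdc++ code; `Item`, `stable_merge` and the use of `std::vector` are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record: ordered by .first only; .second tags original order.
using Item = std::pair<int, char>;

// Merge two sorted runs.  'left' holds the elements that came earlier in
// the original sequence.
template <typename Comp>
std::vector<Item> stable_merge(const std::vector<Item>& left,
                               const std::vector<Item>& right, Comp comp)
{
  std::vector<Item> out;
  std::size_t i = 0, j = 0;
  while (i < left.size() && j < right.size())
    {
      // Take from 'right' only when it is strictly less; on a tie the
      // earlier element (from 'left') wins, which keeps the merge stable.
      if (comp(right[j], left[i]))
        out.push_back(right[j++]);
      else
        out.push_back(left[i++]);
    }
  while (i < left.size())
    out.push_back(left[i++]);
  while (j < right.size())
    out.push_back(right[j++]);
  return out;
}
```

Writing the test as `comp(left[i], right[j])` and taking from `right` otherwise would move equal elements from the later run ahead of earlier ones, which is the kind of instability the PR describes.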
[PATCH] avoid aarch64 ICE on large vectors (PR 89797)
The fix for pr89797 committed in r270326 was limited to targets with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64. The tests for the fix have been failing with an ICE on aarch64 because it suffers from more or less the same problem but in its own target-specific code. Attached is the patch I posted yesterday that fixes the ICE, successfully bootstrapped and regtested on x86_64-linux. I also ran the dg.exp=*attr* and aarch64.exp tests with an aarch64-linux-elf cross-compiler. There are no ICEs but there are tons of errors in the latter tests because many (most?) either expect to be able to find libc headers or link executables (I have not built libc for aarch64). I'm around tomorrow but then traveling the next two weeks (with no connectivity the first week) so I unfortunately won't be able to fix whatever this change might break until the week of May 6. Jeff, if you have an aarch64 tester that could verify this patch tomorrow that would help give us some confidence. Otherwise, another option to consider for the time being is to xfail the tests on aarch64. Thanks Martin PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable gcc/ChangeLog: PR middle-end/89797 * tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when NUM_POLY_INT_COEFFS == 2. Use HOST_WIDE_INT_1U. * config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid assuming type size fits in SHWI. 
Index: gcc/tree.h === --- gcc/tree.h (revision 270418) +++ gcc/tree.h (working copy) @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node) if (NUM_POLY_INT_COEFFS == 2) { poly_uint64 res = 0; - res.coeffs[0] = 1 << (precision & 0xff); + res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff); if (precision & 0x100) - res.coeffs[1] = 1 << (precision & 0xff); + res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16); return res; } else -return (unsigned HOST_WIDE_INT)1 << precision; +return HOST_WIDE_INT_1U << precision; } /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c (revision 270418) +++ gcc/config/aarch64/aarch64.c (working copy) @@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type) be set for non-predicate vectors of booleans. Modes are the most direct way we have of identifying real SVE predicate types. */ return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128; - HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type)); + tree size = TYPE_SIZE (type); + unsigned HOST_WIDE_INT align = 128; + if (tree_fits_uhwi_p (size)) +align = tree_to_uhwi (TYPE_SIZE (type)); return MIN (align, 128); }
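For readers unfamiliar with why the tree.h hunk replaces `1 <<` with `HOST_WIDE_INT_1U <<`, here is a minimal sketch. It assumes HOST_WIDE_INT_1U behaves like a 64-bit unsigned 1; `one_u64` and `elements_from_log2` are hypothetical stand-ins, not GCC names.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for HOST_WIDE_INT_1U (assumption: a 64-bit unsigned constant 1).
constexpr std::uint64_t one_u64 = 1;

// Element count stored as a log2 value, in the spirit of
// TYPE_VECTOR_SUBPARTS.
constexpr std::uint64_t elements_from_log2(unsigned log2n)
{
  // With a plain int literal, (1 << log2n) is undefined once log2n reaches
  // the width of int (typically 32).  Widening the left operand first keeps
  // the shift well defined for any log2n below 64.
  return one_u64 << log2n;
}
```

With this helper, `elements_from_log2(33)` yields 2^33, the vector_size (1LU << 33) case from the PR title, instead of overflowing a 32-bit int.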
[C++ PATCH] PR c++/90047 - ICE with enable_if alias template.
In order to make alias templates useful for SFINAE we instantiate them under the prevailing 'complain' argument, so an error encountered while instantiating during SFINAE context is silent. The problem in this PR comes when we later look up the erroneous instantiation and don't give an error at that point. Fixed by not adding an erroneous instantiation to the hash table, so we instantiate it again when needed and get the error. This required changes to a number of tests, which previously said "substitution failed:" with no explanation of what the failure was; now we properly explain. Tested x86_64-pc-linux-gnu, applying to trunk. * pt.c (tsubst_decl) [TYPE_DECL]: Don't put an erroneous decl in the hash table when we're in SFINAE context. --- gcc/cp/pt.c | 3 +- .../20_util/duration/arithmetic/dr3050.cc | 2 ++ .../20_util/from_chars/1_c++20_neg.cc | 2 ++ .../testsuite/20_util/from_chars/1_neg.cc | 2 ++ .../20_util/shared_ptr/assign/auto_ptr_neg.cc | 1 + .../shared_ptr/assign/shared_ptr_neg.cc | 2 ++ .../20_util/shared_ptr/cons/unique_ptr_neg.cc | 1 + .../testsuite/20_util/to_chars/1_neg.cc | 2 ++ .../20_util/tuple/element_access/get_neg.cc | 2 ++ .../unique_ptr/cons/ptr_deleter_neg.cc| 2 ++ .../20_util/unique_ptr/modifiers/reset_neg.cc | 2 ++ .../deque/requirements/dr438/assign_neg.cc| 2 ++ .../requirements/dr438/constructor_1_neg.cc | 2 ++ .../requirements/dr438/constructor_2_neg.cc | 2 ++ .../deque/requirements/dr438/insert_neg.cc| 2 ++ .../requirements/dr438/assign_neg.cc | 2 ++ .../requirements/dr438/constructor_1_neg.cc | 2 ++ .../requirements/dr438/constructor_2_neg.cc | 2 ++ .../requirements/dr438/insert_neg.cc | 2 ++ .../list/requirements/dr438/assign_neg.cc | 2 ++ .../requirements/dr438/constructor_1_neg.cc | 2 ++ .../requirements/dr438/constructor_2_neg.cc | 2 ++ .../list/requirements/dr438/insert_neg.cc | 2 ++ .../vector/requirements/dr438/assign_neg.cc | 2 ++ .../requirements/dr438/constructor_1_neg.cc | 2 ++ .../requirements/dr438/constructor_2_neg.cc | 
2 ++ .../vector/requirements/dr438/insert_neg.cc | 2 ++ .../memory/shared_ptr/cons/copy_ctor_neg.cc | 2 ++ .../shared_ptr/cons/pointer_ctor_neg.cc | 2 ++ .../memory/shared_ptr/modifiers/reset_neg.cc | 2 ++ gcc/testsuite/g++.dg/cpp0x/alias-decl-67.C| 30 +++ gcc/testsuite/g++.old-deja/g++.robertl/eb43.C | 2 ++ gcc/cp/ChangeLog | 6 33 files changed, 96 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-67.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f8001317bda..3a11eaa7630 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -13948,7 +13948,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain) DECL_TEMPLATE_INFO (r) = build_template_info (tmpl, argvec); SET_DECL_IMPLICIT_INSTANTIATION (r); - register_specialization (r, gen_tmpl, argvec, false, hash); + if (!error_operand_p (r) || (complain & tf_error)) + register_specialization (r, gen_tmpl, argvec, false, hash); } else { diff --git a/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc b/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc index 5854195dce5..fc64e5a4e61 100644 --- a/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc +++ b/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc @@ -28,3 +28,5 @@ void test01(std::chrono::seconds s, X x) s / x; // { dg-error "no match" } s % x; // { dg-error "no match" } } + +// { dg-prune-output "enable_if" } diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc index 83d297676bf..821cc17413d 100644 --- a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc +++ b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc @@ -36,3 +36,5 @@ test01(const char* first, const char* last) std::from_chars(first, last, c32); // { dg-error "no matching" } std::from_chars(first, last, c32, 10); // { dg-error "no matching" } } + +// { dg-prune-output "enable_if" } diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc index 2e3c34c9145..bc52628218a 100644 --- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc +++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc @@ -36,3 +36,5 @@ test01(const char* first, const char* last) std::from_chars(first, last, c32); // { dg-error "no matching" } std::from_chars(first, last, c32, 10); // { dg-error "no matching" } } + +// { dg-prune-output "enable_if" } diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc index 19a73a1d8f2..9c80c77c96e 1006
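As background for the alias-template SFINAE behaviour this patch deals with, a small sketch follows. It is not taken from the patch or its testcase; `enable_if_alias` and `classify` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias template in the style of std::enable_if_t.
template <bool B, typename T = void>
using enable_if_alias = typename std::enable_if<B, T>::type;

// Viable only when T is integral.  For other types, instantiating the alias
// fails during substitution, so the overload is dropped silently (SFINAE)
// rather than producing an error.
template <typename T>
enable_if_alias<std::is_integral<T>::value, int> classify(T)
{
  return 1;
}

// Viable only when T is not integral.
template <typename T>
enable_if_alias<!std::is_integral<T>::value, int> classify(T)
{
  return 2;
}
```

Each call instantiates the alias once successfully and once erroneously; the erroneous instantiation is the kind of result the patch stops caching when it was produced in SFINAE context.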
Go patch committed: Use temporary to avoid early destruction
This patch to the Go frontend fixes a bug in which the code referred to a temporary value after it was destroyed. It also fixes an incorrect test of the string index rather than the value parsed using strtol. This should fix PR 90110. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 270373) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -20010e494f46d8fd58cfd372093b059578d3379a +ecbd6562aff604b9559f63d714e922a0c9c2a77f The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/import.cc === --- gcc/go/gofrontend/import.cc (revision 270373) +++ gcc/go/gofrontend/import.cc (working copy) @@ -1478,8 +1478,9 @@ Import_function_body::read_type() this->off_ = i + 1; char *end; - long val = strtol(this->body_.substr(start, i - start).c_str(), &end, 10); - if (*end != '\0' || i > 0x7fff) + std::string num = this->body_.substr(start, i - start); + long val = strtol(num.c_str(), &end, 10); + if (*end != '\0' || val > 0x7fff) { if (!this->saw_error_) go_error_at(this->location(),
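The two bugs described in this message (using strtol's end pointer after the temporary string is gone, and range-checking the index instead of the parsed value) can be shown in isolation. This is a sketch under assumptions, not the gofrontend code: `parse_index` is a hypothetical helper, and the 0x7fff limit simply mirrors the hunk as it appears above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Parse a decimal number from body[start, end).  Returns false on
// malformed or out-of-range input.
bool parse_index(const std::string& body, std::size_t start, std::size_t end,
                 long& out)
{
  // Keep the substring in a named variable.  'endp' points into its buffer,
  // so the string must outlive both the strtol call and the *endp check.
  // Passing body.substr(...).c_str() directly would leave 'endp' dangling
  // once the temporary is destroyed at the end of that statement.
  std::string num = body.substr(start, end - start);
  char* endp;
  long val = std::strtol(num.c_str(), &endp, 10);

  // Check the parsed value, not the position in the string.
  if (num.empty() || *endp != '\0' || val < 0 || val > 0x7fff)
    return false;
  out = val;
  return true;
}
```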
[C/C++ PATCH] Further typedef duplicate decl fixes (PR c++/90108)
Hi! As reported, the newly added testcase ICEs with --param ggc-min-heapsize=0. The problem is that while the remove type is not referenced by anything else (it is a distinct type created to hold the attributes), there is another type with TYPE_NAME equal to the newdecl we want to free. That one is created in common_handle_aligned_attribute: { if ((flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) /* OK, modify the type in place. */; /* If we have a TYPE_DECL, then copy the type, so that we don't accidentally modify a builtin type. See pushdecl. */ else if (decl && TREE_TYPE (decl) != error_mark_node && DECL_ORIGINAL_TYPE (decl) == NULL_TREE) { tree tt = TREE_TYPE (decl); *type = build_variant_type_copy (*type); DECL_ORIGINAL_TYPE (decl) = tt; TYPE_NAME (*type) = decl; TREE_USED (*type) = TREE_USED (decl); TREE_TYPE (decl) = *type; } else *type = build_variant_type_copy (*type); where we create a variant type and set the TYPE_NAME too. I've tried to remove that else if ... and just do *type = build_variant_type_copy (*type); but that regressed some DWARF DW_AT_alignment tests. So, the following patch instead removes the remove type from the variants list if it is not a main variant (as before); otherwise it tries to find in the TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (newdecl)) variant list a type with TYPE_NAME equal to newdecl and removes that one. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2019-04-18 Jakub Jelinek PR c++/90108 * c-decl.c (merge_decls): If remove is main variant and DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE variant that has newdecl as TYPE_NAME if any. * decl.c (duplicate_decls): If remove is main variant and DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE variant that has newdecl as TYPE_NAME if any. * c-c++-common/pr90108.c: New test. 
--- gcc/c/c-decl.c.jj 2019-04-17 21:21:39.936133112 +0200 +++ gcc/c/c-decl.c 2019-04-17 23:25:08.098936888 +0200 @@ -2513,7 +2513,24 @@ merge_decls (tree newdecl, tree olddecl, { tree remove = TREE_TYPE (newdecl); if (TYPE_MAIN_VARIANT (remove) == remove) - gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE); + { + gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE); + /* If remove is the main variant, no need to remove that +from the list. One of the DECL_ORIGINAL_TYPE +variants, e.g. created for aligned attribute, might still +refer to the newdecl TYPE_DECL though, so remove that one +in that case. */ + if (DECL_ORIGINAL_TYPE (newdecl) + && DECL_ORIGINAL_TYPE (newdecl) != remove) + for (tree t = TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (newdecl)); +t; t = TYPE_MAIN_VARIANT (t)) + if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl) + { + TYPE_NEXT_VARIANT (t) + = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t)); + break; + } + } else for (tree t = TYPE_MAIN_VARIANT (remove); ; t = TYPE_NEXT_VARIANT (t)) --- gcc/cp/decl.c.jj2019-04-17 21:21:39.753136091 +0200 +++ gcc/cp/decl.c 2019-04-17 23:27:13.995875527 +0200 @@ -2133,7 +2133,24 @@ next_arg:; { tree remove = TREE_TYPE (newdecl); if (TYPE_MAIN_VARIANT (remove) == remove) - gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE); + { + gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE); + /* If remove is the main variant, no need to remove that +from the list. One of the DECL_ORIGINAL_TYPE +variants, e.g. created for aligned attribute, might still +refer to the newdecl TYPE_DECL though, so remove that one +in that case. 
*/ + if (tree orig = DECL_ORIGINAL_TYPE (newdecl)) + if (orig != remove) + for (tree t = TYPE_MAIN_VARIANT (orig); t; + t = TYPE_MAIN_VARIANT (t)) + if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl) + { + TYPE_NEXT_VARIANT (t) + = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t)); + break; + } + } else for (tree t = TYPE_MAIN_VARIANT (remove); ; t = TYPE_NEXT_VARIANT (t)) --- gcc/testsuite/c-c++-common/pr90108.c.jj 2019-04-17 23:18:23.466566296 +0200 +++ gcc/testsuite/c-c++-com
[PATCH] i18n fix for gimple-ssa-sprintf.c (PR translation/79183)
Hi! This patch fixes the following messages, so that they are translatable even to languages that don't use the English Plural-Forms: nplurals=2; plural=n != 1; See https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html#Plural-forms for more details. Bootstrapped/regtested on x86_64-linux and i686-linux, plus generated gcc.pot and eyeballed the changes. Ok for trunk? 2019-04-18 Jakub Jelinek PR translation/79183 * gimple-ssa-sprintf.c (format_directive): Use inform_n instead of inform where appropriate. --- gcc/gimple-ssa-sprintf.c.jj 2019-04-10 09:26:49.476692760 +0200 +++ gcc/gimple-ssa-sprintf.c 2019-04-17 21:37:51.535294586 +0200 @@ -3016,12 +3016,10 @@ format_directive (const sprintf_dom_walk help the user figure out how big a buffer they need. */ if (min == max) - inform (callloc, - (min == 1 -? G_("%qE output %wu byte into a destination of size %wu") -: G_("%qE output %wu bytes into a destination of size " - "%wu")), - info.func, min, info.objsize); + inform_n (callloc, min, + "%qE output %wu byte into a destination of size %wu", + "%qE output %wu bytes into a destination of size %wu", + info.func, min, info.objsize); else if (max < HOST_WIDE_INT_MAX) inform (callloc, "%qE output between %wu and %wu bytes into " @@ -3044,11 +3042,9 @@ format_directive (const sprintf_dom_walk of printf with no destination size just print the computed result. */ if (min == max) - inform (callloc, - (min == 1 -? G_("%qE output %wu byte") -: G_("%qE output %wu bytes")), - info.func, min); + inform_n (callloc, min, + "%qE output %wu byte", "%qE output %wu bytes", + info.func, min); else if (max < HOST_WIDE_INT_MAX) inform (callloc, "%qE output between %wu and %wu bytes", Jakub
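To illustrate why a hard-coded `min == 1 ? singular : plural` choice cannot be translated in general, the sketch below compares the English plural rule quoted in this message with the Polish rule listed in the gettext manual. The function names are hypothetical; a real program would leave this selection to ngettext-style lookup rather than computing it by hand.

```cpp
#include <cassert>

// English: nplurals=2; plural=n != 1;
constexpr unsigned plural_index_en(unsigned long n)
{
  return n != 1 ? 1u : 0u;
}

// Polish (per the gettext manual): nplurals=3;
// plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
constexpr unsigned plural_index_pl(unsigned long n)
{
  if (n == 1)
    return 0;
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
    return 1;
  return 2;
}
```

A two-way test on `n == 1` can only ever select two strings, so a language with three forms has no way to supply the third; passing the count to a plural-aware call lets the translation catalog apply its own rule.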