[SPARC] Fix PR target/69072
This fixes an ICE on SPARC 64-bit in a corner case where a struct containing a
nested packed struct is passed beyond the 6th position to a function.  The
various routines of the back-end implementing the complex calling convention
disagree on the passing mechanism, leading to an assertion failure.

Tested (incl. binary compatibility) on SPARC/Solaris, applied on the mainline.


2016-01-04  Eric Botcazou

        PR target/69072
        * config/sparc/sparc.c (scan_record_type): Take into account subfields
        to compute the PACKED_P predicate.
        (function_arg_record_value): Minor tweaks.


2016-01-04  Eric Botcazou

        * gcc.target/sparc/20160104-1.c: New test.

-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c (revision 231971)
+++ config/sparc/sparc.c (working copy)
@@ -6140,30 +6140,28 @@ sparc_strict_argument_naming (cumulative_args_t ca
      that is eligible for promotion in integer registers.
    - FP_REGS_P: the record contains at least one field or sub-field
      that is eligible for promotion in floating-point registers.
-   - PACKED_P: the record contains at least one field that is packed.
+   - PACKED_P: the record contains at least one field that is packed.  */
 
-   Sub-fields are not taken into account for the PACKED_P predicate.  */
-
 static void
 scan_record_type (const_tree type, int *intregs_p, int *fpregs_p,
                   int *packed_p)
 {
-  tree field;
-
-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
     {
       if (TREE_CODE (field) == FIELD_DECL)
         {
-          if (TREE_CODE (TREE_TYPE (field)) == RECORD_TYPE)
-            scan_record_type (TREE_TYPE (field), intregs_p, fpregs_p, 0);
-          else if ((FLOAT_TYPE_P (TREE_TYPE (field))
-                   || TREE_CODE (TREE_TYPE (field)) == VECTOR_TYPE)
+          tree field_type = TREE_TYPE (field);
+
+          if (TREE_CODE (field_type) == RECORD_TYPE)
+            scan_record_type (field_type, intregs_p, fpregs_p, packed_p);
+          else if ((FLOAT_TYPE_P (field_type)
+                   || TREE_CODE (field_type) == VECTOR_TYPE)
                   && TARGET_FPU)
            *fpregs_p = 1;
          else
            *intregs_p = 1;
 
-          if (packed_p && DECL_PACKED (field))
+          if (DECL_PACKED (field))
            *packed_p = 1;
        }
     }
@@ -6647,9 +6645,10 @@ function_arg_record_value (const_tree type, machin
 
           parms.nregs += intslots;
         }
+
+      /* Allocate the vector and handle some annoying special cases.  */
       nregs = parms.nregs;
 
-      /* Allocate the vector and handle some annoying special cases.  */
       if (nregs == 0)
         {
           /* ??? Empty structure has no value?  Duh?  */
@@ -6661,17 +6660,16 @@ function_arg_record_value (const_tree type, machin
                  load.  */
               return gen_rtx_REG (mode, regbase);
             }
-          else
-            {
-              /* ??? C++ has structures with no fields, and yet a size.  Give up
-                 for now and pass everything back in integer registers.  */
-              nregs = (typesize + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
-            }
+
+          /* ??? C++ has structures with no fields, and yet a size.  Give up
+             for now and pass everything back in integer registers.  */
+          nregs = (typesize + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
           if (nregs + slotno > SPARC_INT_ARG_MAX)
             nregs = SPARC_INT_ARG_MAX - slotno;
         }
 
-  gcc_assert (nregs != 0);
+  gcc_assert (nregs > 0);
+
   parms.ret = gen_rtx_PARALLEL (mode, rtvec_alloc (parms.stack + nregs));
 
   /* If at least one field must be passed on the stack, generate

/* PR target/69072 */
/* Reported by Zdenek Sojka */

/* { dg-do compile } */

typedef struct
{
  struct
  {
    double d;
  } __attribute__((packed)) a;
} S;

void foo (S s1, S s2, S s3, S s4, S s5, S s6, S s7) {}
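
For readers following the scan_record_type change, here is a minimal
stand-alone sketch in C of the idea behind it: the packed flag is handed down
on recursion, so a packed sub-field of a nested record is no longer missed.
This is only a model under stated assumptions, not the back-end code.  It does
not use GCC's tree API; struct field, its members, scan_record and
packed_nested_example are hypothetical names invented for illustration, and
the example assumes that the member of a packed struct is itself flagged as
packed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a FIELD_DECL; not GCC's tree representation.  */
struct field
{
  int is_float;             /* models the FLOAT_TYPE_P / VECTOR_TYPE test */
  int is_packed;            /* models DECL_PACKED */
  const struct field *sub;  /* non-NULL when the field is itself a record */
  size_t nsub;              /* number of sub-fields in that record */
};

/* Walk the fields of a record and set the three predicates.  The packed
   flag is passed down unchanged when recursing into a nested record.  */
static void
scan_record (const struct field *fields, size_t n,
             int *intregs_p, int *fpregs_p, int *packed_p)
{
  assert (intregs_p && fpregs_p && packed_p);

  for (size_t i = 0; i < n; i++)
    {
      const struct field *f = &fields[i];

      if (f->sub)
        scan_record (f->sub, f->nsub, intregs_p, fpregs_p, packed_p);
      else if (f->is_float)
        *fpregs_p = 1;
      else
        *intregs_p = 1;

      if (f->is_packed)
        *packed_p = 1;
    }
}

/* Models the shape of the test case: an outer record whose only field is
   a nested record holding one packed double.  Returns the packed flag.  */
int
packed_nested_example (void)
{
  const struct field inner[] = { { 1, 1, NULL, 0 } };   /* double d */
  const struct field outer[] = { { 0, 0, inner, 1 } };  /* nested record a */
  int intregs = 0, fpregs = 0, packed = 0;

  scan_record (outer, 1, &intregs, &fpregs, &packed);
  assert (fpregs == 1 && intregs == 0);
  return packed;
}
```

In this model the outer field is not marked packed itself; the flag can only
be seen if the recursive call is allowed to set it, which is the behaviour the
patch introduces.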
[SPARC] Fix PR target/69100
This fixes another ICE on SPARC 64-bit in a corner case where __builtin_apply
is compiled with -mno-fpu/-msoft-float.

Tested (incl. binary compatibility) on SPARC/Solaris, applied on the mainline.


2016-01-04  Eric Botcazou

        PR target/69100
        * config/sparc/sparc.h (FUNCTION_ARG_REGNO_P): Return true in 64-bit
        mode for %f0-%f31 only if TARGET_FPU.


2016-01-04  Eric Botcazou

        * gcc.target/sparc/20160104-2.c: New test.

-- 
Eric Botcazou

/* PR target/69100 */
/* Reported by Zdenek Sojka */

/* { dg-do compile } */
/* { dg-options "-mno-fpu" } */

void foo (void)
{
  __builtin_apply (0, 0, 0);
}

Index: config/sparc/sparc.h
===
--- config/sparc/sparc.h (revision 231971)
+++ config/sparc/sparc.h (working copy)
@@ -1176,9 +1176,8 @@ extern char leaf_reg_remap[];
    On SPARC, these are the "output" registers.  v9 also uses %f0-%f31.  */
 
 #define FUNCTION_ARG_REGNO_P(N) \
-(TARGET_ARCH64 \
- ? (((N) >= 8 && (N) <= 13) || ((N) >= 32 && (N) <= 63)) \
- : ((N) >= 8 && (N) <= 13))
+  (((N) >= 8 && (N) <= 13) \
+   || (TARGET_ARCH64 && TARGET_FPU && (N) >= 32 && (N) <= 63))
 
 /* Define a data type for recording info about an argument list
    during the scan of that argument list.  This data type should
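
To make the effect of the new macro easier to check by hand, here is a small
C sketch that restates it as a function.  It is an illustration only:
function_arg_regno_p and function_arg_regno_example are hypothetical names,
and the two boolean parameters stand in for TARGET_ARCH64 and TARGET_FPU,
which in the compiler are target flags rather than arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Function form of the patched FUNCTION_ARG_REGNO_P.  Registers 8..13 are
   always argument registers; 32..63 count only in 64-bit mode with an FPU.  */
bool
function_arg_regno_p (int n, bool arch64, bool fpu)
{
  return (n >= 8 && n <= 13)
         || (arch64 && fpu && n >= 32 && n <= 63);
}

/* Walks through the cases that matter for the PR.  */
void
function_arg_regno_example (void)
{
  /* The integer argument registers are accepted in every configuration.  */
  assert (function_arg_regno_p (8, false, false));
  assert (function_arg_regno_p (13, true, false));

  /* 64-bit with FPU: the 32..63 range is accepted, as before the patch.  */
  assert (function_arg_regno_p (32, true, true));

  /* 64-bit without FPU: the 32..63 range is now rejected.  */
  assert (!function_arg_regno_p (32, true, false));

  /* 32-bit: the 32..63 range is never accepted.  */
  assert (!function_arg_regno_p (63, false, true));
}
```

The only case whose answer differs from the old macro is the 64-bit,
no-FPU one, which is the configuration exercised by the new test.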
Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint
On Sun, Jan 03, 2016 at 07:11:58PM -0800, H.J. Lu wrote:
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -951,6 +951,13 @@
>         (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
>         (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
>
> +; Return true when OP is operand acceptable for vector memory operand.
> +; Only AVX can have misaligned memory operand.
> +(define_predicate "vector_memory_operand"
> +  (and (match_operand 0 "memory_operand")
> +       (ior (match_test "TARGET_AVX")
> +            (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)"

Shouldn't this take into account the ssememalign attribute too?
I mean, various instructions have some ssememalign > 8, which means they
can't accept any alignment, but happily accept say >= 32-bit alignment
or >= 64-bit alignment.  Though, ssememalign is an instruction attribute
and the predicates/constraints don't have access to the current instruction.
So maybe we need more constraints and more predicates, the ones you've added
for ssememalign == 0 instructions, don't change anything in instructions
with ssememalign == 8 (you've clearly changed some of them, and patch 3
shows you've tried to partially undo it afterwards, but only the constraint,
not the predicate, and only in one instruction), and use different
predicates/constraints for ssememalign == {16,32,64} instructions.

        Jakub
[ping] Enable -mstackrealign with SSE on 32-bit Windows
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html Thanks in advance. -- Eric Botcazou
Re: [ping] Enable -mstackrealign with SSE on 32-bit Windows
On Mon, Jan 4, 2016 at 9:26 AM, Eric Botcazou wrote:
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html
>
> Thanks in advance.

This is really Windows specific setting, so Windows maintainer should
OK the patch.

Uros.
Re: [ping] Enable -mstackrealign with SSE on 32-bit Windows
> This is really Windows specific setting, so Windows maintainer should
> OK the patch.

Makes sense, both maintainers now CCed.

        https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01458.html

-- 
Eric Botcazou
Re: [PATCH] PR/69089: C++-11: Ignore "alignas(0)".
On Fri, Jan 01, 2016 at 05:53:08PM -0700, Martin Sebor wrote:
> On 12/31/2015 04:50 AM, Dominik Vogt wrote:
> >The attached patch fixes C++-11 handling of "alignas(0)" which
> >should be ignored but currently generates an error message.  A
> >test case is included; the patch has been tested on S390x.  Since
> >it's a language issue it should be independent of the backend
> >used.
>
> The patch doesn't handle value-dependent expressions(*).  It
> seems that the problem is in handle_aligned_attribute() calling
> check_user_alignment() with the second argument (ALLOW_ZERO)
> set to false.  Calling it with true fixes the problem and handles
> value-dependent expressions (I haven't done any more testing beyond
> that).

Like the attached patch?  (Passes the testsuite on s390x.)

But wouldn't an "aligned" attribute be added, allowing the backend
to possibly generate an error or a warning?

> Also, in the test, I noticed the definition of the first struct
> is missing the terminating semicolon.

Yeah.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

gcc/c-family/ChangeLog

        PR/69089
        * c-common.c (handle_aligned_attribute): Allow 0 as an argument to the
        "aligned" attribute.

gcc/testsuite/ChangeLog

        PR/69089
        * g++.dg/cpp0x/alignas5.C: New test.

From 2461293b9070da74950fd0ae055d1239cc69ce67 Mon Sep 17 00:00:00 2001
From: Dominik Vogt
Date: Wed, 30 Dec 2015 15:08:52 +0100
Subject: [PATCH] C++-11: Ignore "alignas(0)" instead of generating an
 error message.

This is required by the C++-11 standard.
---
 gcc/c-family/c-common.c               |  2 +-
 gcc/testsuite/g++.dg/cpp0x/alignas5.C | 29 +
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas5.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 653d1dc..9eb25a9 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -7804,7 +7804,7 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   else if (TYPE_P (*node))
     type = node, is_type = 1;
 
-  if ((i = check_user_alignment (align_expr, false)) == -1
+  if ((i = check_user_alignment (align_expr, true)) == -1
       || !check_cxx_fundamental_alignment_constraints (*node, i, flags))
     *no_add_attrs = true;
   else if (is_type)
diff --git a/gcc/testsuite/g++.dg/cpp0x/alignas5.C b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
new file mode 100644
index 000..f3252a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
@@ -0,0 +1,29 @@
+// PR c++/69089
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-attributes" }
+
+alignas (0) int valid1;
+alignas (1 - 1) int valid2;
+struct Tvalid
+{
+  alignas (0) int i;
+  alignas (2 * 0) int j;
+};
+
+alignas (-1) int invalid1; /* { dg-error "not a positive power of 2" } */
+alignas (1 - 2) int invalid2; /* { dg-error "not a positive power of 2" } */
+struct Tinvalid
+{
+  alignas (-1) int i; /* { dg-error "not a positive power of 2" } */
+  alignas (2 * 0 - 1) int j; /* { dg-error "not a positive power of 2" } */
+};
+
+template <int N> struct TNvalid1 { alignas (N) int i; };
+TNvalid1<0> SNvalid1;
+template <int N> struct TNvalid2 { alignas (N) int i; };
+TNvalid2<1 - 1> SNvalid2;
+
+template <int N> struct TNinvalid1 { alignas (N) int i; }; /* { dg-error "not a positive power of 2" } */
+TNinvalid1<-1> SNinvalid1;
+template <int N> struct TNinvalid2 { alignas (N) int i; }; /* { dg-error "not a positive power of 2" } */
+TNinvalid2<1 - 2> SNinvalid2;
-- 
2.3.0
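
The outcomes that the new test expects can be summarised in a few lines of C.
The sketch below is only a model of the rule the test checks (zero is accepted
and ignored, anything that is not a positive power of 2 is an error); it is
not the code of check_user_alignment, and the names align_result,
classify_alignas and classify_alignas_example are hypothetical.

```c
#include <assert.h>

/* Possible outcomes for a requested alignment in this model.  */
enum align_result { ALIGN_IGNORED, ALIGN_OK, ALIGN_ERROR };

/* Classify a requested alignment value: 0 is silently ignored, a positive
   power of 2 is accepted, everything else is rejected.  */
enum align_result
classify_alignas (long n)
{
  if (n == 0)
    return ALIGN_IGNORED;
  if (n < 0 || (n & (n - 1)) != 0)
    return ALIGN_ERROR;
  return ALIGN_OK;
}

/* Mirrors the constant expressions used in alignas5.C.  */
void
classify_alignas_example (void)
{
  assert (classify_alignas (0) == ALIGN_IGNORED);
  assert (classify_alignas (1 - 1) == ALIGN_IGNORED);
  assert (classify_alignas (2 * 0) == ALIGN_IGNORED);
  assert (classify_alignas (-1) == ALIGN_ERROR);
  assert (classify_alignas (1 - 2) == ALIGN_ERROR);
  assert (classify_alignas (2 * 0 - 1) == ALIGN_ERROR);
}
```

The model takes the already-evaluated value, so it says nothing about how
value-dependent template arguments are handled; that part is specific to the
front end.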
Re: [PATCH 4/4] Un-XFAIL ssa-dom-cse-2.c for most platforms
On 24/12/15 19:59, Mike Stump wrote:
On Dec 22, 2015, at 8:00 AM, Alan Lawrence wrote:
On 21/12/15 15:33, Bill Schmidt wrote:

Not on a stage1 compiler - check_p8vector_hw_available itself requires being
able to run executables - I'll check on gcc112. However, both look like they're
really about the host (ability to execute an asm instruction), not the target
(/ability for gcc to output such an instruction)

Hm, that looks like a pervasive problem for powerpc.  There are a number
of things that are supposed to be testing effective target but rely on
check_p8vector_hw_available, which as you note requires executing an
instruction and is really about the host.  We need to clean that up; I
should probably open a bug.  Kind of amazed this has gotten past us for
a couple of years.

Well, I was about to apologize for making a bogus remark. A really "proper"
setup, would be to tell dejagnu to run your execution tests in some kind of
emulator/simulator (on your host, perhaps one kind of powerpc) that
only/additionally runs instructions for the other, _target_, kind of
powerpc...and whatever setup you'd need for all that probably does not live in
the GCC repository!

I’m not following.  dejagnu can already run tests on the target to make
decisions on which tests to run and what to expect from them, if it wants.
Some ports already do this.  Further, this is pretty typical and standard and
easy to do.  You confuse the issue by mentioning host, but this I think is
wrong.  These decisions have nothing to do with the host.  They are properties
of the target execution environment.  I’d be happy to help if you’d like.  I’d
just need the details of what you’d like help with.

You're right, which is why I described my first (wrong) remark as bogus. That
is, check_p8vector_hw_available is executing an assembly instruction, and on a
well-configured test setup, that would potentially invoke an emulator etc. -
whereas I am just doing 'native' testing on gcc110/gcc112 on the compile farm.

So (as Mike says) there is no bug here, but one just needs to be aware that
passing -mcpu=power7 (say) is not sufficient to make
check_p8vector_hw_available return false when executing on a power8 host; you
would also need to set up some kind of power7 emulator/simulator.

Hope that's clear!

Thanks, Alan
Re: [PATCH 1/4] Make SRA scalarize constant-pool loads
On 24/12/15 11:53, Alan Lawrence wrote:
Here's a new version that fixes the gcc.dg/guality/pr54970.c failures seen on
aarch64 and powerpc64. [snip]

This also fixes a bunch of other guality tests on AArch64 that were failing
prior to the patch series, and another bunch on PowerPC64 (bigendian -m32),
listed below.

Ach, sorry, not quite. That version avoids any regressions (e.g. in pr54970.c),
but does not fix all those other tests, unless you also have this hunk
(https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01483.html):

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index a3ff2df..2a741b8 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2651,7 +2651,8 @@ analyze_all_variable_accesses (void)
             && scalarizable_type_p (TREE_TYPE (var)))
           {
             if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
-                <= max_scalarization_size)
+                <= max_scalarization_size
+                || DECL_IN_CONSTANT_POOL (var))
               {
                 create_total_scalarization_access (var);
                 completely_scalarize (var, TREE_TYPE (var), 0, var);

...which I was using to increase test coverage of the SRA changes.

(Alternatively, you can "fix" the tests by running the testsuite with a forced
--param sra-max-scalarization-size. But this is only saying that the dwarf info
now generated by scalarizing constant-pools, is better than whatever dwarf was
being generated by whatever other part of the compiler before.)

--Alan
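
The condition changed by that hunk can be restated as a tiny C predicate,
which may help when reasoning about which variables get totally scalarized.
This is a sketch only: totally_scalarize_p and totally_scalarize_example are
hypothetical names, the size arguments are assumed to be in the same unit, and
the boolean stands in for DECL_IN_CONSTANT_POOL.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the patched test: a variable is totally scalarized when it is
   small enough, or when it lives in the constant pool regardless of size.  */
bool
totally_scalarize_p (unsigned long size, unsigned long max_size,
                     bool in_constant_pool)
{
  return size <= max_size || in_constant_pool;
}

/* Shows the one case whose answer changes with the extra hunk.  */
void
totally_scalarize_example (void)
{
  /* Small variables are scalarized either way.  */
  assert (totally_scalarize_p (64, 128, false));

  /* A large ordinary variable is still left alone.  */
  assert (!totally_scalarize_p (512, 128, false));

  /* A large constant-pool entry is now scalarized despite its size.  */
  assert (totally_scalarize_p (512, 128, true));
}
```

Raising the size limit through the --param mentioned above would flip the
second case as well, which is why it has a similar effect on the tests.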
Re: [PATCH][Testsuite]Cleanup logs from gdb tests by adding newlines
Ping.

--Alan

On 10/12/15 10:31, Alan Lawrence wrote:
Runs of the guality testsuite can sometimes end up with gcc.log containing
malformed lines like:

A debugging session is active.PASS: gcc.dg/guality/pr36728-1.c -O2 line 18 arg4 == 4
A debugging session is active.PASS: gcc.dg/guality/restrict.c -O2 line 30 type:ip == int *
Inferior 1 [process 27054] will be killed.PASS: gcc.dg/guality/restrict.c -O2 line 30 type:cicrp == const int * const restrict
Inferior 1 [process 27160] will be killed.PASS: gcc.dg/guality/restrict.c -O2 line 30 type:cvirp == int * const volatile restrict

This patch just makes sure the PASS/FAIL comes at the beginning of a line.  (At
the slight cost of adding some extra newlines not in the actual test output.)

I moved the remote_close target calls earlier, to avoid any possible race
condition of extra output being generated after the newline - this may not be
strictly necessary.

Tested on aarch64-none-linux-gnu and x86_64-none-linux-gnu.

I think this is reasonable for stage 3 - OK for trunk?

gcc/testsuite/ChangeLog:

        * lib/gcc-gdb-test.exp (gdb-test): call remote_close earlier, and send
        newline to log, before calling pass/fail/unsupported.
        * lib/gcc-simulate-thread.exp (simulate-thread): Likewise.
---
 gcc/testsuite/lib/gcc-gdb-test.exp        | 15 ++-
 gcc/testsuite/lib/gcc-simulate-thread.exp | 10 +++---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp b/gcc/testsuite/lib/gcc-gdb-test.exp
index d3ba6e4..f60cabf 100644
--- a/gcc/testsuite/lib/gcc-gdb-test.exp
+++ b/gcc/testsuite/lib/gcc-gdb-test.exp
@@ -84,8 +84,9 @@ proc gdb-test { args } {
     remote_expect target [timeout_value] {
        # Too old GDB
        -re "Unhandled dwarf expression|Error in sourced command file|
[committed] Update copyright years, part 1
Hi!

I've committed following patch to update the user visible copyright years
(and rolled new year of gcc/fortran and libjava ChangeLogs, plus
added Copyright boilerplate at the end of libitm, libgomp and libquadmath
ChangeLog files).

2016-01-04  Jakub Jelinek

gcc/
        * gcc.c (process_command): Update copyright notice dates.
        * gcov-dump.c (print_version): Ditto.
        * gcov.c (print_version): Ditto.
        * gcov-tool.c (print_version): Ditto.
        * gengtype.c (create_file): Ditto.
        * doc/cpp.texi: Bump @copying's copyright year.
        * doc/cppinternals.texi: Ditto.
        * doc/gcc.texi: Ditto.
        * doc/gccint.texi: Ditto.
        * doc/gcov.texi: Ditto.
        * doc/install.texi: Ditto.
        * doc/invoke.texi: Ditto.
gcc/ada/
        * gnat_ugn.texi: Bump @copying's copyright year.
        * gnat_rm.texi: Likewise.
gcc/fortran/
        * gfortranspec.c (lang_specific_driver): Update copyright notice
        dates.
        * gfc-internals.texi: Bump @copying's copyright year.
        * gfortran.texi: Ditto.
        * intrinsic.texi: Ditto.
        * invoke.texi: Ditto.
gcc/go/
        * gccgo.texi: Bump @copyrights-go year.
gcc/java/
        * jcf-dump.c (version): Update copyright notice dates.
libgomp/
        * libgomp.texi: Bump @copying's copyright year.
libitm/
        * libitm.texi: Bump @copying's copyright year.
libjava/
        * classpath/gnu/java/rmi/registry/RegistryImpl.java (version): Update
        copyright notice dates.
        * classpath/tools/gnu/classpath/tools/orbd/Main.java (run): Ditto.
        * gnu/gcj/convert/Convert.java (version): Update copyright notice
        dates.
        * gnu/gcj/tools/gcj_dbtool/Main.java (main): Ditto.
libquadmath/
        * libquadmath.texi: Bump @copying's copyright year.

--- gcc/ada/gnat_rm.texi (revision 232052)
+++ gcc/ada/gnat_rm.texi (working copy)
@@ -25,7 +25,7 @@ GNAT Reference Manual , November 18, 201
 AdaCore
-Copyright @copyright{} 2008-2015, Free Software Foundation
+Copyright @copyright{} 2008-2016, Free Software Foundation
 @end quotation
 @end copying
--- gcc/ada/gnat_ugn.texi (revision 232052)
+++ gcc/ada/gnat_ugn.texi (working copy)
@@ -25,7 +25,7 @@ GNAT User's Guide for Native Platforms ,
 AdaCore
-Copyright @copyright{} 2008-2015, Free Software Foundation
+Copyright @copyright{} 2008-2016, Free Software Foundation
 @end quotation
 @end copying
--- gcc/doc/cpp.texi (revision 232052)
+++ gcc/doc/cpp.texi (working copy)
@@ -10,7 +10,7 @@
 @copying
 @c man begin COPYRIGHT
-Copyright @copyright{} 1987-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1987-2016 Free Software Foundation, Inc.
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/cppinternals.texi (revision 232052)
+++ gcc/doc/cppinternals.texi (working copy)
@@ -18,7 +18,7 @@
 @ifinfo
 This file documents the internals of the GNU C Preprocessor.
-Copyright (C) 2000-2015 Free Software Foundation, Inc.
+Copyright (C) 2000-2016 Free Software Foundation, Inc.
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
@@ -47,7 +47,7 @@ into another language, under the above c
 @page
 @vskip 0pt plus 1filll
 @c man begin COPYRIGHT
-Copyright @copyright{} 2000-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 2000-2016 Free Software Foundation, Inc.
 Permission is granted to make and distribute verbatim copies of
 this manual provided the copyright notice and this permission notice
--- gcc/doc/gcc.texi (revision 232052)
+++ gcc/doc/gcc.texi (working copy)
@@ -40,7 +40,7 @@
 @c %**end of header
 @copying
-Copyright @copyright{} 1988-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2016 Free Software Foundation, Inc.
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gccint.texi (revision 232052)
+++ gcc/doc/gccint.texi (working copy)
@@ -26,7 +26,7 @@
 @c %**end of header
 @copying
-Copyright @copyright{} 1988-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1988-2016 Free Software Foundation, Inc.
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/gcov.texi (revision 232052)
+++ gcc/doc/gcov.texi (working copy)
@@ -4,7 +4,7 @@
 @ignore
 @c man begin COPYRIGHT
-Copyright @copyright{} 1996-2015 Free Software Foundation, Inc.
+Copyright @copyright{} 1996-2016 Free Software Foundation, Inc.
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- gcc/doc/install.texi (revision 232052)
+++ gcc/doc/install.texi (working copy)
@@ -44,7
Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint
On Mon, Jan 4, 2016 at 12:21 AM, Jakub Jelinek wrote:
> On Sun, Jan 03, 2016 at 07:11:58PM -0800, H.J. Lu wrote:
>> --- a/gcc/config/i386/predicates.md
>> +++ b/gcc/config/i386/predicates.md
>> @@ -951,6 +951,13 @@
>>         (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
>>         (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
>>
>> +; Return true when OP is operand acceptable for vector memory operand.
>> +; Only AVX can have misaligned memory operand.
>> +(define_predicate "vector_memory_operand"
>> +  (and (match_operand 0 "memory_operand")
>> +       (ior (match_test "TARGET_AVX")
>> +            (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)"
>
> Shouldn't this take into account the ssememalign attribute too?
> I mean, various instructions have some ssememalign > 8, which means they
> can't accept any alignment, but happily accept say >= 32-bit alignment
> or >= 64-bit alignment.  Though, ssememalign is an instruction attribute
> and the predicates/constraints don't have access to the current instruction.
> So maybe we need more constraints and more predicates, the ones you've added
> for ssememalign == 0 instructions, don't change anything in instructions
> with ssememalign == 8 (you've clearly changed some of them, and patch 3
> shows you've tried to partially undo it afterwards, but only the constraint,
> not the predicate, and only in one instruction), and use different
> predicates/constraints for ssememalign == {16,32,64} instructions.
>
>         Jakub

From INSTRUCTION EXCEPTION SPECIFICATION section in Intel SDM volume 2,
only legacy SSE instructions with memory operand not 16-byte aligned
get General Protection fault.  There is no need to check 1, 2, 4, 8 byte
alignments.  Since x86 backend has accurate constraints and predicates
for 16-byte alignment after my patches, there is no need for
ix86_legitimate_combined_insn nor ssememalign.

My followup patch will remove them.  I have tested it without regressions.
I will submit it after my patches have been checked in.

-- 
H.J.
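
As a reading aid for the quoted predicate, its alignment test can be modelled
in a few lines of C once the operand is already known to be a memory operand.
This is a sketch under assumptions, not the machine description itself:
vector_memory_operand_ok and vector_memory_operand_example are hypothetical
names, the boolean stands in for TARGET_AVX, and the two alignment arguments
stand in for MEM_ALIGN (op) and GET_MODE_ALIGNMENT (mode), assumed to be in
the same unit.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the alignment part of vector_memory_operand: with AVX any
   alignment is fine, otherwise the memory must be at least as aligned as
   the mode requires.  */
bool
vector_memory_operand_ok (bool target_avx, unsigned mem_align,
                          unsigned mode_align)
{
  return target_avx || mem_align >= mode_align;
}

/* Example with a mode that needs 128 units of alignment.  */
void
vector_memory_operand_example (void)
{
  /* AVX: a misaligned operand is still accepted.  */
  assert (vector_memory_operand_ok (true, 8, 128));

  /* No AVX: an under-aligned operand is rejected.  */
  assert (!vector_memory_operand_ok (false, 64, 128));

  /* No AVX: a sufficiently aligned operand is accepted.  */
  assert (vector_memory_operand_ok (false, 128, 128));
}
```

The model deliberately has no notion of per-instruction alignment such as
ssememalign, which is the point under discussion in this thread.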
Re: [testsuite][ARM target attributes] Fix effective_target tests
On 18 December 2015 at 15:16, Kyrill Tkachov wrote:
> Hi Christophe,
>
>
> On 17/12/15 22:17, Christophe Lyon wrote:
>>
>> Hi,
>>
>> Here is an updated version of this patch.
>> I did test it with
>> -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard in
>> addition to my usual set of options.
>>
>> Compared to the previous version:
>> - I added some doc in sourcebuild.texi
>> - I no longer modify arm_vfp_ok...
>> - I replaced all uses of arm_vfp with the new arm_fp because I found
>> that the existing tests do not actually need to pass -mfpu=vfp: this
>> is implicitly set as the default when using -mfloat-abi={softfp|hard}
>> - I chose not to remove arm_vfp_ok because we may need it in the
>> future, if a test really needs vfp (as opposed to neon for instance)
>> - in gcc.target/arm/attr-crypto.c I force the initial fpu to be vfp
>> via pragma instead, so that the next pragma fpu
>> fpu=crypto-neon-fp-armv8 is always compatible, regardless of the
>> command-line options/default fpu
>> - same for attr-neon2.c and attr-neon3.c
>> - I updated cmp-2.c, unsigned-float.c, vfp-1.c, vfp-ldmdbd.c,
>> vfp-ldmdbs.c, vfp-ldmiad.c, vfp-ldmias.c, vfp-stmdbd.c, vfp-stmdbs.c,
>> vfp-stmiad.c, vfp-stmias.c, vnmul-[1234].c to use the new arm_fp
>> effective target instead of arm_vfp. This is so that they don't need
>> to use -mfpu=vfp and can use the new dg-add-options arm_fp
>>
>> The validation results show (in addition to what I originally reported):
>> - attr-crypto.c and attr-neon3.c now ICE in some cases. This is PR68895.
>> - depending on the GCC configuration (e.g. --with-fpu=neon)
>> attr-neon3.c may fail. This is PR68896.
>>
>> OK?
>
>
> Thanks for following up on this.
> I think you also need to document the new arm_crypto_pragma_ok.
>
Indeed, I forgot it.

Here is a new version of the patch with a few words added to document
this function.
I did not modify the testcase after Christian's comments and
PR68934: my understanding is that the testcases are valid after
all and Christian is working on fixing the ICE.

2016-01-04  Christophe Lyon

        * doc/sourcebuild.texi (arm_crypto_pragma_ok): Document new entry.
        (arm_fp_ok): Likewise.
        (arm_fp): Likewise.
        (arm_crypto): Likewise.
        * lib/target-supports.exp
        (check_effective_target_arm_fp_ok_nocache): New.
        (check_effective_target_arm_fp_ok): New.
        (add_options_for_arm_fp): New.
        (check_effective_target_arm_crypto_ok_nocache): Require
        target_arm_v8_neon_ok instead of arm32.
        (check_effective_target_arm_crypto_pragma_ok_nocache): New.
        (check_effective_target_arm_crypto_pragma_ok): New.
        (add_options_for_arm_vfp): New.
        * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
        target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
        target instead. Force initial fpu to vfp.
        * gcc.target/arm/attr-neon-builtin-fail.c: Do not force
        -mfloat-abi=softfp, use arm_fp_ok effective target instead.
        * gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
        dependency.
        * gcc.target/arm/attr-neon2.c: Do not force -mfloat-abi=softfp,
        use arm_vfp effective target instead. Force initial fpu to vfp.
        * gcc.target/arm/attr-neon3.c: Likewise.
        * gcc.target/arm/cmp-2.c: Use arm_fp_ok effective target instead of
        arm_vfp_ok.
        * gcc.target/arm/unsigned-float.c: Likewise.
        * gcc.target/arm/vfp-1.c: Likewise.
        * gcc.target/arm/vfp-ldmdbd.c: Likewise.
        * gcc.target/arm/vfp-ldmdbs.c: Likewise.
        * gcc.target/arm/vfp-ldmiad.c: Likewise.
        * gcc.target/arm/vfp-ldmias.c: Likewise.
        * gcc.target/arm/vfp-stmdbd.c: Likewise.
        * gcc.target/arm/vfp-stmdbs.c: Likewise.
        * gcc.target/arm/vfp-stmiad.c: Likewise.
        * gcc.target/arm/vfp-stmias.c: Likewise.
        * gcc.target/arm/vnmul-1.c: Likewise.
        * gcc.target/arm/vnmul-2.c: Likewise.
        * gcc.target/arm/vnmul-3.c: Likewise.
        * gcc.target/arm/vnmul-4.c: Likewise.

OK?

Christophe.

> Kyrill
>
>
>> Christophe
>>
>> 2015-12-17  Christophe Lyon
>>
>>         * doc/sourcebuild.texi (arm_fp_ok): Document new entry.
>>         (arm_fp): Likewise.
>>         * lib/target-supports.exp
>>         (check_effective_target_arm_fp_ok_nocache): New.
>>         (check_effective_target_arm_fp_ok): New.
>>         (add_options_for_arm_fp): New.
>>         (check_effective_target_arm_crypto_ok_nocache): Require
>>         target_arm_v8_neon_ok instead of arm32.
>>         (check_effective_target_arm_crypto_pragma_ok_nocache): New.
>>         (check_effective_target_arm_crypto_pragma_ok): New.
>>         (add_options_for_arm_vfp): New.
>>         * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
>>         target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
>>         target instead. Force initial fpu to vfp.
>>         * gcc.target/arm/attr-neon-builtin-fail.c: Do not force
>>         -mfloat-abi=softfp, use arm_fp_ok effective target instead.
>>         * gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
>>         dependency.
Re: [testsuite][ARM target attributes] Fix effective_target tests
On 4 January 2016 at 15:20, Christophe Lyon wrote:
> On 18 December 2015 at 15:16, Kyrill Tkachov wrote:
[snip]
>> Thanks for following up on this.
>> I think you also need to document the new arm_crypto_pragma_ok.
>>
> Indeed, I forgot it.
>
> Here is a new version of the patch with a few words added to document
> this function.
> I did not modify the testcase after Christian's comments and
> PR68934: my understanding is that the testcases are valid after
> all and Christian is working on fixing the ICE.
>

With the attachment, this time...

[snip]
Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On 12/20/2015 05:27 PM, Segher Boessenkool wrote: On Fri, Dec 18, 2015 at 02:19:37AM +0100, Bernd Schmidt wrote: On 12/17/2015 10:07 PM, Segher Boessenkool wrote: It turns out v4 wasn't quite complete anyway; so here "v5". If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? This code is getting really quite confusing, and at the least I think we need more documentation of what exactly vec is supposed to contain at the entry to the inner while loop here. Same as in the other loop: vec is a stack of blocks that still need to be looked at. I can duplicate the comment if you want? No, I think more is needed. The inner loop looks like it should be emptying the vec, but this is not true if we break out of it, and your patch now even adds an explicit push. It also looks like it wants to use the bb_tmp bitmap to cache results for future iterations of the outer loop, but I'm not convinced this is actually correct. I can't follow this behaviour anymore without a clear description of intent. Also, it might be clearer to not modify "pro" in this loop - use a "cand" variable, and modify "pro" instead of last_ok, getting rid of the latter. That would be a regression (from GCC 5); but I understand your worry. How about we disable it if any further problems show up? Let's see whether we can make sense of this code and decide then. bernd
RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
-----Original Message----- From: Jeff Law [mailto:l...@redhat.com] Sent: Wednesday, December 23, 2015 12:06 PM To: Ajit Kumar Agarwal; Richard Biener Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation On 12/11/2015 02:11 AM, Ajit Kumar Agarwal wrote: > > Mibench/EEMBC benchmarks (Target Microblaze) > > Automotive_qsort1(4.03%), Office_ispell(4.29%), Office_stringsearch1(3.5%). > Telecom_adpcm_d( 1.37%), ospfv2_lite(1.35%). >>I'm having a real tough time reproducing any of these results. In fact, I'm >>having a tough time seeing cases where path splitting even applies to the >>Mibench/EEMBC benchmarks >>mentioned above. >>In the very few cases where split-paths might apply, the net resulting >>assembly code I get is the same with and without split-paths. >>How consistent are these results? I am consistently getting the gains for office_ispell, office_stringsearch1 and telecom_adpcm_d. I ran it again today and we see gains in the same benchmark tests with the split path changes. >>What functions are being affected that in turn impact performance? For office_ispell: The functions are Function "linit (linit, funcdef_no=0, decl_uid=2535, cgraph_uid=0, symbol_order=2) for lookup.c file". "Function checkfile (checkfile, funcdef_no=1, decl_uid=2478, cgraph_uid=1, symbol_order=4)" " Function correct (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)" " Function askmode (askmode, funcdef_no=24, decl_uid=2464, cgraph_uid=24, symbol_order=27)" for correct.c file. For office_stringsearch1: The function is Function "bmhi_search (bmhi_search, funcdef_no=1, decl_uid=2178, cgraph_uid=1, symbol_order=5)" for bmhisrch.c file. >>What options are you using to compile the benchmarks? I'm trying with >>-O2 -fsplit-paths and -O3 in my attempts to trigger the transformation so >>that I can look more closely at possible heuristics.
I am using the following flags. -O3 -mlittle-endian -mxl-barrel-shift -mno-xl-soft-div -mhard-float -mxl-float-convert -mxl-float-sqrt -mno-xl-soft-mul -mxl-multiply-high -mxl-pattern-compare. To disable split paths -fno-split-paths is used on top of the above flags. >>Is this with the standard microblaze-elf target? Or with some other target? I am using the --target=microblaze-xilinx-elf to build the microblaze target. Thanks & Regards Ajit jeff
[PATCH] Adjust contrib/update-copyright.py
Hi! One of the gfortran.dg/ tests has NVidia copyright, which made update-copyright.py stop changing anything further. Committed to trunk as obvious. 2016-01-04 Jakub Jelinek * update-copyright.py (GCCCopyright): Add NVIDIA Corporation as external author. --- contrib/update-copyright.py (revision 232054) +++ contrib/update-copyright.py (working copy) @@ -1,6 +1,6 @@ #!/usr/bin/python # -# Copyright (C) 2013 Free Software Foundation, Inc. +# Copyright (C) 2013-2016 Free Software Foundation, Inc. # # This script is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by @@ -696,6 +696,7 @@ class GCCCopyright (Copyright): self.add_external_author ('James Theiler, Brian Gough') self.add_external_author ('Makoto Matsumoto and Takuji Nishimura,') self.add_external_author ('National Research Council of Canada.') +self.add_external_author ('NVIDIA Corporation') self.add_external_author ('Peter Dimov and Multi Media Ltd.') self.add_external_author ('Peter Dimov') self.add_external_author ('Pipeline Associates, Inc.') Property changes on: contrib/update-copyright.py ___ Added: svn:executable ## -0,0 +1 ## +* \ No newline at end of property Jakub
Re: [PATCH, PR69043, fortran] Trying to include a directory causes an infinite loop
On 24/12/15 16:38, Jim MacArthur wrote: Bootstrapped and tested for regressions on x86_64-pc-linux-gnu. There is a test case for the bug included. I missed out the test case when creating the first patch. This one should have it. PR fortran/69043 * scanner.c (load_file): Abort and show an error if stat() shows the path is a directory. Index: gcc/fortran/scanner.c === --- gcc/fortran/scanner.c (revision 231945) +++ gcc/fortran/scanner.c (working copy) @@ -2200,6 +2200,8 @@ load_file (const char *realfilename, const char *d FILE *input; int len, line_len; bool first_line; + struct stat st; + int stat_result; const char *filename; /* If realfilename and displayedname are different and non-null then surely realfilename is the preprocessed form of @@ -2242,6 +2244,16 @@ load_file (const char *realfilename, const char *d current_file->filename, current_file->line, filename); return false; } + + stat_result = stat (realfilename, &st); + if (stat_result == 0 && st.st_mode & S_IFDIR) + { + fprintf (stderr, "%s:%d: Error: Included path '%s'" + " is a directory.\n", + current_file->filename, current_file->line, filename); + fclose (input); + return false; + } } /* Load the file. Index: gcc/testsuite/gfortran.dg/include_9.f === --- gcc/testsuite/gfortran.dg/include_9.f (revision 0) +++ gcc/testsuite/gfortran.dg/include_9.f (working copy) @@ -0,0 +1,7 @@ +! { dg-do compile } + + include '/' + program main + end program + +! { dg-error "is a directory" " " { target *-*-* } 3 }
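The directory check in the patch can be tried in isolation. The following is a minimal sketch, not the gfortran code: path_is_directory is a hypothetical helper name, and it uses the POSIX S_ISDIR macro instead of the patch's "st.st_mode & S_IFDIR" test. That substitution is an assumption about what is preferable here; S_ISDIR compares the whole file-type field, so a file type that merely shares a bit with S_IFDIR is not reported as a directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of gfortran: return true if PATH names
   an existing directory, false otherwise (including when stat fails).  */
bool
path_is_directory (const char *path)
{
  struct stat st;

  assert (path != NULL);

  if (stat (path, &st) != 0)
    return false;

  /* S_ISDIR masks the mode with S_IFMT before comparing, so only a
     real directory matches.  */
  return S_ISDIR (st.st_mode);
}
```

A caller in the position of load_file could test the path this way before reading from the stream and report the "is a directory" error when the helper returns true.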
Re: cilkplus fails without pthreads for me
On 01/01/2016 07:13 PM, Mike Stump wrote: cilkplus fails without pthreads for me: xg++: error: unrecognized command line option '-pthread' compiler exited with status 1 output is: xg++: error: unrecognized command line option '-pthread' > @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } { > return 0; > } > > +if { ! [check_effective_target_pthread] } { > + return 0; > +} > + I think you'll also want to revert Nathan's earlier change that adds just nvptx for the same reason. Ok with that change. Bernd
Re: [PATCH], PowerPC, add ISA 3.0 xxperm (power9 patch #12)
On Thu, Dec 31, 2015 at 1:30 PM, Michael Meissner wrote: > This patch adds support for the ISA 3.0 XXPERM instruction, which is like > VPERM, except it can operate on any VSX register. Since the instruction is a > 3 > operand instruction (RT and RA must be the same), I made it so VPERM was > preferred. I also added XXPERM fusion support where a XXLOR move instruction > immediately before the XXPERM instruction is fused together. > > I have bootstrapped and done make check on a big endian power7 and a little > endian power8 system. In addition, I built all of Spec 2006 with power9 > support enabled, and all of the tests that previously built now build with > XXPERM being generated (the OMNETPP benchmark currently does not build on > little endian for either power8 or power9). Are these patches ok to check in? > > [gcc] > 2015-12-31 Michael Meissner > > * config/rs6000/constraints.md (wo constraint): New constraint for > ISA 3.0 (power9). > > * config/rs6000/rs6000.c (rs6000_debug_reg_global): Add support > for wo constraint. > (rs6000_init_hard_regno_mode_ok): Likewise. > > * config/rs6000/rs6000.h (r6000_reg_class_enum): Add support for > wo constraint. > > * config/rs6000/altivec.md (altivec_vperm_): Clean up vperm > expanders not to have constraints. Add support for ISA 3.0 xxperm > instruction. Add support for fusing xxlor with xxperm. > (altivec_vperm__internal): Likewise. > (altivec_vperm_v8hiv16qi): Likewise. > (altivec_vperm_v16q): Likewise. > (altivec_vperm__uns): Likewise. > (vperm_v8hiv4si): Likewise. > (vperm_v16qiv8hi): Likewise. > > * doc/md.texi (RS/6000 constraints): Document wo constraint. > > [gcc/testsuite] > 2015-12-31 Michael Meissner > > * gcc.target/powerpc/p9-permute.c: New test for xxperm code > generation. This is okay. Thanks, David
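For readers less familiar with the permute instructions, the byte selection that VPERM performs can be modelled in plain C. This is only an illustrative sketch using big-endian byte numbering; permute_bytes is a made-up name, not a GCC or ISA identifier. The model also ignores what distinguishes XXPERM in the message above, namely that it works on any VSX register and that its target register doubles as one of the sources.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: each result byte is chosen from the 32-byte
   concatenation A || B by the low 5 bits of the matching selector byte.
   OUT must not alias A or B in this simple model.  */
void
permute_bytes (const uint8_t a[16], const uint8_t b[16],
               const uint8_t sel[16], uint8_t out[16])
{
  assert (out != a && out != b);

  for (int i = 0; i < 16; i++)
    {
      unsigned idx = sel[i] & 0x1f;   /* 0..15 picks from A, 16..31 from B.  */
      out[i] = idx < 16 ? a[idx] : b[idx - 16];
    }
}
```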
Re: [PATCH], PowerPC, Add -mpower9-dform to switches turned on with -mcpu=power9
On Thu, Dec 31, 2015 at 3:41 PM, Michael Meissner wrote: > When I did the initial d-form support for ISA 3.0 (power9) for loading scalar > SF/DF values into Altivec registers, I did not enable -mpower9-dform with the > other ISA 3.0 switches when you used -mcpu=power9. This was during the > initial development, I had some bugs. I fixed the bugs, but I forgot to enable the > d-form addressing support. This patch enables that by default. > > I have built all of Spec 2006 with this option, and there were no failures. I > did not do the full bootstrap/make check right now, but I have done it in the > past with no regressions. Is it ok to install this patch? > > 2015-12-31 Michael Meissner > > * config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Add > OPTION_MASK_P9_DFORM. > > (Note, at some point there will be patches to enable using d-form addressing > with 128-bit vector types, but those patches aren't ready yet). This is okay. Thanks, David
Re: cilkplus fails without pthreads for me
On 01/01/16 13:13, Mike Stump wrote: cilkplus fails without pthreads for me: xg++: error: unrecognized command line option '-pthread' compiler exited with status 1 output is: xg++: error: unrecognized command line option '-pthread' FAIL: c-c++-common/attr-simd-3.c -std=gnu++14 PR68158 (test for errors, line 5) I suspect pthreads is a fairly hard requirement. Either a test compile and link needs to be done, or we need to be able to whack out the tests on non-pthread systems. Ok? Probably not. See the discussion at https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01882.html Admittedly, that was annotating the test directly, but Rainer's comment suggests to me that requiring pthreads would be too great a hammer. You don't say what target -- is it a system where a target triplet is insufficient for this check? nathan
Re: [PATCH] c/68966 - atomic_fetch_* on atomic_bool not diagnosed
Hi Martin, On Sun, Jan 03, 2016 at 08:03:20PM -0700, Martin Sebor wrote: > Index: gcc/doc/extend.texi > === > --- gcc/doc/extend.texi (revision 232047) > +++ gcc/doc/extend.texi (working copy) > @@ -9238,6 +9238,8 @@ > @{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; @} // nand > @end smallexample > > +The object pointed to by the first argument must of integer or pointer type. > It must not be a Boolean type. Too long line and missing "be " after "must"? > +The same constraints on arguments apply as for the corresponding > @code{__sync_op_and_fetch} built-in functions. > + Too long line. > -All memory orders are valid. > +The object pointed to by the first argument must of integer or pointer type. > It must not be a Boolean type. All memory orders are valid. Too long line and missing "be " after "must"? > +The same constraints on arguments apply as for the corresponding > @code{__atomic_op_fetch} built-in functions. All memory orders are valid. Too long line. > @@ -10686,12 +10691,16 @@ >if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type)) > goto incompatible; > > + if (fetch && TREE_CODE (type) == BOOLEAN_TYPE) > + goto incompatible; This goto is indented two more spaces than it should be. > @@ -11250,6 +11259,11 @@ > vec *params) > { >enum built_in_function orig_code = DECL_FUNCTION_CODE (function); > + > + /* Is function is one of the _FETCH_OP_ or _OP_FETCH_ built-ins? I think drop the second "is". > @@ -11325,6 +11339,9 @@ > case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N: > case BUILT_IN_ATOMIC_LOAD_N: > case BUILT_IN_ATOMIC_STORE_N: > + { > + fetch_op = false; > + } Let's either remove those {} or add a fallthrough comment as done above. > @@ -11358,7 +11375,16 @@ > case BUILT_IN_SYNC_LOCK_TEST_AND_SET_N: > case BUILT_IN_SYNC_LOCK_RELEASE_N: >{ > - int n = sync_resolve_size (function, params); > + /* The following are not _FETCH_OPs and must be accepted with > +pointers to _Bool (or C++ bool). 
*/ > + if (fetch_op) > + fetch_op = > + orig_code != BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_N > + && orig_code != BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_N > + && orig_code != BUILT_IN_SYNC_LOCK_TEST_AND_SET_N > + && orig_code != BUILT_IN_SYNC_LOCK_RELEASE_N; > + Trailing whitespaces on this line. And I think add () around the RHS of the assignment to fetch_op. > Index: gcc/testsuite/gcc.dg/atomic-fetch-bool.c > === > --- gcc/testsuite/gcc.dg/atomic-fetch-bool.c (revision 0) > +++ gcc/testsuite/gcc.dg/atomic-fetch-bool.c (working copy) > @@ -0,0 +1,64 @@ > +/* PR c/68966 - atomic_fetch_* on atomic_bool not diagnosed > + Test to verify that calls to __atomic_fetch_op funcions with a _Bool > + argument are rejected. This is necessary because GCC expects that > + all initialized _Bool objects have a specific representation and > + allowing atomic operations to change it would break the invariant. */ > +/* { dg-do compile } */ > +/* { dg-options "-std=c11" } */ Doesn't matter here, but probably add -pedantic-errors. > Index: gcc/testsuite/gcc.dg/sync-fetch-bool.c > === > --- gcc/testsuite/gcc.dg/sync-fetch-bool.c(revision 0) > +++ gcc/testsuite/gcc.dg/sync-fetch-bool.c(working copy) > @@ -0,0 +1,54 @@ > +/* PR c/68966 - atomic_fetch_* on atomic_bool not diagnosed > + Test to verify that calls to __sync_fetch_op funcions with a _Bool > + argument are rejected. This is necessary because GCC expects that > + all initialized _Bool objects have a specific representation and > + allowing atomic operations to change it would break the invariant. */ > +/* { dg-do compile } */ > +/* { dg-options "-std=c99" } */ As the testcase uses _Atomic, I wonder why there's -std=c99. I'd use -std=c11 -pedantic-errors. Thanks, Marek
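To illustrate the distinction the patch draws, here is a small sketch using GCC's documented __atomic built-ins. The fetch-and-op form is applied to an int, while the _Bool flag is only updated through a compare-exchange, one of the built-ins for which the quoted hunk sets fetch_op to false. The function names are invented for the example and do not come from the patch or its tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fetch-and-op on an integer object: returns the old value.  */
int
bump_counter (int *counter)
{
  assert (counter != NULL);
  return __atomic_fetch_add (counter, 1, __ATOMIC_SEQ_CST);
}

/* For a _Bool, use compare-exchange: it only ever stores a valid
   Boolean value.  Returns true if this call changed the flag from
   false to true.  */
bool
claim_flag (bool *flag)
{
  bool expected = false;

  assert (flag != NULL);
  return __atomic_compare_exchange_n (flag, &expected, true,
                                      false /* strong */,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

With the patch under review, replacing the compare-exchange by something like __atomic_fetch_or on the bool object would be expected to produce a diagnostic instead of compiling silently.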
Re: [Patch ifcvt] Add a new parameter to limit if-conversion
On 12/31/2015 10:21 AM, Yuri Rumyantsev wrote: Here is a slightly modified patch which limits the number of conditional moves instead of changing the conditional branch cost. This is in fact a work-around for a very poor cost model, which needs to be enhanced to evaluate the cost of a conditional move that could be greater than the cost of an ordinary move (for some targets). This fix did not show any performance regressions on different x86 platforms in comparison with James's patch. I think this is OK. In the future, when attaching patches, please make sure they are text/plain so they are displayed by mail readers and can be quoted. Bernd
Mark oacc kernels fns
There's currently no robust predicate to determine whether an oacc offload function is for a kernels region (as opposed to a parallel region). The test in tree-ssa-loop.c uses the heuristic of seeing if all the dimensions are defaulted (which can easily be true for parallel offloads at that point). This patch marks TREE_PUBLIC on the offload attribute values, to note kernels regions, and adds a predicate to check that. I also broke out the function level determination from oacc_validate_dims, as there it was only laziness on my part to have not done that earlier. Using these predicates improves the dump output of the openacc device lowering pass too. ok? nathan 2016-01-04 Nathan Sidwell * omp-low.h (oacc_fn_attrib_kernels_p): Declare. * omp-low.c (set_oacc_fn_attrib): Add IS_KERNEL arg. (oacc_fn_attrib_kernels_p, oacc_fn_attrib_level): New. (expand_omp_target): Pass is_kernel to set_oacc_fn_attrib. (oacc_validate_dims): Add LEVEL arg, don't return level. (new_oacc_loop_routine): Use oacc_fn_attrib_level, not oacc_validate_dims. (execute_oacc_device_lower): Adjust, add more dump output. * tree-ssa-loop.c (gate_oacc_kernels): Use oacc_fn_attrib_kernels_p. Index: gcc/omp-low.c === --- gcc/omp-low.c (revision 232057) +++ gcc/omp-low.c (working copy) @@ -12395,10 +12395,11 @@ replace_oacc_fn_attrib (tree fn, tree di /* Scan CLAUSES for launch dimensions and attach them to the oacc function attribute. Push any that are non-constant onto the ARGS - list, along with an appropriate GOMP_LAUNCH_DIM tag. */ + list, along with an appropriate GOMP_LAUNCH_DIM tag. IS_KERNEL is + true, if these are for a kernels region offload function. */ static void -set_oacc_fn_attrib (tree fn, tree clauses, vec *args) +set_oacc_fn_attrib (tree fn, tree clauses, bool is_kernel, vec *args) { /* Must match GOMP_DIM ordering. 
*/ static const omp_clause_code ids[] @@ -12423,6 +12424,9 @@ set_oacc_fn_attrib (tree fn, tree clause non_const |= GOMP_DIM_MASK (ix); } attr = tree_cons (NULL_TREE, dim, attr); + /* Note kernelness with TREE_PUBLIC. */ + if (is_kernel) + TREE_PUBLIC (attr) = 1; } replace_oacc_fn_attrib (fn, attr); @@ -12491,6 +12495,36 @@ get_oacc_fn_attrib (tree fn) return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn)); } +/* Return true if this oacc fn attrib is for a kernels offload + region. We use the TREE_PUBLIC flag of each dimension -- only + need to check the first one. */ + +bool +oacc_fn_attrib_kernels_p (tree attr) +{ + return TREE_PUBLIC (TREE_VALUE (attr)); +} + +/* Return level at which oacc routine may spawn a partitioned loop, or + -1 if it is not a routine (i.e. is an offload fn). */ + +static int +oacc_fn_attrib_level (tree attr) +{ + tree pos = TREE_VALUE (attr); + + if (!TREE_PURPOSE (pos)) +return -1; + + int ix = 0; + for (ix = 0; ix != GOMP_DIM_MAX; + ix++, pos = TREE_CHAIN (pos)) +if (!integer_zerop (TREE_PURPOSE (pos))) + break; + + return ix; +} + /* Extract an oacc execution dimension from FN. FN must be an offloaded function or routine that has already had its execution dimensions lowered to the target-specific values. 
*/ @@ -12808,6 +12842,7 @@ expand_omp_target (struct omp_region *re enum built_in_function start_ix; location_t clause_loc; unsigned int flags_i = 0; + bool oacc_kernels_p = false; switch (gimple_omp_target_kind (entry_stmt)) { @@ -12827,8 +12862,10 @@ expand_omp_target (struct omp_region *re start_ix = BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA; flags_i |= GOMP_TARGET_FLAG_EXIT_DATA; break; -case GF_OMP_TARGET_KIND_OACC_PARALLEL: case GF_OMP_TARGET_KIND_OACC_KERNELS: + oacc_kernels_p = true; + /* FALLTHROUGH */ +case GF_OMP_TARGET_KIND_OACC_PARALLEL: start_ix = BUILT_IN_GOACC_PARALLEL; break; case GF_OMP_TARGET_KIND_OACC_DATA: @@ -13010,7 +13047,7 @@ expand_omp_target (struct omp_region *re break; case BUILT_IN_GOACC_PARALLEL: { - set_oacc_fn_attrib (child_fn, clauses, &args); + set_oacc_fn_attrib (child_fn, clauses, oacc_kernels_p, &args); tagging = true; } /* FALLTHRU */ @@ -18929,17 +18966,17 @@ oacc_xform_loop (gcall *call) } /* Validate and update the dimensions for offloaded FN. ATTRS is the - raw attribute. DIMS is an array of dimensions, which is returned. - Returns the function level dimensionality -- the level at which an - offload routine wishes to partition a loop. */ + raw attribute. DIMS is an array of dimensions, which is filled in. + LEVEL is the partitioning level of a routine, or -1 for an offload + region itself. */ -static int -oacc_validate_dims (tree fn, tree attrs, int *dims) +static void +oacc_validate_dims (tree fn, tree attrs, int *dims, int level) { tree purpose[GOMP_DIM_MAX]; unsigned ix; tree pos =
Re: varpool/constpool bug
My patch to stop constant pool objects accidentally ending up in the varpool caused problems with (at least) powerpc. (https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02100.html) Hence reverted. This patch changes compare_base_decls to simply use the varpool getter, rather than get_create. We still need the preceding decl_in_symtab_p to filter out decls that should never be in the varpool (the getter has an assert to check you're not trying to abuse it). ok? nathan 2016-01-04 Nathan Sidwell gcc/ * alias.c (compare_base_decls): Use symtab_node::get. gcc/testsuite/ * gcc.dg/alias-15.c: New. Index: alias.c === --- alias.c (revision 232057) +++ alias.c (working copy) @@ -2044,8 +2044,15 @@ compare_base_decls (tree base1, tree bas || !decl_in_symtab_p (base2)) return 0; - ret = symtab_node::get_create (base1)->equal_address_to - (symtab_node::get_create (base2), true); + /* Don't cause symbols to be inserted by the act of checking. */ + symtab_node *node1 = symtab_node::get (base1); + if (!node1) +return 0; + symtab_node *node2 = symtab_node::get (base2); + if (!node2) +return 0; + + ret = node1->equal_address_to (node2, true); if (ret == 2) return -1; return ret; Index: testsuite/gcc.dg/alias-15.c === --- testsuite/gcc.dg/alias-15.c (revision 0) +++ testsuite/gcc.dg/alias-15.c (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O2 -fdump-ipa-cgraph" } */ + +/* RTL-level CSE shouldn't introduce LCO (for the string) into varpool */ +char *p; + +void foo () +{ + p = "abc\n"; + + while (*p != '\n') +p++; +} + +/* { dg-final { scan-ipa-dump-not "LC0" "cgraph" } } */
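The difference between a plain lookup and a get-or-create lookup is the crux of this fix. The toy table below is purely illustrative, and none of its names exist in GCC: table_get never modifies the table, whereas table_get_create inserts on a miss. A comparison built on the former cannot grow the table as a side effect, which is the property the patch wants for compare_base_decls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TABLE_MAX 8

/* Toy stand-in for a symbol table; not a GCC data structure.  */
struct table
{
  const char *keys[TABLE_MAX];
  size_t count;
};

/* Pure lookup: return the slot index, or -1 if KEY is absent.  */
int
table_get (const struct table *t, const char *key)
{
  for (size_t i = 0; i < t->count; i++)
    if (strcmp (t->keys[i], key) == 0)
      return (int) i;
  return -1;
}

/* Lookup that inserts KEY on a miss and returns its slot.  */
int
table_get_create (struct table *t, const char *key)
{
  int slot = table_get (t, key);
  if (slot >= 0)
    return slot;
  assert (t->count < TABLE_MAX);
  t->keys[t->count] = key;
  return (int) t->count++;
}

/* Side-effect-free comparison: an absent entry is reported as
   "not the same", and nothing is inserted.  */
bool
same_entry (const struct table *t, const char *a, const char *b)
{
  int sa = table_get (t, a);
  int sb = table_get (t, b);
  return sa >= 0 && sb >= 0 && sa == sb;
}
```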
Re: [PATCH] PR/68089: C++-11: Ignore "alignas(0)".
On 01/04/2016 04:33 AM, Dominik Vogt wrote: On Fri, Jan 01, 2016 at 05:53:08PM -0700, Martin Sebor wrote: On 12/31/2015 04:50 AM, Dominik Vogt wrote: The attached patch fixes C++-11 handling of "alignas(0)" which should be ignored but currently generates an error message. A test case is included; the patch has been tested on S390x. Since it's a language issue it should be independent of the backend used. The patch doesn't handle value-dependent expressions(*). It seems that the problem is in handle_aligned_attribute() calling check_user_alignment() with the second argument (ALLOW_ZERO) set to false. Calling it with true fixes the problem and handles value-dependent expressions (I haven't done any more testing beyond that). Like the attached patch? (Passes the testsuite on s390x.) Yes, like that (though someone other than me needs to approve your patch). But wouldn't an "aligned" attribute be added, allowing the backend to possibly generate an error or a warning? AFAICS, both the C and C++ front ends ignore the attribute when check_user_alignment() returns -1 (either on error or when the requested alignment is zero and ALLOW_ZERO is true). Martin PS I wonder what it is about this thread that makes my email client (Thunderbird) include only gcc-patches and krebbel when I hit Reply All and not you. (I had to manually add your email.) It looks like your reply back to me did the same thing. Martin
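For comparison, C11 has the same rule for _Alignas: an alignment specification of zero has no effect. The sketch below is written in C, not C++, and is not taken from the patch or its test case; the struct and function names are invented. It checks that a zero alignment request leaves the layout of the member unchanged.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct plain
{
  char c;
  int i;
};

struct zero_aligned
{
  char c;
  alignas (0) int i;   /* Zero alignment: expected to have no effect.  */
};

/* Compile-time check that the overall size is unchanged.  */
static_assert (sizeof (struct zero_aligned) == sizeof (struct plain),
               "alignas (0) must not change the struct size");

/* Return nonzero if alignas (0) left the offset of `i` unchanged.  */
int
zero_alignment_is_ignored (void)
{
  return offsetof (struct zero_aligned, i) == offsetof (struct plain, i);
}
```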
Re: cilkplus fails without pthreads for me
On 01/04/16 10:06, Bernd Schmidt wrote: On 01/01/2016 07:13 PM, Mike Stump wrote: cilkplus fails without pthreads for me: xg++: error: unrecognized command line option '-pthread' compiler exited with status 1 output is: xg++: error: unrecognized command line option '-pthread' > @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } { > return 0; > } > > +if { ! [check_effective_target_pthread] } { > + return 0; > +} > + I think you'll also want to revert Nathan's earlier change that adds just nvptx for the same reason. Ok with that change. Yes please. nathan
guality test suite fix
So, I’d like for the guality people to chime in. I only kick in, if they fail to do so for any reason. :-) Either, the stuff downstream _must_ arrange for newline ended content, or this code has to do it, if they don’t. My take, I think they are signing up for newline terminated content: binutils/gdb$ grep 'will be killed' *.c top.c:_("\tInferior %d [%s] will be killed.\n"), inf->num, commit b8fa0bfa752bb672c66a1d6fdefcdf4cb308a712 Author: Pedro Alves Date: Fri Aug 14 14:28:15 2009 + 2009-08-14 Pedro Alves gdb/ * top.c (any_thread_of): Delete. (kill_or_detach): Use any_thread_of_process. * top.c (print_inferior_quit_action): New. (quit_confirm): Rewrite to print info about all inferiors. * target.c (dispose_inferior): New. (target_preopen): Use it. 2009-08-14 Pedro Alves gdb/testsuite/ * gdb.threads/killed.exp, gdb.threads/manythreads.exp, gdb.threads/staticthreads.exp: Adjust to "quit" output changes. + _("\tInferior %d [%s] will be killed.\n"), inf->num, So, the question is, what output this line, stripping that newline? As far as I can tell, there is no gdb that won’t print it since 2009. The other possible patch would be the routine that read that from gdb and printed it without the newline. I’m not sure if that patch or your patch is better. 
On Jan 4, 2016, at 4:16 AM, Alan Lawrence wrote: > On 10/12/15 10:31, Alan Lawrence wrote: >> Runs of the guality testsuite can sometimes end up with gcc.log containing >> malformed lines like: >> >> A debugging session is active.PASS: gcc.dg/guality/pr36728-1.c -O2 line >> 18 arg4 == 4 >> A debugging session is active.PASS: gcc.dg/guality/restrict.c -O2 line 30 >> type:ip == int * >> Inferior 1 [process 27054] will be killed.PASS: >> gcc.dg/guality/restrict.c -O2 line 30 type:cicrp == const int * const >> restrict >> Inferior 1 [process 27160] will be killed.PASS: >> gcc.dg/guality/restrict.c -O2 line 30 type:cvirp == int * const volatile >> restrict >> >> This patch just makes sure the PASS/FAIL comes at the beginning of a line. >> (At >> the slight cost of adding some extra newlines not in the actual test output.) >> >> I moved the remote_close target calls earlier, to avoid any possible race >> condition of extra output being generated after the newline - this may not be >> strictly necessary. >> >> Tested on aarch64-none-linux-gnu and x86_64-none-linux-gnu. >> >> I think this is reasonable for stage 3 - OK for trunk? >> >> gcc/testsuite/ChangeLog: >> * lib/gcc-gdb-test.exp (gdb-test): call remote_close earlier, and send >> newline to log, before calling pass/fail/unsupported. >> * lib/gcc-simulate-thread.exp (simulate-thread): Likewise. 
>> --- >> gcc/testsuite/lib/gcc-gdb-test.exp| 15 ++- >> gcc/testsuite/lib/gcc-simulate-thread.exp | 10 +++--- >> 2 files changed, 17 insertions(+), 8 deletions(-) >> >> diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp >> b/gcc/testsuite/lib/gcc-gdb-test.exp >> index d3ba6e4..f60cabf 100644 >> --- a/gcc/testsuite/lib/gcc-gdb-test.exp >> +++ b/gcc/testsuite/lib/gcc-gdb-test.exp >> @@ -84,8 +84,9 @@ proc gdb-test { args } { >> remote_expect target [timeout_value] { >> # Too old GDB >> -re "Unhandled dwarf expression|Error in sourced command file|> type in " { >> -unsupported "$testname" >> remote_close target >> +send_log "\n" >> +unsupported "$testname" >> file delete $cmd_file >> return >> } >> @@ -93,7 +94,9 @@ proc gdb-test { args } { >> -re {[\n\r]\$1 = ([^\n\r]*)[\n\r]+\$2 = ([^\n\r]*)[\n\r]} { >> set first $expect_out(1,string) >> set second $expect_out(2,string) >> +remote_close target >> if { $first == $second } { >> +send_log "\n" >> pass "$testname" >> } else { >> # We need the -- to disambiguate $first from an option, >> @@ -101,7 +104,6 @@ proc gdb-test { args } { >> send_log -- "$first != $second\n" >> fail "$testname" >> } >> -remote_close target >> file delete $cmd_file >> return >> } >> @@ -116,26 +118,29 @@ proc gdb-test { args } { >> regsub -all {\mlong int\M} $type "long" type >> regsub -all {\mshort int\M} $type "short" type >> set expected [lindex $args 2] >> +remote_close target >> if { $type == $expected } { >> +send_log "\n" >> pass "$testname" >> } else { >> send_log -- "$type != $expected\n" >> fail "$testname" >> } >> -remote_close target >> file delete $cmd_file >> return >> } >> timeout { >> -unsupported "$testname" >> remote_close target >> +send_log "\n" >> +
[gomp4] Fix acc_on_device for C++
This patch fixes acc_on_device's C++ wrapper when compiling at -O0. The wrapper isn't inlined, and we need to mark the function as needing emission by the device compiler too. nathan 2016-01-04 Nathan Sidwell * openacc.c (acc_on_device): Add routine pragma for C++ wrapper. * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: New. Index: libgomp/openacc.h === --- libgomp/openacc.h (revision 232058) +++ libgomp/openacc.h (working copy) @@ -121,6 +121,7 @@ int acc_set_cuda_stream (int, void *) __ /* Forwarding function with correctly typed arg. */ +#pragma acc routine seq inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW { return acc_on_device ((int) __arg); Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c === --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c (revision 0) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-additional-options "-O0" } */ + +#include + +/* acc_on_device might not be folded at -O0, but it should work. */ + +int main () +{ + int dev; + +#pragma acc parallel copyout (dev) + { +dev = acc_on_device (acc_device_not_host); + } + + int expect = 1; + +#if ACC_DEVICE_TYPE_host + expect = 0; +#endif + + return dev != expect; +}
Re: [PATCH 1/3] Fix logic bug in Cilk Plus array expansion
On 01/02/2016 04:26 PM, Patrick Palka wrote: On Sat, Jan 2, 2016 at 3:21 AM, Jakub Jelinek wrote: On Fri, Jan 01, 2016 at 10:06:34PM -0700, Jeff Law wrote: gcc/cp/ChangeLog: * cp-array-notation.c (cp_expand_cond_array_notations): Return error_mark_node only if find_rank failed, not if it was successful. Can you use -fdump-tree-original in the testcase and verify there's no <<< error >>> expressions in the resulting dump file? With that change, this is OK. I think the patch is incomplete. Because, find_rank does not always emit an error if it returns false, so we again have cases where we can get error_mark_node in the code without error being emitted. else if (*rank != current_rank) { /* In this case, find rank is being recursed through a set of expression of the form A B, where A and B both have array notations in them and the rank of A is not equal to rank of B. A simple example of such case is the following: X[:] + Y[:][:] */ *rank = current_rank; return false; } and other spots. E.g. if (prev_arg && EXPR_HAS_LOCATION (prev_arg)) error_at (EXPR_LOCATION (prev_arg), "rank mismatch between %qE and %qE", prev_arg, TREE_OPERAND (expr, ii)); looks very suspicious. Hmm, good point. Here's a contrived test case that causes find_rank to return false without emitting an error message thus we again end up with an error_mark_node in the gimplifier: /* { dg-do compile } */ /* { dg-options "-fcilkplus" } */ void foo() {} #define ALEN 1024 int main(int argc, char* argv[]) { typedef void (*f) (void *); f b[ALEN], c[ALEN][ALEN]; (b[:]) ((void *)c[:][:]); _Cilk_spawn foo(); return 0; } But this patch was intended to only fix the testsuite fallout that patch 3 would have otherwise caused, and not to e.g. fix all the bugs with find_rank. (BTW patch 3 also makes this test case trigger an ICE, instead of being silently miscompiled.) Can you please include this test (xfailed) when you commit patch #1. I think you want the test to scan for error_mark_node in the gimplified dump. Jeff
Re: varpool/constpool bug
On 01/04/2016 08:57 AM, Nathan Sidwell wrote: My patch to stop constant pool objects accidentally ending up in the varpool caused problems with (at least) powerpc. (https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02100.html) Hence reverted. This patch changes compare_base_decls to simply use the varpool getter, rather than get_create. We still need the preceding decl_in_symtab_p to filter out decls that should never be in the varpool (the getter has an assert to check you're not trying to abuse it). ok? Once it passes the usual bootstrap & regression testing. Looking at it again, it seems "obvious" now that the act of comparing things for alias analysis shouldn't be inserting new things into the tables. jeff
Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c
On 12/24/2015 04:55 AM, Alan Lawrence wrote:
> This version changes the test cases to fix failures on some platforms, by
> rewriting the initializers so that they aren't pushed out to the constant
> pool.
>
> gcc/ChangeLog:
>
> * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and ARRAY_REF
> using get_ref_base_and_extent.
> (equal_mem_array_ref_p): New.
> (hashable_expr_equal_p): Add call to previous.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/ssa-dom-cse-5.c: New.
> * gcc.dg/tree-ssa/ssa-dom-cse-6.c: New.
> * gcc.dg/tree-ssa/ssa-dom-cse-7.c: New.

This is fine.

Thanks,
Jeff
Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c
On 12/21/2015 06:13 AM, Alan Lawrence wrote:
> This is a respin of patches
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03266.html and
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03267.html, which were "too
> quickly" approved before concerns with efficiency were pointed out.
>
> I tried to change the hashing just in tree-ssa-dom.c using C++ subclassing,
> but couldn't cleanly separate this out from tree-ssa-scopedtables and
> tree-ssa-threadedge.c due to use of avail_exprs_stack. So I figured it was
> probably appropriate to use the equivalences in jump threading too. Also,
> using get_ref_base_and_extent unifies handling of MEM_REFs and ARRAY_REFs
> (hence only one patch rather than two).

It is appropriate.

> I've added a couple of testcases that show the improvement in DOM, but in
> all cases I had to disable FRE, even PRE, to get any improvement, apart
> from on ssa-dom-cse-2.c itself (where on the affected platforms FRE still
> does not do the optimization). This makes me wonder if this is the right
> approach or whether changing the references output by SRA (as per
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01490.html , judged as a hack
> to SRA to work around limitations in DOM - or is it?) would be better.

I just doubt it happens all that much.

Jeff
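As a rough illustration of why normalizing both reference forms to a base, offset and size lets them hash and compare equal, here is a minimal sketch. `struct ref_key` and its helpers are hypothetical; GCC's real code works on trees via get_ref_base_and_extent, which this sketch does not call or reproduce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical normalized memory reference: identity of the base
   object plus a byte offset and an access size.  */
struct ref_key
{
  uintptr_t base;
  long offset;
  long size;
};

/* Normalize "base[index]" with elements of ELT_SIZE bytes.  */
static struct ref_key
key_from_array_ref (const void *base, long index, long elt_size)
{
  struct ref_key k = { (uintptr_t) base, index * elt_size, elt_size };
  return k;
}

/* Normalize an access of SIZE bytes at BASE + BYTE_OFFSET.  */
static struct ref_key
key_from_mem_ref (const void *base, long byte_offset, long size)
{
  struct ref_key k = { (uintptr_t) base, byte_offset, size };
  return k;
}

static bool
ref_key_equal (struct ref_key a, struct ref_key b)
{
  return a.base == b.base && a.offset == b.offset && a.size == b.size;
}

/* Hash only the normalized fields, so equal keys always hash equally.  */
static unsigned long
ref_key_hash (struct ref_key k)
{
  unsigned long h = (unsigned long) k.base;
  h = h * 31 + (unsigned long) k.offset;
  h = h * 31 + (unsigned long) k.size;
  return h;
}
```

The design point is that hashing and equality both look only at the normalized triple, never at the syntactic form of the reference.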
Re: cilkplus fails without pthreads for me
On Jan 4, 2016, at 7:22 AM, Nathan Sidwell wrote:
> On 01/01/16 13:13, Mike Stump wrote:
>> cilkplus fails without pthreads for me:
>>
>> xg++: error: unrecognized command line option '-pthread'
>> compiler exited with status 1
>> output is:
>> xg++: error: unrecognized command line option '-pthread'
>>
>> FAIL: c-c++-common/attr-simd-3.c -std=gnu++14 PR68158 (test for errors,
>> line 5)
>>
>> I suspect pthreads is a fairly hard requirement. Either a test compile and
>> link needs to be done, or we need to be able to whack out the tests on
>> non-pthread systems.
>>
>> Ok?
>
> Probably not. See the discussion at
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01882.html Admittedly, that
> was annotating the test directly, but Rainer's comment suggests to me that
> requiring pthreads would be too great a hammer.
>
> You don't say what target -- is it a system where a target triplet is
> insufficient for this check?

That was on purpose. All non-pthreads targets. One cannot ascertain if a system has pthreads by checking a target triplet. This is the problem I want fixed. Adding a clause for one such target doesn’t fix all such targets.

I didn’t read Rainer’s comments as authoritative for the design of cilk. I also don’t read them as inconsistent with my proposed patch. Since Bernd OK'd it, and that is consistent with the apparent design to me, I’m going with his approval.

Here is my take: the runtime is written to require pthreads, that’s just how it is. Since it is, the testing for it is going to require pthreads. That’s just how it is. We gate off all tests that require cilk on systems that don’t have pthreads. Special escapes from the general rule can happen before or after the newly added clause on a per target or some other metric.

The next proposed patch is:

Index: target-supports.exp
===
--- target-supports.exp (revision 232062)
+++ target-supports.exp (working copy)
@@ -1442,11 +1442,6 @@ proc check_effective_target_cilkplus { }
         return 0;
     }
-# No pthreads on NVPTX
-if { [istarget nvptx-*-*] } {
-    return 0;
-}
-
 if { ! [check_effective_target_pthread] } {
     return 0;
 }

I believe this is now neither required nor desirable. The attr-simd-3.c test case on NVPTX should be able to show if this is on the right track.

Ok?
Re: cilkplus fails without pthreads for me
On Jan 4, 2016, at 9:09 AM, Nathan Sidwell wrote:
> On 01/04/16 10:06, Bernd Schmidt wrote:
>> On 01/01/2016 07:13 PM, Mike Stump wrote:
>>> cilkplus fails without pthreads for me:
>>>
>>> xg++: error: unrecognized command line option '-pthread' compiler
>>> exited with status 1 output is: xg++: error: unrecognized command
>>> line option '-pthread'
>>
>> > @@ -1450,6 +1450,10 @@ proc check_effective_target_cilkplus { } {
>> > return 0;
>> > }
>> >
>> > +if { ! [check_effective_target_pthread] } {
>> > + return 0;
>> > +}
>> > +
>>
>> I think you'll also want to revert Nathan's earlier change that adds just
>> nvptx for the same reason. Ok with that change.
>
> Yes please.

I believe that patch has:

+/* { dg-do compile { target cilkplus } } */

in it, and this I believe is required for the test to be skipped on my target?
Re: cilkplus fails without pthreads for me
On 01/04/16 14:19, Mike Stump wrote:
> I believe that patch has:
>
> +/* { dg-do compile { target cilkplus } } */
>
> in it, and this I believe is required for the test to be skipped on my
> target?

that bit is still necessary. It's the bit in the .exp file testing nvptx-*-* that's no longer needed.

nathan
Re: cilkplus fails without pthreads for me
On 01/04/16 14:17, Mike Stump wrote:
> The next proposed patch is:
>
> Index: target-supports.exp
> ===
> --- target-supports.exp (revision 232062)
> +++ target-supports.exp (working copy)
> @@ -1442,11 +1442,6 @@ proc check_effective_target_cilkplus { }
>          return 0;
>      }
> -# No pthreads on NVPTX
> -if { [istarget nvptx-*-*] } {
> -    return 0;
> -}
> -
>  if { ! [check_effective_target_pthread] } {
>      return 0;
>  }
>
> I believe this is now, not required nor desirable. The attr-simd-3.c test
> case on NVPTX should be able to show if this is on the right track.
>
> Ok?

works for me, thanks.
[PATCH] Fix SLP ICE (PR tree-optimization/69083)
Hi!

The vec-cmp SLP patch added

+         if (VECTOR_BOOLEAN_TYPE_P (vector_type))
+           {
+             /* Can't use VIEW_CONVERT_EXPR for booleans because
+                of possibly different sizes of scalar value and
+                vector element.  */
...
+           }

hunk a few lines above this spot, but that only handles constants. For non-constants, the problem is similar, boolean vector element type might have different size from the op's type, but it really should be fold convertible to that, so while we can't use VCE, we can use a NOP_EXPR instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-04  Jakub Jelinek

        PR tree-optimization/69083
        * tree-vect-slp.c (vect_get_constant_vectors): For
        VECTOR_BOOLEAN_TYPE_P assert op is fold_convertible_p to vector_type's
        element type. If op is fold_convertible_p to vector_type's element
        type, use NOP_EXPR instead of VCE.

        * gcc.dg/vect/pr69083.c: New test.

--- gcc/tree-vect-slp.c.jj 2015-12-18 09:38:27.0 +0100
+++ gcc/tree-vect-slp.c 2016-01-04 12:56:20.800412147 +0100
@@ -2967,9 +2967,22 @@ vect_get_constant_vectors (tree op, slp_
                {
                  tree new_temp = make_ssa_name (TREE_TYPE (vector_type));
                  gimple *init_stmt;
-                 op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type), op);
-                 init_stmt
-                   = gimple_build_assign (new_temp, VIEW_CONVERT_EXPR, op);
+                 if (VECTOR_BOOLEAN_TYPE_P (vector_type))
+                   {
+                     gcc_assert (fold_convertible_p (TREE_TYPE (vector_type),
+                                                     op));
+                     init_stmt = gimple_build_assign (new_temp, NOP_EXPR, op);
+                   }
+                 else if (fold_convertible_p (TREE_TYPE (vector_type), op))
+                   init_stmt = gimple_build_assign (new_temp, NOP_EXPR, op);
+                 else
+                   {
+                     op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type),
+                                  op);
+                     init_stmt
+                       = gimple_build_assign (new_temp, VIEW_CONVERT_EXPR,
+                                              op);
+                   }
                  gimple_seq_add_stmt (&ctor_seq, init_stmt);
                  op = new_temp;
                }
--- gcc/testsuite/gcc.dg/vect/pr69083.c.jj 2016-01-04 13:11:51.958279240 +0100
+++ gcc/testsuite/gcc.dg/vect/pr69083.c 2016-01-04 13:12:36.142663787 +0100
@@ -0,0 +1,20 @@
+/* PR tree-optimization/69083 */
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+int d;
+short f;
+
+void
+foo (int a, int b, int e, short c)
+{
+  for (; e; e++)
+    {
+      int j;
+      for (j = 0; j < 3; j++)
+        {
+          f = 7 >> b ? a : b;
+          d |= c == 1 ^ 1 == f;
+        }
+    }
+}

Jakub
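The difference the patch relies on, a value conversion that works across sizes versus a bit reinterpretation that needs matching sizes, can be sketched in plain C. This is only a loose analogy to NOP_EXPR and VIEW_CONVERT_EXPR; the helper names are hypothetical and no GCC internals are involved.

```c
#include <stdbool.h>
#include <string.h>

/* Value conversion (loosely analogous to a NOP_EXPR): defined for any
   source size, the value itself is converted.  */
static signed char
convert_to_mask_elt (int op)
{
  return (signed char) op;
}

/* Bit reinterpretation (loosely analogous to a VIEW_CONVERT_EXPR):
   only meaningful when source and destination have the same size.
   Returns false, leaving *DST untouched, when the sizes differ.  */
static bool
view_convert (void *dst, size_t dst_size, const void *src, size_t src_size)
{
  if (dst_size != src_size)
    return false;
  memcpy (dst, src, src_size);
  return true;
}
```

An `int` operand can always be value-converted to a one-byte mask element, but it cannot be viewed as one, which is the size mismatch the message describes for boolean vector elements.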
Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint
On Mon, Jan 4, 2016 at 4:11 AM, H.J. Lu wrote:
> On Sat, Jan 2, 2016 at 10:26 AM, H.J. Lu wrote:
>> On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener wrote:
>>> On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak wrote:
>>>> On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu wrote:
>>>>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak wrote:
>>>>>> On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu wrote:
>>>>>>> SSE vector arithmetic and logic instructions only accept aligned
>>>>>>> memory operand. This patch adds vector_memory_operand and "Bm"
>>>>>>> constraint for aligned SSE memory operand. They are applied to
>>>>>>> SSE any_logic patterns.
>>>>>>>
>>>>>>> OK for trunk and release branches if there are regressions?
>>>>>>
>>>>>> This patch is just papering over deeper problem, as Jakub said in
>>>>>> the PR [1]:
>>>>>>
>>>>>> --q--
>>>>>> GCC uses the ix86_legitimate_combined_insn target hook to disallow
>>>>>> misaligned memory into certain SSE instructions.
>>>>>> (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset &)FeatureEntry_21 + 8] ]) 0)
>>>>>> is not misaligned memory, it is a subreg of a pseudo register, so
>>>>>> it is fine.
>>>>>> If the replacement of the pseudo register with memory happens in
>>>>>> some other pass, then it probably either should use the
>>>>>> legitimate_combined_insn target hook or some other one. I think we
>>>>>> have already a PR where that happens during live range shrinking.
>>>>>> --/q--
>>>>>>
>>>>>> Please figure out where memory replacement happens. There are
>>>>>> several other SSE insns (please grep the .md for "ssememalign"
>>>>>> attribute) that are affected by this problem, so fixing a couple
>>>>>> of patterns won't solve the problem completely.
>>>>>
>>>>> LRA turns
>>>>>
>>>>> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>>>>>     (xor:V4SI (reg:V4SI 149)
>>>>>         (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>>>>>     (expr_list:REG_DEAD (reg:V4SI 149)
>>>>>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset &)FeatureEntry_2(D)] ])
>>>>>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>>>>                     (const_int -16 [0xfff0])) [3 MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>>>>                 (nil)
>>>>>
>>>>> into
>>>>>
>>>>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>>>>>     (xor:V4SI (reg:V4SI 21 xmm0 [149])
>>>>>         (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32]))) foo.ii:26 3454 {*xorv4si3}
>>>>>     (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>>>>>             (const_int -16 [0xfff0])) [3 MEM[(unsigned int *)&D.2851]+0 S16 A128])
>>>>>         (nil)))
>>>>>
>>>>> since
>>>>>
>>>>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
>>>>>
>>>>> satisfies the "m" constraint. I don't think LRA should call
>>>>> ix86_legitimate_combined_insn to validate constraints on an
>>>>> instruction.
>>>>
>>>> Hm...
>>>>
>>>> if LRA doesn't assume that generic "m" constraint implies at least
>>>> natural alignment of propagated operand, then your patch is the way
>>>> to go.
>>>
>>> I don't think it even considers alignment. Archs where alignment
>>> validity depends on the actual instruction should model this with
>>> proper constraints.
>>>
>>>> But in this case, *every* SSE vector memory constraint should be
>>>> changed to Bm.
>>>
>>> I'd say so ...
>>
>> The "Bm" constraint should be applied only to non-move SSE
>> instructions with 16-byte memory operand.
>
> Here are 3 patches which implement it. There is one exception
> on SSE *mov_internal. With Bm, LRA will crash, which
> may be an LRA bug. I used m as workaround.
>
> Tested on x86-64 without regressions. OK for trunk?

Looking at the comment in Patch 3, I'd say let's keep *mov_internal constraints unchanged. But it looks to me that we have to finally relax

  if ((TARGET_AVX || TARGET_IAMCU)
      && (misaligned_operand (operands[0], mode)
          || misaligned_operand (operands[1], mode)))

condition to allow unaligned moves for all targets, not only AVX and IAMCU. The rationale for this decision is that if the RA won't be able to satisfy Bm constraint, it can load the value into XMM register. This will be done through SSE *mov internal, so unaligned move has to be generated.

But please, double check the changes. In Patch 2, I have found:

@ -2041,10 +2041,10 @@
 (set_attr "mode" "")])

 (define_insn "*ieee_smax3"
-  [(set (match_operand:VF 0 "register_operand" "=v,v")
+  [(set (match_operand:VF 0 "register_operand" "=x,v")
        (unspec:VF
          [(match_operand:VF 1 "register_operand" "0,v")
           (match_operand:VF 2 "nonimmediat
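The decision discussed in this thread, use memory directly as an operand only when it is known to be 16-byte aligned and otherwise go through a separate load, can be mimicked in portable C. This is a sketch under stated assumptions: `aligned_p` and `xor_v4si` are hypothetical names, and `memcpy` stands in for both the aligned and the unaligned load, so no SSE instruction is actually modelled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True if P is aligned to ALIGN bytes (ALIGN must be a power of two).  */
static bool
aligned_p (const void *p, uintptr_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* XOR the four 32-bit lanes of DST with the 16 bytes at SRC.  The
   return value says whether SRC met the 16-byte alignment that a
   folded memory operand would have required.  */
static bool
xor_v4si (uint32_t dst[4], const void *src)
{
  uint32_t tmp[4];
  bool direct = aligned_p (src, 16);

  /* memcpy is valid for any alignment, so the result is the same
     either way; only the reported classification differs.  */
  memcpy (tmp, src, sizeof tmp);
  for (int i = 0; i < 4; i++)
    dst[i] ^= tmp[i];
  return direct;
}
```

The S16 A32 operand in the RTL dump above is the case where `aligned_p (src, 16)` would be false: 16 bytes wide, but only known to be 4-byte aligned.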
Re: [patch] ARM FreeBSD fix bootstrap
Ping :)

TIA,
Andreas

On 23.12.15 20:28, Andreas Tobler wrote:
> On 23.12.15 11:22, Richard Earnshaw (lists) wrote:
>> On 22/12/15 19:53, Andreas Tobler wrote:
>>> Hi all,
>>>
>>> the commit for PR68617 broke bootstrap on armv6*-*-freebsd*.
>>>
>>> We still have unaligned_access = 0 on armv6 here on FreeBSD.
>>>
>>> The commit from the above PR overrides my SUBTARGET_OVERRIDE_OPTIONS I
>>> called in arm_option_override. And it sets the unaligned_access to 1.
>>>
>>> The attached patch fixes this, bootstrap ongoing but passed the
>>> breaking stage where genmddeps bus errored.
>>>
>>> Is this patch ok for trunk once bootstrap completes?
>>>
>>> TIA,
>>> Andreas
>>>
>>> 2015-12-22 Andreas Tobler
>>>
>>> * config/arm/freebsd.h (SUBTARGET_OVERRIDE_OPTIONS): Adjust to check
>>> unaligned_access on the gcc_options set.
>>> * config/arm/arm.c (arm_option_override): Move
>>> SUBTARGET_OVERRIDE_OPTIONS from here to
>>> (arm_option_override_internal).
>>
>> Moving this hunk to a different place potentially affects VXWORKS (the
>> only other target that uses this hook). I'd like to see confirmation
>> from the VxWorks maintainers (Nathan?) that this doesn't cause any
>> problems for them.
>>
>> If it does, then I think you need to create a new subtarget hook
>> (SUBTARGET_OVERRIDE_INTERNAL_OPTIONS?) and change FreeBSD to use that
>> rather than the existing hook.
>
> I noticed this morning that VxWorks might be affected.
>
> To be on the safe side I'd like to propose the attached version since it
> makes clear where the override belongs to and I don't think hijacking
> SUBTARGET_OVERRIDE_OPTIONS is a good idea here.
>
> I need the override in the arm_option_override_internal function after
> the default has been set.
>
> What do you think?
>
> Thanks,
> Andreas
>
> 2015-12-23 Andreas Tobler
>
> * config/arm/freebsd.h: Rename SUBTARGET_OVERRIDE_OPTIONS to
> SUBTARGET_OVERRIDE_INTERNAL_OPTIONS. Adjust to check unaligned_access
> on the gcc_options set.
> * config/arm/arm.c (arm_option_override_internal): Use
> SUBTARGET_OVERRIDE_INTERNAL_OPTIONS.
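The ordering problem described here, a subtarget override that only sticks if it runs after the generic default has been computed, can be shown with a toy. Everything in this sketch (`struct opts`, `set_defaults`, `subtarget_override`, and the rule that the default enables unaligned access from version 6 up) is a hypothetical stand-in, not the ARM back end's code.

```c
#include <stdbool.h>

/* Hypothetical option set.  */
struct opts
{
  int arch_version;
  bool unaligned_access;
};

/* Hypothetical generic default: enable unaligned access from
   architecture version 6 up.  */
static void
set_defaults (struct opts *o)
{
  o->unaligned_access = o->arch_version >= 6;
}

/* Hypothetical subtarget hook: this OS keeps unaligned access off on
   version 6.  */
static void
subtarget_override (struct opts *o)
{
  if (o->arch_version == 6)
    o->unaligned_access = false;
}

/* The hook runs after the defaults; in the opposite order the default
   would silently overwrite the subtarget's choice.  */
static void
override_internal (struct opts *o)
{
  set_defaults (o);
  subtarget_override (o);
}
```

Calling the two steps in the wrong order reproduces the reported symptom: the subtarget asks for 0 and still ends up with 1.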
Re: [PATCH], PowerPC IEEE 128-bit fp, #11 (enable libgcc conversions)
On Thu, Dec 31, 2015 at 08:29:58PM +, Joseph Myers wrote:
> On Tue, 29 Dec 2015, Michael Meissner wrote:
>
> > +/* __eqkf2 returns 0 if equal, or 1 if not equal or NaN. */
> > +CMPtype
> > +__eqkf2_hw (TFtype a, TFtype b)
> > +{
> > +  return (__builtin_isunordered (a, b) || (a != b)) ? 1 : 0;
>
> This is more complicated than necessary. "return a != b;" will suffice.

Ok. I will change this.

> > +/* __gekf2 returns -1 if a < b, 0 if a == b, +1 if a > b, or -2 if NaN. */
> > +CMPtype
> > +__gekf2_hw (TFtype a, TFtype b)
> > +{
> > +  if (__builtin_isunordered (a, b))
> > +    return -2;
> > +
> > +  else if (a < b)
> > +    return -1;
>
> The __builtin_isunordered check should come after the < check, so that the
> "invalid" exception gets raised for quiet NaN arguments.
>
> > +/* __lekf2 returns -1 if a < b, 0 if a == b, +1 if a > b, or +2 if NaN. */
> > +CMPtype
> > +__lekf2_hw (TFtype a, TFtype b)
> > +{
> > +  if (__builtin_isunordered (a, b))
> > +    return 2;
> > +
> > +  else if (a < b)
> > +    return -1;
>
> Likewise.

Ok. I will change these.

> > +  char *p = (char *) getauxval (AT_PLATFORM);
>
> glibc deliberately exports __getauxval at a public symbol version, so you
> can do this in a namespace-clean way.

Ok. I will change this. The getauxval call by the way is only a temporary measure until the support for __builtin_cpu_supports is added to the PowerPC.

> > +CMPtype __eqkf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__eqkf2_resolve")));
> > +
> > +CMPtype __gekf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__gekf2_resolve")));
> > +
> > +CMPtype __lekf2 (TFtype, TFtype)
> > +  __attribute__ ((__ifunc__ ("__lekf2_resolve")));
>
> Don't you need to arrange __nekf2, __gtkf2, __ltkf2 aliases to these
> resolvers (the semantics mean they don't need to be separate functions,
> but the entry points need to be there given the optabs the back end sets
> up)?

Because of default conversions we cannot allow the normal optab mechanism to be used for IEEE 128-bit floating point emulation. This is due to the fact that if you have a __float128 comparison, the compiler will see if a larger type can do the comparison, and in this case, the larger type is TFmode (i.e. IBM extended double using the current defaults).

Instead rs6000_generate_compare generates the calls, and it does not use the alternate names. I can easily put in the resolver calls as well for the alternate names just in case somebody hand crafts a call to __nekf2.

> > +#ifdef _ARCH_PPC64
> > +TItype_ppc __fixkfti (TFtype)
> > +  __attribute__ ((__ifunc__ ("__fixkfti_resolve")));
> > +
> > +UTItype_ppc __fixunskfti (TFtype)
> > +  __attribute__ ((__ifunc__ ("__fixunskfti_resolve")));
> > +
> > +TFtype __floattikf (TItype_ppc)
> > +  __attribute__ ((__ifunc__ ("__floattikf_resolve")));
> > +
> > +TFtype __floatuntikf (UTItype_ppc)
> > +  __attribute__ ((__ifunc__ ("__floatuntikf_resolve")));
> > +#endif
>
> I don't see the point of using ifuncs that just always return the software
> version. You might as well just give the software version the appropriate
> function name directly, and add ifuncs later if adding a version using
> hardware arithmetic (e.g. doing something like the libgcc2.c functions
> with hardware conversions to/from DImode).

I'll think about it. At some point, I was hoping to have implementations for ISA 3.0. However, there is not an ISA 3.0 instruction that converts from 128-bit integer to 128-bit floating point or vice versa.

> > +#define ISA_BIT(x) (1 << (63 - x))
>
> As far as I can see, my previous comment still applies: this part of the
> sfp-machine.h changes needs to be under some appropriate conditional so
> that it only applies when building the KFmode functions, not for 32-bit
> soft-float / e500 libgcc builds.

Agreed. I will fix this.

--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
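The return-value conventions under review can be sketched with `double` standing in for the 128-bit type. The function names below are hypothetical, and this is not the libgcc code; it only shows the suggested shape, with the ordered comparisons first and the unordered result as the fall-through.

```c
/* "ge"-style convention: -1, 0, +1 for ordered operands, -2 when
   either operand is a NaN.  The ordered comparisons come first.  */
static int
cmp_ge_style (double a, double b)
{
  if (a < b)
    return -1;
  if (a > b)
    return 1;
  if (a == b)
    return 0;
  return -2;
}

/* "le"-style convention: identical except that NaN yields +2.  */
static int
cmp_le_style (double a, double b)
{
  if (a < b)
    return -1;
  if (a > b)
    return 1;
  if (a == b)
    return 0;
  return 2;
}

/* "eq"-style convention: 0 if equal, 1 if not equal or unordered.
   A plain "a != b" already gives 1 for NaN operands.  */
static int
cmp_eq_style (double a, double b)
{
  return a != b;
}
```

The opposite NaN results make sense for the callers: a "greater or equal" test checks for a result >= 0 and a "less or equal" test for a result <= 0, so both come out false for unordered operands.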
[PATCH] document -Winvalid-memory-model
As discussed in c/69104, the -Winvalid-memory-model option is not documented in the manual. The attached patch rectifies that.

Martin

gcc/ChangeLog:

2016-01-04  Martin Sebor

        * doc/invoke.texi (Warning Options): Document
        -Winvalid-memory-model.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 232047)
+++ doc/invoke.texi (working copy)
@@ -263,7 +263,8 @@
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
--Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol
+-Wmain -Wmaybe-uninitialized -Winvalid-memory-model @gol
+-Wmemset-transposed-args @gol
 -Wmisleading-indentation -Wmissing-braces @gol
 -Wmissing-field-initializers -Wmissing-include-dirs @gol
 -Wno-multichar -Wnonnull -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
@@ -4305,6 +4306,26 @@
 computations may be deleted by data flow analysis before the warnings
 are printed.
 
+@item -Winvalid-memory-model
+@opindex Winvalid-memory-model
+@opindex Wno-invalid-memory-model
+Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
+and the C11 atomic generic functions with a memory consistency argument
+that is either invalid for the operation or outside the range of values
+of the @code{memory_order} enumeration. For example, since the
+@code{__atomic_store} and @code{__atomic_store_n} built-ins are only
+defined for the relaxed, relase, and sequentially consistent memory
+orders the following code is diagnosed:
+
+@smallexample
+void store (int *i)
+@{
+  __atomic_store_n (i, 0, memory_order_consume);
+@}
+@end smallexample
+
+@option{-Winvalid-memory-model} is enabled by default.
+
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized
[PATCH] libiberty: support demangling of rvalue reference typenames
This patch adds handling of 'O' (rvalue ref) type codes in the C++ demangling code which is done similarly to the 'R' (regular references) case. It also adds a few testcases for various demangling styles which are just mirrored versions of the corresponding regular references demangling tests.

libiberty/ChangeLog:

2016-01-04  Artemiy Volkov

        * cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference constant.
        (demangle_template_value_parm): Handle tk_rvalue_reference type kind.
        (do_type): Support 'O' type id (rvalue references).

        * testsuite/demangle-expected: Add tests.

---
 libiberty/cplus-dem.c | 13 +++-
 libiberty/testsuite/demangle-expected | 115 ++
 2 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
index c68b981..122f05c 100644
--- a/libiberty/cplus-dem.c
+++ b/libiberty/cplus-dem.c
@@ -237,6 +237,7 @@ typedef enum type_kind_t
   tk_none,
   tk_pointer,
   tk_reference,
+  tk_rvalue_reference,
   tk_integral,
   tk_bool,
   tk_char,
@@ -2033,7 +2034,8 @@ demangle_template_value_parm (struct work_stuff *work, const char **mangled,
     }
   else if (tk == tk_real)
     success = demangle_real_value (work, mangled, s);
-  else if (tk == tk_pointer || tk == tk_reference)
+  else if (tk == tk_pointer || tk == tk_reference
+           || tk == tk_rvalue_reference)
     {
       if (**mangled == 'Q')
         success = demangle_qualified (work, mangled, s,
@@ -3574,6 +3576,14 @@ do_type (struct work_stuff *work, const char **mangled, string *result)
             tk = tk_reference;
           break;
 
+          /* An rvalue reference type */
+        case 'O':
+          (*mangled)++;
+          string_prepend (&decl, "&&");
+          if (tk == tk_none)
+            tk = tk_rvalue_reference;
+          break;
+
           /* An array */
         case 'A':
           {
@@ -3631,7 +3641,6 @@ do_type (struct work_stuff *work, const char **mangled, string *result)
           break;
 
         case 'M':
-        case 'O':
           {
             type_quals = TYPE_UNQUALIFIED;
 
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index aebf01b..f947de7 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -31,6 +31,11 @@ ArrowLine::ArrowheadIntersects(Arrowhead *, BoxObj &, Graphic *)
 ArrowLine::ArrowheadIntersects
 #
 --format=gnu --no-params
+ArrowheadIntersects__9ArrowLineP9ArrowheadO6BoxObjP7Graphic
+ArrowLine::ArrowheadIntersects(Arrowhead *, BoxObj &&, Graphic *)
+ArrowLine::ArrowheadIntersects
+#
+--format=gnu --no-params
 AtEnd__13ivRubberGroup
 ivRubberGroup::AtEnd(void)
 ivRubberGroup::AtEnd
@@ -51,6 +56,11 @@ TextCode::CoreConstDecls(ostream &)
 TextCode::CoreConstDecls
 #
 --format=gnu --no-params
+CoreConstDecls__8TextCodeO7ostream
+TextCode::CoreConstDecls(ostream &&)
+TextCode::CoreConstDecls
+#
+--format=gnu --no-params
 Detach__8StateVarP12StateVarView
 StateVar::Detach(StateVarView *)
 StateVar::Detach
@@ -66,21 +76,41 @@ RelateManip::Effect(ivEvent &)
 RelateManip::Effect
 #
 --format=gnu --no-params
+Effect__11RelateManipO7ivEvent
+RelateManip::Effect(ivEvent &&)
+RelateManip::Effect
+#
+--format=gnu --no-params
 FindFixed__FRP4CNetP4CNet
 FindFixed(CNet *&, CNet *)
 FindFixed
 #
 --format=gnu --no-params
+FindFixed__FOP4CNetP4CNet
+FindFixed(CNet *&&, CNet *)
+FindFixed
+#
+--format=gnu --no-params
 Fix48_abort__FR8twolongs
 Fix48_abort(twolongs &)
 Fix48_abort
 #
 --format=gnu --no-params
+Fix48_abort__FO8twolongs
+Fix48_abort(twolongs &&)
+Fix48_abort
+#
+--format=gnu --no-params
 GetBarInfo__15iv2_6_VScrollerP13ivPerspectiveRiT2
 iv2_6_VScroller::GetBarInfo(ivPerspective *, int &, int &)
 iv2_6_VScroller::GetBarInfo
 #
 --format=gnu --no-params
+GetBarInfo__15iv2_6_VScrollerP13ivPerspectiveOiT2
+iv2_6_VScroller::GetBarInfo(ivPerspective *, int &&, int &&)
+iv2_6_VScroller::GetBarInfo
+#
+--format=gnu --no-params
 GetBgColor__C9ivPainter
 ivPainter::GetBgColor(void) const
 ivPainter::GetBgColor
@@ -986,11 +1016,21 @@ List::Pix::Pix(List::Pix const &)
 List::Pix::Pix
 #
 --format=gnu --no-params
+__Q2t4List1Z10VHDLEntity3PixOCQ2t4List1Z10VHDLEntity3Pix
+List::Pix::Pix(List::Pix const &&)
+List::Pix::Pix
+#
+--format=gnu --no-params
 __Q2t4List1Z10VHDLEntity7elementRC10VHDLEntityPT0
 List::element::element(VHDLEntity const &, List::element *)
 List::element::element
 #
 --format=gnu --no-params
+__Q2t4List1Z10VHDLEntity7elementOC10VHDLEntityPT0
+List::element::element(VHDLEntity const &&, List::element *)
+List::element::element
+#
+--format=gnu --no-params
 __Q2t4List1Z10VHDLEntity7elementRCQ2t4List1Z10VHDLEntity7element
 List::element::element(List::element const &)
 List::element::element
@@ -1036,6 +1076,11 @@ PixX >::PixX(PixX >::PixX
 #
 --format=gnu --no-params
+__t4PixX3Z11VHDLLibraryZ14VHDLLibraryRepZt4List1Z10VHDLEntityOCt4PixX3Z11VHDLLibrary
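How the 'O' case slots in next to 'P' and 'R' can be seen in a toy decoder. This is a sketch, not libiberty's do_type: `toy_demangle_type` is a hypothetical function that only understands the prefix codes 'P', 'R' and 'O' followed by a single base code ('i' for int, 'c' for char).

```c
#include <stddef.h>
#include <string.h>

/* Toy decoder: consumes leading 'P', 'R' and 'O' codes from MANGLED,
   then one base code, and writes e.g. "int *&&" to OUT.  Returns 0 on
   success, -1 on malformed input or insufficient space.  */
static int
toy_demangle_type (const char *mangled, char *out, size_t out_size)
{
  char decl[32] = "";
  size_t n = 0;

  for (;; mangled++)
    {
      const char *tok;
      if (*mangled == 'P')
        tok = "*";
      else if (*mangled == 'R')
        tok = "&";
      else if (*mangled == 'O')
        tok = "&&";
      else
        break;

      size_t len = strlen (tok);
      if (n + len >= sizeof decl)
        return -1;
      /* Prepend, so that the outermost code ends up rightmost in the
         printed declarator.  */
      memmove (decl + len, decl, n + 1);
      memcpy (decl, tok, len);
      n += len;
    }

  const char *base = *mangled == 'i' ? "int"
                     : *mangled == 'c' ? "char" : NULL;
  if (base == NULL || mangled[1] != '\0')
    return -1;
  if (strlen (base) + 1 + n + 1 > out_size)
    return -1;

  strcpy (out, base);
  if (n > 0)
    {
      strcat (out, " ");
      strcat (out, decl);
    }
  return 0;
}
```

The prepend step mirrors the `string_prepend (&decl, "&&")` call in the patch: in `OPc` the 'O' is seen first but printed last, giving `char *&&`, which matches the shape of the `CNet *&&` expectation in the new tests.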
Re: [PATCH] document -Winvalid-memory-model
On 01/04/2016 03:17 PM, Martin Sebor wrote:
> As discussed in c/69104, the -Winvalid-memory-model option is not
> documented in the manual. The attached patch rectifies that.

Thanks for tackling this.

> Index: doc/invoke.texi
> ===
> --- doc/invoke.texi (revision 232047)
> +++ doc/invoke.texi (working copy)
> @@ -263,7 +263,8 @@
>  -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
>  -Winvalid-pch -Wlarger-than=@var{len} @gol
>  -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
> --Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol
> +-Wmain -Wmaybe-uninitialized -Winvalid-memory-model @gol
> +-Wmemset-transposed-args @gol
>  -Wmisleading-indentation -Wmissing-braces @gol
>  -Wmissing-field-initializers -Wmissing-include-dirs @gol
>  -Wno-multichar -Wnonnull -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol

We just had a patch a month or so ago (r231022) to sort this table into something approaching alphabetical order, modulo no- prefixes, I guess. Can you please insert the new entry into a less random place?

> @@ -4305,6 +4306,26 @@
>  computations may be deleted by data flow analysis before the warnings
>  are printed.
>
> +@item -Winvalid-memory-model
> +@opindex Winvalid-memory-model
> +@opindex Wno-invalid-memory-model
> +Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
> +and the C11 atomic generic functions with a memory consistency argument
> +that is either invalid for the operation or outside the range of values
> +of the @code{memory_order} enumeration. For example, since the
> +@code{__atomic_store} and @code{__atomic_store_n} built-ins are only

s/built-ins/builtins/ (like in the @refs you used previously)

> +defined for the relaxed, relase, and sequentially consistent memory

s/relase/release/

> +orders the following code is diagnosed:
> +
> +@smallexample
> +void store (int *i)
> +@{
> +  __atomic_store_n (i, 0, memory_order_consume);
> +@}
> +@end smallexample
> +
> +@option{-Winvalid-memory-model} is enabled by default.
> +
>  @item -Wmaybe-uninitialized
>  @opindex Wmaybe-uninitialized
>  @opindex Wno-maybe-uninitialized

OK with those changes.

-Sandra
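For contrast with the diagnosed example in the proposed documentation, here is a minimal sketch of a store and a load that use memory orders valid for each operation. It assumes a compiler that provides the `__atomic` builtins (GCC, or Clang in GNU-compatible mode); the wrapper names are hypothetical.

```c
/* A store may use the relaxed, release or seq_cst orders; release is
   used here.  */
static void
store_release (int *p, int v)
{
  __atomic_store_n (p, v, __ATOMIC_RELEASE);
}

/* Consume and acquire belong on the load side.  */
static int
load_acquire (int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}
```

Passing a consume order to the store instead, as in the documentation example, is the kind of call the option is described as diagnosing.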
Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint
On Mon, Jan 4, 2016 at 1:11 PM, H.J. Lu wrote: > On Mon, Jan 4, 2016 at 12:19 PM, Uros Bizjak wrote: >> On Mon, Jan 4, 2016 at 4:11 AM, H.J. Lu wrote: >>> On Sat, Jan 2, 2016 at 10:26 AM, H.J. Lu wrote: On Sat, Jan 2, 2016 at 3:58 AM, Richard Biener wrote: > On January 2, 2016 11:32:33 AM GMT+01:00, Uros Bizjak > wrote: >>On Thu, Dec 31, 2015 at 4:29 PM, H.J. Lu wrote: >>> On Thu, Dec 31, 2015 at 1:14 AM, Uros Bizjak >>wrote: On Wed, Dec 30, 2015 at 9:53 PM, H.J. Lu >>wrote: > SSE vector arithmetic and logic instructions only accept aligned >>memory > operand. This patch adds vector_memory_operand and "Bm" constraint >>for > aligned SSE memory operand. They are applied to SSE any_logic >>patterns. > > OK for trunk and release branches if there are regressions? This patch is just papering over deeper problem, as Jakub said in >>the PR [1]: --q-- GCC uses the ix86_legitimate_combined_insn target hook to disallow misaligned memory into certain SSE instructions. (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset >>&)FeatureEntry_21 + 8] ]) 0) is not misaligned memory, it is a subreg of a pseudo register, so it >>is fine. If the replacement of the pseudo register with memory happens in >>some other pass, then it probably either should use the legitimate_combined_insn target hook or some other one. I think we have already a PR where that happens during live range shrinking. --/q-- Please figure out where memory replacement happens. There are >>several other SSE insns (please grep the .md for "ssememalign" attribute) >>that are affected by this problem, so fixing a couple of patterns won't solve the problem completely. 
>>> >>> LRA turns >>> >>> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ]) >>> (xor:V4SI (reg:V4SI 149) >>> (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset >>> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3} >>> (expr_list:REG_DEAD (reg:V4SI 149) >>> (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset >>> &)FeatureEntry_2(D)] ]) >>> (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 >>frame) >>> (const_int -16 [0xfff0])) [3 >>> MEM[(unsigned int *)&D.2851]+0 S16 A128]) >>> (nil) >>> >>> into >>> >>> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] >>[148]) >>> (xor:V4SI (reg:V4SI 21 xmm0 [149]) >>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] >>[117]) >>> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32]))) >>> foo.ii:26 3454 {*xorv4si3} >>> (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame) >>> (const_int -16 [0xfff0])) [3 >>MEM[(unsigned >>> int *)&D.2851]+0 S16 A128]) >>> (nil))) >>> >>> since >>> >>> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6 >>> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32]))) >>> >>> satisfies the 'm" constraint. I don't think LRA should call >>> ix86_legitimate_combined_insn to validate to validate constraints on >>> an instruction. >> >>Hm... >> >>if LRA desn't assume that generic "m" constraint implies at least >>natural alignment of propageted operand, then your patch is the way to >>go. > > I don't think it even considers alignment. Archs where alignment validity > depends on the actual instruction should model this with proper > constraints. > > But in this case, *every* SSE vector memory constraint should be >>changed to Bm. > > I'd say so ... The "Bm" constraint should be applied only to non-move SSE instructions with 16-byte memory operand. >>> >>> Here are 3 patch which implement it. There is one exception >>> on SSE *mov_internal. With Bm, LRA will crash, which >>> may be an LRA bug. I used m as workaround. 
>>> >>> Tested on x86-64 without regressions. OK for trunk? >> >> Looking at the comment in Patch 3, I'd say let's keep >> *mov_internal constraints unchanged. But it looks to me that we >> have to finally relax >> >> if ((TARGET_AVX || TARGET_IAMCU) >> && (misaligned_operand (operands[0], mode) >> || misaligned_operand (operands[1], mode))) >> >> condition to allow unaligned moves for all targets, not only AVX and >> IAMCU. The rationale for this decision is that if the RA won't be able >> to satisfy Bm constraint, it can load the value into XMM register. >> This will be done through SSE *mov internal, so unaligned move >> has to be generated. >> >>
Re: [PATCH] document -Winvalid-memory-model
> We just had a patch a month or so ago (r231022) to sort this table into
> something approaching alphabetical order, modulo no- prefixes, I guess.
> Can you please insert the new entry into a less random place?

Sure. It was meant to be inserted in the right place, my brain just
filtered out the "invalid" part in the name of the option. Fixed in the
updated patch.

>> @@ -4305,6 +4306,26 @@
>>  computations may be deleted by data flow analysis before the warnings
>>  are printed.
>>
>> +@item -Winvalid-memory-model
>> +@opindex Winvalid-memory-model
>> +@opindex Wno-invalid-memory-model
>> +Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins},
>> +and the C11 atomic generic functions with a memory consistency argument
>> +that is either invalid for the operation or outside the range of values
>> +of the @code{memory_order} enumeration. For example, since the
>> +@code{__atomic_store} and @code{__atomic_store_n} built-ins are only
>
> s/built-ins/builtins/ (like in the @refs you used previously)

I thought the @refs were an inconsistency and built-in was the preferred
spelling. That's what someone else pointed out to me some time ago and
what I see documented in the GCC Coding Conventions (and what I also
noticed used elsewhere in this section of the manual).

But looking more closely, there are quite a few uses of both builtins
and built-ins, on this manual page as well as on others. Which makes me
wonder which of the two is prevalent. I count 68 occurrences of the
words builtin and builtins in the manual (separated by space and
ignoring capitalization) and 481 occurrences of the words built-in and
built-ins. I also count 50 occurrences of built-in in the gcc.pot file
and 33 occurrences of builtin. This seems to confirm my understanding of
the recommended convention (though it also shows how inconsistently it
is being followed). Please let me know if I missed something.

>> +defined for the relaxed, relase, and sequentially consistent memory
>
> s/relase/release/

Fixed, thanks.
I will go ahead and commit this version of the patch tomorrow unless you have objections. Martin gcc/ChangeLog: 2016-01-04 Martin Sebor * doc/invoke.texi (Warning Options): Document -Winvalid-memory-model. Index: doc/invoke.texi === --- doc/invoke.texi (revision 232047) +++ doc/invoke.texi (working copy) @@ -260,7 +260,7 @@ -Wignored-qualifiers -Wincompatible-pointer-types @gol -Wimplicit -Wimplicit-function-declaration -Wimplicit-int @gol -Winit-self -Winline -Wno-int-conversion @gol --Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol +-Wno-int-to-pointer-cast -Winvalid-memory-model -Wno-invalid-offsetof @gol -Winvalid-pch -Wlarger-than=@var{len} @gol -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args @gol @@ -4305,6 +4305,26 @@ computations may be deleted by data flow analysis before the warnings are printed. +@item -Winvalid-memory-model +@opindex Winvalid-memory-model +@opindex Wno-invalid-memory-model +Warn for invocations of @ref{__atomic Builtins}, @ref{__sync Builtins}, +and the C11 atomic generic functions with a memory consistency argument +that is either invalid for the operation or outside the range of values +of the @code{memory_order} enumeration. For example, since the +@code{__atomic_store} and @code{__atomic_store_n} built-ins are only +defined for the relaxed, release, and sequentially consistent memory +orders the following code is diagnosed: + +@smallexample +void store (int *i) +@{ + __atomic_store_n (i, 0, memory_order_consume); +@} +@end smallexample + +@option{-Winvalid-memory-model} is enabled by default. + @item -Wmaybe-uninitialized @opindex Wmaybe-uninitialized @opindex Wno-maybe-uninitialized
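
As a rough illustration of the rule the new documentation describes (a sketch for readers of this thread, not part of the patch): stores accept only the relaxed, release, and sequentially consistent orders, so a release store paired with an acquire load is a combination the warning should leave alone. The helper names `store_release` and `load_acquire` are made up for this example; `__atomic_store_n`, `__atomic_load_n`, and the `__ATOMIC_*` macros are the GCC built-ins.

```c
/* A release store: one of the three memory orders that are valid for
   __atomic_store_n (relaxed, release, sequentially consistent).  */
void
store_release (int *p, int v)
{
  __atomic_store_n (p, v, __ATOMIC_RELEASE);
}

/* The matching reader side: an acquire load.  Acquire is valid for
   loads but would be diagnosed if passed to a store.  */
int
load_acquire (int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}
```

Passing `__ATOMIC_ACQUIRE` or `__ATOMIC_CONSUME` to `store_release`'s built-in instead would be the kind of call the option is meant to flag.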
Re: [PATCH] document -Winvalid-memory-model
On 01/04/2016 05:15 PM, Martin Sebor wrote:
>> s/built-ins/builtins/ (like in the @refs you used previously)
>
> I thought the @refs were an inconsistency and built-in was the preferred
> spelling. That's what someone else pointed out to me sometime ago and
> what I see documented in the GCC Coding Conventions (and what I also
> noticed used elsewhere in this section of the manual).
>
> But looking more closely, there are quite a few uses of both builtins
> and built-ins, on this manual page as well as on others. Which makes me
> wonder which of the two is prevalent. I count 68 occurrences of the
> words builtin and builtins in the manual (separated by space and
> ignoring capitalization) and 481 occurrences of the words built-in and
> built-ins. I also count 50 occurrences of built-in in the gcc.pot file
> and 33 occurrences of builtin. This seems to confirm my understanding of
> the recommended convention (though it also shows how inconsistently it
> is being followed). Please let me know if I missed something.

Sorry, my bad. "Built-in", hyphenated, is correct as an adjective, as in
"built-in function". It's not clear what we're supposed to use as a
noun, but it seems "builtin" isn't it, either. :-S I think that a couple
years ago I changed a bunch of other instances to "built-in function" to
avoid this trouble, but I won't insist on that here.

> I will go ahead and commit this version of the patch tomorrow unless
> you have objections.

Looks OK to me now.

-Sandra
Re: [PATCH] c/68966 - atomic_fetch_* on atomic_bool not diagnosed
On 01/04/2016 08:22 AM, Marek Polacek wrote:
> Hi Martin,
> ...

Thanks for the careful review! I've fixed the problems you pointed out
in the attached patch. The typos are my bad.

As for the whitespace, I have to confess I'm finding all the rules
tedious to follow without some sort of automation. Jason suggested some
option to git but I don't use git to commit (too many other problems).
I'm also not sure the option makes Git replace 8 spaces with TABs. I
tried to have my Emacs automatically strip trailing whitespace for me
but that was causing spurious changes on otherwise untouched lines that
already contain it (clearly, I'm not the only one who struggles with
whitespace). I don't suppose everyone is voluntarily subjecting
themselves to this torture so there must be a way to make it less
onerous and painful. What's your secret?

Martin

gcc/ChangeLog:
2016-01-04 Martin Sebor

PR c/68966
* doc/extend.texi (__atomic Builtins, __sync Builtins): Document
constraint on the type of arguments.

gcc/c-family/ChangeLog:
2016-01-04 Martin Sebor

PR c/68966
* c-common.c (sync_resolve_size): Reject first argument when it's
a pointer to _Bool.

gcc/testsuite/ChangeLog:
2016-01-04 Martin Sebor

PR c/68966
* gcc.dg/atomic-fetch-bool.c: New test.
* gcc.dg/sync-fetch-bool.c: Same.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 232047)
+++ gcc/doc/extend.texi (working copy)
@@ -9238,6 +9238,9 @@
 @{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; @} // nand
 @end smallexample
 
+The object pointed to by the first argument must be of integer or pointer
+type. It must not be a Boolean type.
+
 @emph{Note:} GCC 4.4 and later implement @code{__sync_fetch_and_nand}
 as @code{*ptr = ~(tmp & value)} instead of @code{*ptr = ~tmp & value}.
 
@@ -9261,6 +9264,9 @@
 @{ *ptr = ~(*ptr & value); return *ptr; @} // nand
 @end smallexample
 
+The same constraints on arguments apply as for the corresponding
+@code{__sync_op_and_fetch} built-in functions.
+ @emph{Note:} GCC 4.4 and later implement @code{__sync_nand_and_fetch} as @code{*ptr = ~(*ptr & value)} instead of @code{*ptr = ~*ptr & value}. @@ -9507,13 +9513,14 @@ @deftypefnx {Built-in Function} @var{type} __atomic_or_fetch (@var{type} *ptr, @var{type} val, int memorder) @deftypefnx {Built-in Function} @var{type} __atomic_nand_fetch (@var{type} *ptr, @var{type} val, int memorder) These built-in functions perform the operation suggested by the name, and -return the result of the operation. That is, +return the result of the operation. That is, @smallexample @{ *ptr @var{op}= val; return *ptr; @} @end smallexample -All memory orders are valid. +The object pointed to by the first argument must be of integer or pointer +type. It must not be a Boolean type. All memory orders are valid. @end deftypefn @@ -9530,7 +9537,8 @@ @{ tmp = *ptr; *ptr @var{op}= val; return tmp; @} @end smallexample -All memory orders are valid. +The same constraints on arguments apply as for the corresponding +@code{__atomic_op_fetch} built-in functions. All memory orders are valid. @end deftypefn Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c (revision 232047) +++ gcc/c-family/c-common.c (working copy) @@ -7804,7 +7804,7 @@ else if (TYPE_P (*node)) type = node, is_type = 1; - if ((i = check_user_alignment (align_expr, false)) == -1 + if ((i = check_user_alignment (align_expr, true)) == -1 || !check_cxx_fundamental_alignment_constraints (*node, i, flags)) *no_add_attrs = true; else if (is_type) @@ -10657,11 +10657,16 @@ /* A helper function for resolve_overloaded_builtin in resolving the overloaded __sync_ builtins. Returns a positive power of 2 if the first operand of PARAMS is a pointer to a supported data type. - Returns 0 if an error is encountered. */ + Returns 0 if an error is encountered. + FETCH is true when FUNCTION is one of the _FETCH_OP_ or _OP_FETCH_ + built-ins. 
*/ static int -sync_resolve_size (tree function, vec *params) +sync_resolve_size (tree function, vec *params, bool fetch) { + /* Type of the argument. */ + tree argtype; + /* Type the argument points to. */ tree type; int size; @@ -10671,7 +10676,7 @@ return 0; } - type = TREE_TYPE ((*params)[0]); + argtype = type = TREE_TYPE ((*params)[0]); if (TREE_CODE (type) == ARRAY_TYPE) { /* Force array-to-pointer decay for C++. */ @@ -10686,12 +10691,16 @@ if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type)) goto incompatible; + if (fetch && TREE_CODE (type) == BOOLEAN_TYPE) +goto incompatible; + size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); if (size == 1 || size == 2 || size == 4 || size == 8 || size == 16) return size; incompatible: - error ("incompatible type for argument %d of %qE", 1, function); + error (
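
To make the documented semantics concrete, here is a minimal sketch (not taken from the patch) of the two families the documentation distinguishes: `__atomic_fetch_add` returns the value held before the operation, `__atomic_add_fetch` returns the value after it. The last helper shows one possible way to keep flag bits in an `unsigned char`, an integer type that satisfies the new constraint, rather than in a `_Bool`. All three function names are hypothetical.

```c
/* Fetch-then-op: returns the value *counter held before the addition.  */
int
fetch_then_add (int *counter, int n)
{
  return __atomic_fetch_add (counter, n, __ATOMIC_SEQ_CST);
}

/* Op-then-fetch: returns the value *counter holds after the addition.  */
int
add_then_fetch (int *counter, int n)
{
  return __atomic_add_fetch (counter, n, __ATOMIC_SEQ_CST);
}

/* Sets BITS in *FLAGS and returns the previous contents.  The object is
   an unsigned char (an integer type), not a _Bool, which the patch
   makes the fetch built-ins reject.  */
unsigned char
set_flag_bits (unsigned char *flags, unsigned char bits)
{
  return __atomic_fetch_or (flags, bits, __ATOMIC_SEQ_CST);
}
```

Whether a byte of flags or some other representation is the right replacement for an atomic `_Bool` depends on the caller; the sketch only shows a form that stays within the documented constraint.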
Re: [PATCH] c++/58109 - alignas() fails to compile with constant expression
Ping: looking for review/approval of the patch below:

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02074.html

Thanks
Martin

On 12/22/2015 07:32 PM, Martin Sebor wrote:
> The attached patch adds handling of dependent arguments to attribute
> aligned and attribute vector_size, fixing c++/58109 and 69022 -
> attribute vector_size ignored with dependent bytes.
>
> Tested on x86_64.
>
> Martin
[PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
Hi,

The bswap optimization pass generates wrong code on big endian targets when the result of a bit operation it analyzed is a partial load of the range of memory accessed by the original expression (when one or more bytes at the lowest address were lost in the computation). This is due to the way cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being compared to the result of the symbolic expression. Part of the adjustment is endian independent: it ignores the bytes that were not accessed by the original gimple expression. However, when the result has fewer bytes than that original expression, some more bytes need to be ignored, and this part is endian dependent.

The current code only supports loss of bytes at the highest addresses because there is no code to adjust the address of the load. However, for little and big endian targets the bytes at the highest address translate into different byte significance in the result. This patch first separates the cmpxchg and cmpnop adjustment into 2 steps and then deals with endianness correctly for the second step.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2015-12-16 Thomas Preud'homme

PR tree-optimization/67781
* tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg
and cmpnop in two steps: first the ones not accessed in original
gimple expression in an endian-independent way and then the ones not
accessed in the final result in an endian-specific way.

*** gcc/testsuite/ChangeLog ***

2015-12-16 Thomas Preud'homme

PR tree-optimization/67781
* gcc.c-torture/execute/pr67781.c: New file.
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
new file mode 100644
index 000..bf50aa2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c
@@ -0,0 +1,34 @@
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#ifdef __UINT8_TYPE__
+typedef __UINT8_TYPE__ uint8_t;
+#else
+typedef unsigned char uint8_t;
+#endif
+
+struct
+{
+  uint32_t a;
+  uint8_t b;
+} s = { 0x123456, 0x78 };
+
+int pr67781()
+{
+  uint32_t c = (s.a << 8) | s.b;
+  return c;
+}
+
+int
+main ()
+{
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+    return 0;
+
+  if (pr67781 () != 0x12345678)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index b00f046..e5a185f 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit)
 static gimple *
 find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
 {
+  unsigned rsize;
+  uint64_t tmpn, mask;
 /* The number which the find_bswap_or_nop_1 result should match in order
    to have a full byte swap. The number is shifted to the right
    according to the size of the symbolic number before using it. */
@@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap)
 
   /* Find real size of result (highest non-zero byte). */
   if (n->base_addr)
-    {
-      int rsize;
-      uint64_t tmpn;
-
-      for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
-      n->range = rsize;
-    }
+    for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
+  else
+    rsize = n->range;
 
-  /* Zero out the extra bits of N and CMP*. */
+  /* Zero out the bits corresponding to untouched bytes in original gimple
+     expression. */
   if (n->range < (int) sizeof (int64_t))
     {
-      uint64_t mask;
-
       mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1;
       cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER;
       cmpnop &= mask;
     }
 
+  /* Zero out the bits corresponding to unused bytes in the result of the
+     gimple expression. */
+  if (rsize < n->range)
+    {
+      if (BYTES_BIG_ENDIAN)
+        {
+          mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
+          cmpxchg &= mask;
+          cmpnop >>= (n->range - rsize) * BITS_PER_MARKER;
+        }
+      else
+        {
+          mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1;
+          cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER;
+          cmpnop &= mask;
+        }
+      n->range = rsize;
+    }
+
   /* A complete byte swap should make the symbolic number to start with
      the largest digit in the highest order byte. Unchanged symbolic number
      indicates a read with same endianness as target architecture. */

Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC and on an arm-none-eabi GCC cross-compiler without any regression. I'm waiting for a slot on gcc110 to do a big endian bootstrap but at least the testcase works on mips-linux. I'll send an update once bootstrap is complete.

Is this ok for trunk and 5 branch in a week time if no regression is reported?

Best r
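
The two-step adjustment described in this message can be tried out in isolation with a small standalone sketch. It is a simplification for illustration, not the GCC code: `struct cmp_pair`, `adjust_cmp`, and the `big_endian` parameter (standing in for `BYTES_BIG_ENDIAN`) are assumptions of this example, and the `CMPNOP`/`CMPXCHG` constants are written out as the byte-marker patterns 1..8 and 8..1.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_MARKER 8
/* Byte markers 1..8 with 1 in the lowest byte: a plain native read.  */
#define CMPNOP  UINT64_C (0x0807060504030201)
/* Byte markers with 1 in the highest byte: a full byte swap.  */
#define CMPXCHG UINT64_C (0x0102030405060708)

struct cmp_pair
{
  uint64_t cmpxchg;
  uint64_t cmpnop;
};

/* RANGE is the number of bytes accessed by the original expression,
   RSIZE the number of bytes that survive in the result.  */
struct cmp_pair
adjust_cmp (unsigned range, unsigned rsize, int big_endian)
{
  struct cmp_pair p = { CMPXCHG, CMPNOP };
  uint64_t mask;

  assert (rsize >= 1 && rsize <= range && range <= 8);

  /* Step 1 (endian independent): drop markers for bytes the original
     expression never accessed.  */
  if (range < 8)
    {
      mask = (UINT64_C (1) << (range * BITS_PER_MARKER)) - 1;
      p.cmpxchg >>= (8 - range) * BITS_PER_MARKER;
      p.cmpnop &= mask;
    }

  /* Step 2 (endian dependent): drop markers for bytes at the highest
     addresses that are missing from the result.  */
  if (rsize < range)
    {
      mask = (UINT64_C (1) << (rsize * BITS_PER_MARKER)) - 1;
      if (big_endian)
        {
          p.cmpxchg &= mask;
          p.cmpnop >>= (range - rsize) * BITS_PER_MARKER;
        }
      else
        {
          p.cmpxchg >>= (range - rsize) * BITS_PER_MARKER;
          p.cmpnop &= mask;
        }
    }
  return p;
}
```

For a 5-byte access of which 4 bytes reach the result, `adjust_cmp (5, 4, 0)` gives `cmpnop == 0x04030201` while `adjust_cmp (5, 4, 1)` gives `cmpnop == 0x05040302`, which shows why a single endian-independent mask cannot serve both kinds of target.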
Re: [PATCH] libiberty: support demangling of rvalue reference typenames
Artemiy Volkov writes:

> This patch adds handling of 'O' (rvalue ref) type codes in the C++
> demangling code which is done similarly to the 'R' (regular references)
> case. It also adds a few testcases for various demangling styles which
> are just mirrored versions of the corresponding regular references
> demangling tests.
>
> libiberty/ChangeLog:
>
> 2016-01-04 Artemiy Volkov
>
> * cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
> constant.
> (demangle_template_value_parm): Handle tk_rvalue_reference
> type kind.
> (do_type): Support 'O' type id (rvalue references).

Is there a compiler that actually generates these symbols?

Ian
Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c
On January 4, 2016 8:08:17 PM GMT+01:00, Jeff Law wrote:
>On 12/21/2015 06:13 AM, Alan Lawrence wrote:
>> This is a respin of patches
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03266.html and
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03267.html, which were
>> "too quickly" approved before concerns with efficiency were pointed out.
>>
>> I tried to change the hashing just in tree-ssa-dom.c using C++ subclassing, but
>> couldn't cleanly separate this out from tree-ssa-scopedtables and
>> tree-ssa-threadedge.c due to use of avail_exprs_stack. So I figured it was
>> probably appropriate to use the equivalences in jump threading too. Also,
>> using get_ref_base_and_extent unifies handling of MEM_REFs and ARRAY_REFs

Without looking at the patch, ARRAY_REFs can have non-constant indices, which get_ref_base_and_extent handles conservatively. You should make sure not to regress here.

Richard.

>> (hence only one patch rather than two).
>It is appropriate.
>
>> I've added a couple of testcases that show the improvement in DOM, but in all
>> cases I had to disable FRE, even PRE, to get any improvement, apart from on
>> ssa-dom-cse-2.c itself (where on the affected platforms FRE still does not do
>> the optimization). This makes me wonder if this is the right approach or whether
>> changing the references output by SRA (as per
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01490.html , judged as a hack to
>> SRA to work around limitations in DOM - or is it?) would be better.
>I just doubt it happens all that much.
>
>Jeff
Re: [PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
On Tuesday, January 05, 2016 01:53:37 PM you wrote:
>
> Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC
> and on an arm-none-eabi GCC cross-compiler without any regression. I'm
> waiting for a slot on gcc110 to do a big endian bootstrap but at least the
> testcase works on mips-linux. I'll send an update once bootstrap is
> complete.

Bootstrap went fine on gcc110 with the following languages enabled: c,c++,objc,obj-c++,java,fortran,ada,go,lto.

Best regards,

Thomas
[PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
Hi,

g++.dg/pr67989.C passes -march=armv4t to gcc when compiling, which fails if RUNTESTFLAGS passes -mcpu or -march with a different value. This patch adds a dg-skip-if directive to skip the test when that happens.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2015-12-31 Thomas Preud'homme

* g++.dg/pr67989.C: Skip test if already running it with -mcpu or
-march with different value.

diff --git a/gcc/testsuite/g++.dg/pr67989.C b/gcc/testsuite/g++.dg/pr67989.C
index 90261c450b4b9429fb989f7df62f3743017c7363..61be8e172a96df5bb76f7ecd8543dadf825e7dc7 100644
--- a/gcc/testsuite/g++.dg/pr67989.C
+++ b/gcc/testsuite/g++.dg/pr67989.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-std=c++11 -O2" } */
+/* { dg-skip-if "do not override -mcpu" { arm*-*-* } { "-march=*" "-mcpu=*" } { "-march=armv4t" } } */
 /* { dg-additional-options "-marm -march=armv4t" { target arm*-*-* } } */
 
 __extension__ typedef unsigned long long int uint64_t;

Is this ok for stage3?

Best regards,

Thomas
Re: [PATCH] libiberty: support demangling of rvalue reference typenames
On Mon, Jan 04, 2016 at 10:06:44PM -0800, Ian Lance Taylor wrote:
> Artemiy Volkov writes:
>
> > This patch adds handling of 'O' (rvalue ref) type codes in the C++
> > demangling code which is done similarly to the 'R' (regular references)
> > case. It also adds a few testcases for various demangling styles which
> > are just mirrored versions of the corresponding regular references
> > demangling tests.
> >
> > libiberty/ChangeLog:
> >
> > 2016-01-04 Artemiy Volkov
> >
> > * cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
> > constant.
> > (demangle_template_value_parm): Handle tk_rvalue_reference
> > type kind.
> > (do_type): Support 'O' type id (rvalue references).
>
> Is there a compiler that actually generate these symbols?

Sure, at least gcc and clang generate this. E.g. when compiling:

void f(int&& b) { }

you then have:

➜ nm 1.o
T _Z1fOi

>
> Ian