[PATCH] Enable SGX intrinsics
Hi,

This patch enables Intel SGX instructions (Reference: https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf, page 4478 of the PDF, 3D 41-1 by printed page number).

Ok for trunk?

Thanks,
Julia

Attachment: 0001-Enable-SGX.patch
Re: [PATCH] Enable SGX intrinsics
Sorry, I didn't include the ChangeLog. Here it is:

gcc/
        * common/config/i386/i386-common.c (OPTION_MASK_ISA_SGX_UNSET,
        OPTION_MASK_ISA_SGX_SET): New.
        (ix86_handle_option): Handle OPT_msgx.
        * config.gcc: Added sgxintrin.h.
        * config/i386/cpuid.h (bit_SGX): New.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
        * config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
        * config/i386/i386.c (ix86_target_string): Add -msgx.
        (PTA_SGX): New.
        (ix86_option_override_internal): Handle new options.
        (ix86_valid_target_attribute_inner_p): Add sgx.
        * config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
        * config/i386/i386.opt: Add msgx.
        * config/i386/sgxintrin.h: New file.
        * config/i386/x86intrin.h: Add sgxintrin.h.
        * testsuite/gcc.target/i386/sgx.c: New test.

libgcc/
        * config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
        * config/i386/cpuinfo.h (FEATURE_SGX): New.

On Thu, Dec 29, 2016 at 10:50 AM, Koval, Julia wrote:
> Hi,
>
> This patch enables Intel SGX instructions (Reference:
> https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
> page 4478 in pdf and 3D 41-1 in page numbers) Ok for trunk?
>
> Thanks,
> Julia
Re: [PATCH] Add RejectNegative for a c option.
On 12/27/2016 07:18 PM, Sandra Loosemore wrote:
> On 12/27/2016 09:26 AM, Martin Liška wrote:
>> Without RejectNegative one can cause an ICE in the compiler.
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>
> Any chance you can also fix the manual to fully document the
> -fstrong-eval-order= form of the option? It doesn't currently list
> that or what the possible enum values are. I assume
> -fno-strong-eval-order is a synonym for -fstrong-eval-order=none ?
>
> -Sandra
>

Hello.

I've just installed the patch. To be honest, I'm not familiar with the option; I guess Jason would be the right person to add a documentation entry for it.

As far as I can tell from the other options that allow a 'none' value, all of them have the RejectNegative flag set and do not allow synonyms.

Thanks,
Martin
Re: [PATCH] PR 78534 Change character length from int to size_t
Hi Janne, hi FX,

On Tue, 27 Dec 2016 12:56:19 +0200 Janne Blomqvist wrote:
> >> I also changed the _size member in vtables from int to size_t, as
> >> there were some cases where character lengths and sizes were
> >> apparently mixed up and caused regressions otherwise. Although I

I can confirm this. I was responsible for adding the _len component for char arrays in unlimited polymorphic objects; to my knowledge that is the only use case where the _len component is used. The separation should have been:

- _size: the size in bytes of a single character
- _len: the number of characters stored in the char array in the unlimited polymorphic object.

Unfortunately there were some cases, which Janne also ran into, where these go astray. I at least succeeded in removing the length from the vtab's name that is generated for storing in the unlimited polymorphic object. Over time I hope to get the separation of concerns modeled correctly as described above, but for the time being we have to live with _size sometimes holding the array size. I think that is the case when a fixed-length char array is stored in the unlimited polymorphic object.

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de
[wwwdocs] news/dfa.html and news/ssa.html -- redhat.com now defaults to https
Applied. Gerald Index: news/dfa.html === RCS file: /cvs/gcc/wwwdocs/htdocs/news/dfa.html,v retrieving revision 1.6 diff -u -r1.6 dfa.html --- news/dfa.html 27 Jun 2014 15:04:40 - 1.6 +++ news/dfa.html 29 Dec 2016 11:42:26 - @@ -11,7 +11,7 @@ Last Updated May 5, 2002 We are pleased to announce that Vladimir Makarov, of http://www.redhat.com";>Red Hat, has contributed support +href="https://www.redhat.com";>Red Hat, has contributed support for using Deterministic Finite Automata (DFA) to describe structural hazards in processor pipelines to the instruction scheduler. This work is based on literature from various sources, including, but not Index: news/ssa.html === RCS file: /cvs/gcc/wwwdocs/htdocs/news/ssa.html,v retrieving revision 1.7 diff -u -r1.7 ssa.html --- news/ssa.html 28 Dec 2016 02:00:18 - 1.7 +++ news/ssa.html 29 Dec 2016 11:42:26 - @@ -13,8 +13,7 @@ We are pleased to announce that CodeSourcery, LLC and -http://www.redhat.com";>Cygnus, a Red Hat company -have +https://www.redhat.com";>Cygnus, a Red Hat company have contributed an implementation of the static single assignment (SSA) representation for the GCC compiler. SSA is used in many modern compilers to facilitate a wide range of powerful optimizations. Now
[doc] www.cilkplus.org now defaults to https (and a bit more)
Applied (as revision 243962), and I am planning to backport to the GCC 6 and probably GCC 5 branches. Gerald 2016-12-29 Gerald Pfeifer * doc/extend.texi (Cilk Plus Builtins): cilkplus.org now uses https by default. * doc/passes.texi (Cilk Plus Transformation): Ditto. * doc/generic.texi (Statements for C++): Ditto, and use @uref. Index: doc/extend.texi === --- doc/extend.texi (revision 243961) +++ doc/extend.texi (working copy) @@ -10464,7 +10464,7 @@ Further details and examples about these built-in functions are described in the Cilk Plus language manual which can be found at -@uref{http://www.cilkplus.org}. +@uref{https://www.cilkplus.org}. @node Other Builtins @section Other Built-in Functions Provided by GCC Index: doc/generic.texi === --- doc/generic.texi(revision 243961) +++ doc/generic.texi(working copy) @@ -3241,7 +3241,7 @@ @end smallexample Detailed description for usage and functionality of @code{_Cilk_spawn} can be -found at http://www.cilkplus.org +found at @uref{https://www.cilkplus.org}. @item CILK_SYNC_STMT Index: doc/passes.texi === --- doc/passes.texi (revision 243961) +++ doc/passes.texi (working copy) @@ -163,7 +163,7 @@ @end itemize Documentation about Cilk Plus and language specification is provided under the -"Learn" section in @w{@uref{http://www.cilkplus.org/}}. It is worth mentioning +"Learn" section in @w{@uref{https://www.cilkplus.org}}. It is worth mentioning that the current implementation follows ABI 1.1. @node Gimplification pass
[C++ PATCH] Fix decomp handling of fields with reference type (PR c++/78931)
Hi! When a field has reference type, we correctly used the reference type as the type of the var with value expr, but the DECL_VALUE_EXPR had the type/value after convert_from_reference, which leads to invalid IL. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-12-29 Jakub Jelinek PR c++/78931 * decl.c (cp_finish_decomp): SET_DECL_VALUE_EXPR to probe rather than tt. * g++.dg/cpp1z/decomp19.C: New test. --- gcc/cp/decl.c.jj2016-12-21 22:57:08.0 +0100 +++ gcc/cp/decl.c 2016-12-28 13:16:45.29461 +0100 @@ -7598,7 +7598,7 @@ cp_finish_decomp (tree decl, tree first, probe = TREE_OPERAND (probe, 0); TREE_TYPE (v[i]) = TREE_TYPE (probe); layout_decl (v[i], 0); - SET_DECL_VALUE_EXPR (v[i], tt); + SET_DECL_VALUE_EXPR (v[i], probe); DECL_HAS_VALUE_EXPR_P (v[i]) = 1; i++; } --- gcc/testsuite/g++.dg/cpp1z/decomp19.C.jj2016-12-28 13:18:27.093305954 +0100 +++ gcc/testsuite/g++.dg/cpp1z/decomp19.C 2016-12-28 13:18:19.0 +0100 @@ -0,0 +1,13 @@ +// PR c++/78931 +// { dg-do run { target c++11 } } +// { dg-options "" } + +int +main () +{ + int x = 99; + struct S { int &x; }; + S s{x}; + auto [p] = s;// { dg-warning "decomposition declaration only available with" "" { target c++14_down } } + return p - 99; +} Jakub
[PATCH] Fix exgettext to handle multi-line help texts from *.opt files (PR translation/78745)
Hi! As mentioned in the PR, the option handling for multi-line help texts concatenates those lines with spaces in between (essentially replaces newlines with spaces), but exgettext extracts just the first line from the multiline help text and throws away the rest. With this patch, there are changes like: #: config/i386/i386.opt:583 -msgid "Do dispatch scheduling if processor is bdver1, bdver2, bdver3, bdver4" +msgid "" +"Do dispatch scheduling if processor is bdver1, bdver2, bdver3, bdver4 or " +"znver1 and Haifa scheduling is selected." msgstr "" in gcc.pot. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-12-29 Jakub Jelinek PR translation/78745 * exgettext: Handle multi-line help texts in *.opt files. --- gcc/po/exgettext.jj 2016-01-04 14:55:54.0 +0100 +++ gcc/po/exgettext2016-12-28 19:18:08.142715830 +0100 @@ -237,6 +237,8 @@ echo "scanning option files..." >&2 field = 0 while (getline < file) { if (/^[ \t]*(;|$)/ || !/^[^ \t]/) { + if (field > 2) + printf("_(\"%s\")\n", line) field = 0 } else { if ((field == 1) && /MissingArgError/) { @@ -275,12 +277,15 @@ echo "scanning option files..." >&2 if (field == 2) { line = $0 printf("#line %d \"%s\"\n", lineno, file) - printf("_(\"%s\")\n", line) + } else if (field > 2) { + line = line " " $0 } field++; } lineno++; } +if (field > 2) + printf("_(\"%s\")\n", line) }') >> $emsg # Run the xgettext commands, with temporary added as a file to scan. Jakub
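To illustrate the concatenation rule described above (continuation lines of a help text are joined with single spaces), here is a minimal C++ sketch. It is not the awk code from the patch; join_help_lines and check_join_help_lines are made-up names used only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of exgettext: joins the lines of one
// multi-line help text with single spaces, which is the effect of the
// awk statement `line = line " " $0` in the patch.
std::string join_help_lines(const std::vector<std::string>& lines) {
  std::string msg;
  for (const std::string& l : lines) {
    if (!msg.empty())
      msg += ' ';   // a newline in the .opt file becomes one space
    msg += l;
  }
  return msg;
}

// Small usage check with two invented help-text lines.
void check_join_help_lines() {
  std::vector<std::string> help;
  help.push_back("First line of the help text");
  help.push_back("and its continuation.");
  assert(join_help_lines(help) ==
         "First line of the help text and its continuation.");
}
```

The joined string is what would end up inside a single _("...") entry, instead of only the first line.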
[C++ PATCH] Fix -Wunused-but-set-* false positive with ~ of vector type (PR c++/78949)
Hi! For integral arg, mark_exp_read is called during cp_perform_integral_promotions, for complex type it is called during cp_default_conversion, but for vector types nothing actually calls it. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-12-29 Jakub Jelinek PR c++/78949 * typeck.c (cp_build_unary_op): Call mark_rvalue_use on arg if it has vector type. * c-c++-common/Wunused-var-16.c: New test. --- gcc/cp/typeck.c.jj 2016-12-21 23:12:01.0 +0100 +++ gcc/cp/typeck.c 2016-12-29 13:48:58.363469890 +0100 @@ -5907,6 +5907,8 @@ cp_build_unary_op (enum tree_code code, inform (location, "did you mean to use logical not (%)?"); arg = cp_perform_integral_promotions (arg, complain); } + else if (!noconvert && VECTOR_TYPE_P (TREE_TYPE (arg))) + arg = mark_rvalue_use (arg); break; case ABS_EXPR: --- gcc/testsuite/c-c++-common/Wunused-var-16.c.jj 2016-12-29 13:51:07.143825569 +0100 +++ gcc/testsuite/c-c++-common/Wunused-var-16.c 2016-12-29 13:50:42.0 +0100 @@ -0,0 +1,15 @@ +/* PR c++/78949 */ +/* { dg-do compile } */ +/* { dg-options "-Wunused" } */ + +typedef unsigned char V __attribute__((vector_size(16))); +V v; + +void +foo () +{ + V y = {}; + V x = {};// { dg-bogus "set but not used" } + y &= ~x; + v = y; +} Jakub
Re: PR78631 fix
On Tue, Dec 27, 2016 at 06:36:11PM +0300, Alexander Ivchenko wrote: > Committed as r243942 with the ChangeLog entries Unfortunately it fails if assembler has mpx support, but hw doesn't support it. The following patch should fix that. Tested on x86_64-linux, ok for trunk? 2016-12-29 Jakub Jelinek * gcc.target/i386/mpx/memcpy-1.c: Include mpx-check.h. (main): Renamed to ... (mpx_test): ... this. Add argc and argv arguments. --- gcc/testsuite/gcc.target/i386/mpx/memcpy-1.c.jj 2016-12-28 13:14:24.0 +0100 +++ gcc/testsuite/gcc.target/i386/mpx/memcpy-1.c2016-12-29 16:07:11.135200098 +0100 @@ -8,6 +8,7 @@ #include #include +#include "mpx-check.h" char s[10]; char d[10]; @@ -16,7 +17,7 @@ __attribute__((noinline)) char* foo(char* dst, char* src, size_t size) { return memcpy(dst, src, size); } -int main() { +int mpx_test(int argc, const char **argv) { char* r = foo(d, s, 11); printf("r = %p\n", r); return 0; Jakub
Re: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions
Thanks. I fixed the error messages. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: PR78631 fix
On Thu, Dec 29, 2016 at 4:26 PM, Jakub Jelinek wrote: > On Tue, Dec 27, 2016 at 06:36:11PM +0300, Alexander Ivchenko wrote: >> Committed as r243942 with the ChangeLog entries > > Unfortunately it fails if assembler has mpx support, but hw doesn't support > it. > > The following patch should fix that. Tested on x86_64-linux, ok for trunk? > > 2016-12-29 Jakub Jelinek > > * gcc.target/i386/mpx/memcpy-1.c: Include mpx-check.h. > (main): Renamed to ... > (mpx_test): ... this. Add argc and argv arguments. OK. Thanks, Uros. > --- gcc/testsuite/gcc.target/i386/mpx/memcpy-1.c.jj 2016-12-28 > 13:14:24.0 +0100 > +++ gcc/testsuite/gcc.target/i386/mpx/memcpy-1.c2016-12-29 > 16:07:11.135200098 +0100 > @@ -8,6 +8,7 @@ > > #include > #include > +#include "mpx-check.h" > > char s[10]; > char d[10]; > @@ -16,7 +17,7 @@ __attribute__((noinline)) > char* foo(char* dst, char* src, size_t size) { >return memcpy(dst, src, size); > } > -int main() { > +int mpx_test(int argc, const char **argv) { >char* r = foo(d, s, 11); >printf("r = %p\n", r); >return 0; > > > Jakub
Re: [v3 PATCH] Implement 2801, Default-constructibility of unique_ptr.
On 22 December 2016 at 19:11, Jonathan Wakely wrote: >> /// Default constructor, creates a unique_ptr that owns nothing. >> + template > + typename enable_if< >> + __and_<__not_>, >> +is_default_constructible<_Up>>::value, >> + bool>::type = false> > > > Instead of repeating this condition half a dozen times, we could put > it in the __uniq_ptr_impl class template and reuse it, as in the > attached patch (and similarly for the unique_ptr specialization). > What do you think? It needs to be a bit more dependent than in that patch, I think. I adjusted the idea a bit, a new patch attached. It cleans up the code a bit, so it's better, but not a huge improvement. >> constexpr unique_ptr() noexcept >> : _M_t() >> - { static_assert(!is_pointer::value, >> -"constructed with null function pointer deleter"); } >> + { } > The bodies of these constructors should be indented now that they're > templates. Hopefully fixed correctly in the new patch, please double-check. >> --- /dev/null >> +++ b/libstdc++-v3/testsuite/20_util/unique_ptr/cons/default.cc >> @@ -0,0 +1,40 @@ >> +// { dg-do compile { target c++11 } } >> + >> +// Copyright (C) 2011-2016 Free Software Foundation, Inc. > Is this substantially copied from an existing file, or should it just > be 2016? (Not that it really matters, as I don't think we should have Should just be 2016, fixed. 2016-12-29 Ville Voutilainen Implement 2801, Default-constructibility of unique_ptr. * include/bits/unique_ptr.h (__uniq_ptr_impl::_DeleterConstraint): New. (unique_ptr::_DeleterConstraint): Likewise. (unique_ptr()): Constrain. (unique_ptr(pointer)): Likewise. (unique_ptr(nullptr_t)): Likewise. (unique_ptr<_Tp[], _Dp>::_DeleterConstraint): New. (unique_ptr<_Tp[], _Dp>::unique_ptr()): Constrain. (unique_ptr<_Tp[], _Dp>::unique_ptr(_Up)): Likewise. (unique_ptr<_Tp[], _Dp>::unique_ptr(nullptr_t)): Likewise. * testsuite/20_util/unique_ptr/assign/48635_neg.cc: Adjust. * testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc: Likewise. 
* testsuite/20_util/unique_ptr/cons/default.cc: New. * testsuite/20_util/unique_ptr/cons/ptr_deleter_neg.cc: Adjust. diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h index 56e6ec0..f994c59 100644 --- a/libstdc++-v3/include/bits/unique_ptr.h +++ b/libstdc++-v3/include/bits/unique_ptr.h @@ -130,6 +130,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; public: + using _DeleterConstraint = enable_if< +__and_<__not_>, + is_default_constructible<_Dp>>::value>; + using pointer = typename _Ptr<_Tp, _Dp>::type; __uniq_ptr_impl() = default; @@ -152,6 +156,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template > class unique_ptr { + template + using _DeleterConstraint = + typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint; + __uniq_ptr_impl<_Tp, _Dp> _M_t; public: @@ -175,10 +183,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Constructors. /// Default constructor, creates a unique_ptr that owns nothing. - constexpr unique_ptr() noexcept - : _M_t() - { static_assert(!is_pointer::value, -"constructed with null function pointer deleter"); } + template ::type> + constexpr unique_ptr() noexcept + : _M_t() +{ } /** Takes ownership of a pointer. * @@ -186,11 +195,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * * The deleter will be value-initialized. */ - explicit - unique_ptr(pointer __p) noexcept - : _M_t(__p) - { static_assert(!is_pointer::value, -"constructed with null function pointer deleter"); } + template ::type> + explicit + unique_ptr(pointer __p) noexcept + : _M_t(__p) +{ } /** Takes ownership of a pointer. * @@ -218,7 +228,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION "rvalue deleter bound to reference"); } /// Creates a unique_ptr that owns nothing. - constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { } + template ::type> + constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { } // Move constructors. 
@@ -384,6 +396,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template class unique_ptr<_Tp[], _Dp> { + template + using _DeleterConstraint = + typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint; + __uniq_ptr_impl<_Tp, _Dp> _M_t; template @@ -432,10 +448,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Constructors. /// Default constructor, creates a unique_ptr that owns nothing. - constexpr unique_ptr() noexcept - : _M_t() - { static_assert(!std::is_pointer::value, - "constructed with null function pointer deleter"
Re: [v3 PATCH] Implement 2801, Default-constructibility of unique_ptr.
On 29 December 2016 at 21:57, Ville Voutilainen wrote: >> Instead of repeating this condition half a dozen times, we could put >> it in the __uniq_ptr_impl class template and reuse it, as in the >> attached patch (and similarly for the unique_ptr specialization). >> What do you think? > > It needs to be a bit more dependent than in that patch, I think. I > adjusted the idea a bit, > a new patch attached. It cleans up the code a bit, so it's better, but > not a huge improvement. Gets a bit better by keeping the __uniq_ptr_impl as just an enable_if, aliasing its ::type locally and then using the result in the constraints. Also gets rid of a bunch of dg-errors in ptr_deleter_neg.cc. This makes me quite happy. The attached _3.diff shows the cleanup, the _4.diff shows the full patch with this cleanup applied. diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h index f994c59..211043f 100644 --- a/libstdc++-v3/include/bits/unique_ptr.h +++ b/libstdc++-v3/include/bits/unique_ptr.h @@ -158,7 +158,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { template using _DeleterConstraint = - typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint; + typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint::type; __uniq_ptr_impl<_Tp, _Dp> _M_t; @@ -184,7 +184,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Default constructor, creates a unique_ptr that owns nothing. template ::type> + typename = _DeleterConstraint<_Up>> constexpr unique_ptr() noexcept : _M_t() { } @@ -196,7 +196,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * The deleter will be value-initialized. */ template ::type> + typename = _DeleterConstraint<_Up>> explicit unique_ptr(pointer __p) noexcept : _M_t(__p) @@ -229,7 +229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Creates a unique_ptr that owns nothing. template ::type> + typename = _DeleterConstraint<_Up>> constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { } // Move constructors. 
@@ -398,7 +398,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { template using _DeleterConstraint = - typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint; + typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint::type; __uniq_ptr_impl<_Tp, _Dp> _M_t; @@ -449,7 +449,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Default constructor, creates a unique_ptr that owns nothing. template ::type> + typename = _DeleterConstraint<_Up>> constexpr unique_ptr() noexcept : _M_t() { } @@ -463,7 +463,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION */ template::type, + typename = _DeleterConstraint<_Vp>, typename = typename enable_if< __safe_conversion_raw<_Up>::value, bool>::type> explicit @@ -510,7 +510,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Creates a unique_ptr that owns nothing. template ::type> + typename = _DeleterConstraint<_Up>> constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { } templatediff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h index 56e6ec0..211043f 100644 --- a/libstdc++-v3/include/bits/unique_ptr.h +++ b/libstdc++-v3/include/bits/unique_ptr.h @@ -130,6 +130,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; public: + using _DeleterConstraint = enable_if< +__and_<__not_>, + is_default_constructible<_Dp>>::value>; + using pointer = typename _Ptr<_Tp, _Dp>::type; __uniq_ptr_impl() = default; @@ -152,6 +156,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template > class unique_ptr { + template + using _DeleterConstraint = + typename __uniq_ptr_impl<_Tp, _Up>::_DeleterConstraint::type; + __uniq_ptr_impl<_Tp, _Dp> _M_t; public: @@ -175,10 +183,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Constructors. /// Default constructor, creates a unique_ptr that owns nothing. - constexpr unique_ptr() noexcept - : _M_t() - { static_assert(!is_pointer::value, -"constructed with null function pointer deleter"); } + template > + constexpr unique_ptr() noexcept + : _M_t() +{ } /** Takes ownership of a pointer. 
* @@ -186,11 +195,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * * The deleter will be value-initialized. */ - explicit - unique_ptr(pointer __p) noexcept - : _M_t(__p) - { static_assert(!is_pointer::value, -"constructed with null function pointer deleter"); } + template > + explicit + unique_ptr(pointer __p) noexcept + : _M_t(__p) +{ } /** Takes ownership of a pointer. * @@ -218,7 +228,9 @@ _GLIBCXX_B
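For readers following the thread, the technique under discussion (constraining a default constructor on the deleter type instead of using a static_assert in its body) can be sketched in isolation. The class below is a toy written for this illustration, not libstdc++ code: toy_ptr, DeleterConstraint, default_deleter and fn_deleter are all hypothetical names, and the condition is only assumed to resemble the one in the patch.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-in for a smart pointer; NOT the libstdc++ implementation.
template <typename T, typename D>
class toy_ptr {
  // Well-formed only if U is not a pointer and is default constructible.
  template <typename U>
  using DeleterConstraint = typename std::enable_if<
      !std::is_pointer<U>::value &&
      std::is_default_constructible<U>::value>::type;

  T* ptr_;
  D del_;

public:
  // U defaults to D so the condition depends on a template parameter of
  // the constructor itself; a failed condition then removes the
  // constructor from overload resolution instead of being a hard error.
  template <typename U = D, typename = DeleterConstraint<U>>
  toy_ptr() noexcept : ptr_(nullptr), del_() {}

  toy_ptr(T* p, D d) noexcept : ptr_(p), del_(d) {}

  T* get() const noexcept { return ptr_; }
};

struct default_deleter {
  void operator()(int* p) const { delete p; }
};
using fn_deleter = void (*)(int*);

// With a class-type deleter the default constructor exists ...
static_assert(
    std::is_default_constructible<toy_ptr<int, default_deleter>>::value,
    "class-type deleter: default constructible");
// ... with a function-pointer deleter it is silently absent.
static_assert(
    !std::is_default_constructible<toy_ptr<int, fn_deleter>>::value,
    "function-pointer deleter: not default constructible");

void toy_ptr_demo() {
  toy_ptr<int, default_deleter> p;
  assert(p.get() == nullptr);
}
```

The visible difference from the static_assert approach is that traits such as std::is_default_constructible now give the right answer, which is what the second static_assert above checks.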
[wwwdocs] More changes around openmp.org
It turns out that the locations of the OpenMP standards documents have moved as well.

This patch adjusts some of our references (index.html, gcc-5/changes.html) to the new location and, for older entries (news.html), replaces them with the more generic location of the OpenMP specifications, since I expect that to be more stable.

Applied.

Gerald

Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.1035 diff -u -r1.1035 index.html --- index.html 28 Dec 2016 23:18:31 - 1.1035 +++ index.html 29 Dec 2016 00:57:09 - @@ -104,7 +104,7 @@ OpenMP 4.0 offloading support in GCC [2015-01-14] - http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf";> + http://www.openmp.org/wp-content/uploads/OpenMP4.0.0.pdf";> OpenMP 4.0 https://gcc.gnu.org/gcc-5/changes.html#offload";> offloading support was added to GCC. Contributed by Jakub Jelinek (Red Hat), Bernd Schmidt and Index: news.html === RCS file: /cvs/gcc/wwwdocs/htdocs/news.html,v retrieving revision 1.152 diff -u -r1.152 news.html --- news.html 27 Nov 2016 14:05:57 - 1.152 +++ news.html 29 Dec 2016 00:57:34 - @@ -75,7 +75,7 @@ OpenMP v4.0 [2014-06-30] An implementation of the http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf";>OpenMP v4.0 +href="http://www.openmp.org/specifications/";>OpenMP v4.0 parallel programming interface for Fortran has been added and is going to be available in the upcoming GCC 4.9.1 release. @@ -133,8 +133,8 @@ OpenMP v4.0 [2013-10-11] An implementation of the http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf";>OpenMP v4.0 -parallel programming interface for so far just C, C++ has been added. +href="http://www.openmp.org/specifications/";>OpenMP v4.0 +parallel programming interface for so far C and C++ has been added. Code was contributed by Jakub Jelinek, Aldy Hernandez, Richard Henderson of Red Hat, Inc. and Tobias Burnus. 
Index: gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.139 diff -u -r1.139 changes.html --- gcc-5/changes.html 3 Jun 2016 08:22:46 - 1.139 +++ gcc-5/changes.html 29 Dec 2016 00:57:39 - @@ -188,7 +188,7 @@ New Languages and Language specific improvements -http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf";> +http://www.openmp.org/wp-content/uploads/OpenMP4.0.0.pdf";> OpenMP 4.0 specification offloading features are now supported by the C, C++, and Fortran compilers. Generic changes:
[wwwdocs] Remove reference to GCJ FAQ
...from our main FAQ, plus the GCJ FAQ to begin with. Applied. Gerald Index: java/faq.html === RCS file: java/faq.html diff -N java/faq.html --- java/faq.html 29 Jun 2014 19:39:16 - 1.74 +++ /dev/null 1 Jan 1970 00:00:00 - @@ -1,443 +0,0 @@ - - -The GCJ FAQ - - - - - The GCJ FAQ - - -General Questions - -What license is used for libgcj? -How can I debug my Java program? -Can I interface byte-compiled and native java code? - - -Java Feature Support - -What Java API's are supported? How complete is - the support? -Does GCJ support using straight C native methods - ala JNI? -Why does GCJ use CNI? -What is the state of AWT support? -How about support for Swing ? -What support is there for RMI ? -Can I use any code from other projects - to supplement libgcj's current features? -What features of the Java language are/arn't supported - - -Gcj Compile/Link Questions - -Why do I get undefined reference to `main' - errors? -Can GCJ only handle source code? -"gcj -C" Doesn't seem to work like javac/jikes. - Whats going on? -Where does GCJ look for files? -How does gcj resolve wether to compile .class or - .java files? - - -Runtime Questions - -My program is dumping core! What's going on? -When I run the debugger I get a SEGV in the GC! - What's going on? -I have just compiled and benchmarked my Java application -and it seems to be running slower than than XXX JIT JVM. Is there -anything I can do to make it go faster? -Can I profile Garbage Collection? -How do I increase the runtime's initial and maximum - heap sizes? -How can I profile my application? - - -Programming Issues - -Are there any examples of how to use CNI? -Is it possible to invoke GCJ compiled Java code from a C++ application? - - - - - General Questions - - 1.1 What license is used for libgcj? - - - libgcj is distributed under the GPL, with the 'libgcc exception'. - This means that linking with libgcj does not by itself cause - your program to fall under the GPL. 
See LIBGCJ_LICENSE in - the source tree for more details. - - - - - 1.6 How can I debug my Java program? - - - ftp://ftp.gnu.org/pub/gnu/gdb/";>gdb 5.0 - includes support for debugging gcj-compiled Java programs. For more - information please read Java Debugging with gdb. - - - - - 1.7 Can I interface byte-compiled and native java code - - - libgcj has a bytecode interpreter that allows you to mix .class files with - compiled code. It works pretty transparently: if a compiled version of a class is - not found in the application binary or linked shared libraries, the class loader - will search for a bytecode version in your classpath, much like a VM would. Be - sure to build libgcj with the --enable-interpreter option to enable this - functionality. - - The program "gij" provides a front end to the interpreter that behaves - much like a traditional virtual machine. You can even use "gij" to run a shared library - which is compiled from java code and contains a main method: - -$ gcj -shared -o lib-HelloWorld.so HelloWorld.java -$ gij HelloWorld - - This works because gij uses Class.forName, which knows how to load shared objects. - - - - Java Feature Support - - 2.1 What Java API's are supported? How complete is -the support? - - - mailto:m...@cs.berkeley.edu";>Matt Welsh writes: - -Just look in the 'libjava' directory of libgcj and see what classes -are there. Most GUI stuff isn't there yet, that's true, but many of -the other classes are easy to add if they don't yet exist. - -I think it's important to stress that there is a big difference -between Java and the many libraries which Java supports. Unfortunately, -Sun's promise of "write once, run everywhere" assumes much -more than a JVM: you also need the full set of JDK libraries. Considering -that new Java APIs come out every week, it's g
[PATCH] genmatch fix (PR tree-optimization/71563)
Hi! On Tue, Dec 20, 2016 at 09:45:03PM +0100, Jakub Jelinek wrote: > That is what I tried first, but there is some bug in genmatch.c that > prevents it. The: > (for vec (VECTOR_CST CONSTRUCTOR) > (simplify >(shiftrotate @0 vec@1) > results in case SSA_NAME: being added to a switch: > case SSA_NAME: > if (do_valueize (valueize, op1) != NULL_TREE) > { > gimple *def_stmt = SSA_NAME_DEF_STMT (op1); > if (gassign *def = dyn_cast (def_stmt)) > switch (gimple_assign_rhs_code (def)) > { > case CONSTRUCTOR: > and the SSA_NAME@1 in another simplification resulted in another > case SSA_NAME: > into the same switch (rather than appending to the case SSA_NAME). This patch attempts to deal with that. The change for the new version of the patch with SSA_NAME@1 I'll post right away is (twice). Two case SSA_NAME: in a single switch of course don't work well. --- gimple-match.c.jj 2016-12-29 21:57:22.0 +0100 +++ gimple-match.c 2016-12-29 22:11:58.824526121 +0100 @@ -63732,6 +63732,14 @@ if (integer_all_onesp (op0)) default:; } } + { + { +/* #line 1524 "../../gcc/match.pd" */ + tree captures[2] ATTRIBUTE_UNUSED = { op0, op1 }; + if (gimple_simplify_79 (res_code, res_ops, seq, valueize, type, captures, RSHIFT_EXPR)) + return true; + } + } break; case VECTOR_CST: { @@ -63743,16 +63751,6 @@ if (integer_all_onesp (op0)) } break; } -case SSA_NAME: - { - { -/* #line 1524 "../../gcc/match.pd" */ - tree captures[2] ATTRIBUTE_UNUSED = { op0, op1 }; - if (gimple_simplify_79 (res_code, res_ops, seq, valueize, type, captures, RSHIFT_EXPR)) - return true; - } -break; - } default:; } switch (TREE_CODE (op0)) Jakub
[PATCH] Optimize X << Y with low bits of Y known to be 0 (PR tree-optimization/71563, take 2)
Hi! On Tue, Dec 20, 2016 at 09:45:03PM +0100, Jakub Jelinek wrote: > > Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of a > > separate test. > > That is what I tried first, but there is some bug in genmatch.c that > prevents it. The: > (for vec (VECTOR_CST CONSTRUCTOR) > (simplify >(shiftrotate @0 vec@1) > results in case SSA_NAME: being added to a switch: > case SSA_NAME: > if (do_valueize (valueize, op1) != NULL_TREE) > { > gimple *def_stmt = SSA_NAME_DEF_STMT (op1); > if (gassign *def = dyn_cast (def_stmt)) > switch (gimple_assign_rhs_code (def)) > { > case CONSTRUCTOR: > and the SSA_NAME@1 in another simplification resulted in another > case SSA_NAME: > into the same switch (rather than appending to the case SSA_NAME). And here is the corresponding updated version of the patch: 2016-12-29 Jakub Jelinek PR tree-optimization/71563 * match.pd: Simplify X << Y into X if Y is known to be 0 or out of range value - has low bits known to be zero. * gcc.dg/tree-ssa/pr71563.c: New test. --- gcc/match.pd.jj 2016-12-21 10:00:10.809244456 +0100 +++ gcc/match.pd2016-12-29 21:56:56.891858831 +0100 @@ -1515,6 +1515,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (tem) (shiftrotate @0 { tem; })) +/* Simplify X << Y where Y's low width bits are 0 to X, as only valid + Y is 0. Similarly for X >> Y. */ +#if GIMPLE +(for shift (lshift rshift) + (simplify + (shift @0 SSA_NAME@1) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))) +(with { + int width = ceil_log2 (element_precision (TREE_TYPE (@0))); + int prec = TYPE_PRECISION (TREE_TYPE (@1)); + } + (if ((get_nonzero_bits (@1) & wi::mask (width, false, prec)) == 0) + @0) +#endif + /* Rewrite an LROTATE_EXPR by a constant into an RROTATE_EXPR by a new constant. 
*/ (simplify --- gcc/testsuite/gcc.dg/tree-ssa/pr71563.c.jj 2016-12-29 21:56:12.668414342 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/pr71563.c 2016-12-29 21:56:12.668414342 +0100 @@ -0,0 +1,23 @@ +/* PR tree-optimization/71563 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +void link_error (void); + +void +foo (int k) +{ + int t = 1 << ((1 / k) << 8); + if (t != 1) +link_error (); +} + +void +bar (int k, int l) +{ + int t = l << (k << 8); + if (t != l) +link_error (); +} + +/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */ Jakub
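The arithmetic behind the new match.pd pattern can be checked outside the compiler. The sketch below is plain C++ and uses none of GCC's internal APIs; ceil_log2_sketch, low_shift_bits_zero and only_zero_is_valid are hypothetical stand-ins for ceil_log2 and the get_nonzero_bits test in the patch. For a shifted operand of a given bit width, a valid shift count is smaller than the width, so a count whose low ceil_log2 (width) bits are all zero can only be 0.

```cpp
#include <cassert>

// Stand-in for GCC's ceil_log2: smallest b such that (1 << b) >= n.
// Intended for small n such as type precisions (8, 16, 32, 64).
int ceil_log2_sketch(unsigned n) {
  int b = 0;
  while ((1u << b) < n)
    ++b;
  return b;
}

// True when the low ceil_log2(width) bits of shift count y are zero.
// The patch tests this on the set of possibly nonzero bits of Y.
bool low_shift_bits_zero(unsigned y, unsigned width) {
  unsigned mask = (1u << ceil_log2_sketch(width)) - 1u;
  return (y & mask) == 0;
}

// Exhaustive check: among the in-range counts 0 .. width-1, the only
// one with its low bits zero is 0, so X << Y (or X >> Y) is X.
bool only_zero_is_valid(unsigned width) {
  for (unsigned y = 0; y < width; ++y)
    if (low_shift_bits_zero(y, width) && y != 0)
      return false;
  return true;
}

void check_shift_reasoning() {
  assert(ceil_log2_sketch(32) == 5);
  assert(only_zero_is_valid(32));
  // (k << 8) as in the testcase: a multiple of 256 has its low 5 bits clear.
  assert(low_shift_bits_zero(3u << 8, 32));
}
```

In the testcase both shift counts are multiples of 256, so every nonzero value is out of range for a 32-bit int and the shift can be replaced by its first operand.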
Re: [PATCH] PR 78534 Change character length from int to size_t
On Tue, 20 Dec 2016, FX wrote:
> Finally, if we’re making this change, we welcome any feedback on how
> to make it as easy as possible to handle in user code. Documentation,
> preprocessor macros, etc.

I believe including this in the (yet to be created)
gcc-7/porting_to.html would be great.

Historically the porting_to.html documents have mostly covered C and
C++, since those are the source languages of the majority of packages
in a GNU/Linux distribution that GCC touches. Adding more focus on
Fortran users as well feels like a good idea, though.

(If you want to go ahead, but prefer the page to be created first, let
me know, and I'll take care of it.)

Gerald
[PATCH, i386]: Remove unneeded *extvqi sign-extract pattern
Hello!

Attached patch removes the unneeded *extvqi sign-extract pattern. Combine
is smart enough to create a zero-extract RTX when a QImode value is
extracted to a QImode register.

OTOH, the following testcase

--cut here--
struct S1
{
  char pad1;
  char val;
  short pad2;
};

struct S1 test_add (struct S1 a, struct S1 b)
{
  a.val += b.val;

  return a;
}
--cut here--

still compiles to:

        movl    %edi, %eax
        movl    %esi, %ecx
        movsbl  %ah, %edx
        movsbl  %ch, %esi
        addl    %esi, %edx
        movb    %dl, %ah

since combine doesn't simplify sign-extract in:

Trying 7, 9 -> 10:
Failed to match this instruction:
(set (zero_extract:SI (reg/v:SI 95 [ a ])
        (const_int 8 [0x8])
        (const_int 8 [0x8]))
    (subreg:SI (plus:QI (subreg:QI (sign_extract:SI (reg/v:SI 95 [ a ])
                    (const_int 8 [0x8])
                    (const_int 8 [0x8])) 0)
            (reg:QI 98)) 0))

to a zero-extract, although we have a QImode operation.

This should follow the same reasoning as in the attached testcase, where:

(insn 8 4 9 2 (set (reg:SI 91)
        (sign_extract:SI (reg/v:SI 88 [ a ])
            (const_int 8 [0x8])
            (const_int 8 [0x8]))) "pr78904-6.c":18 102 {*extvsi}
     (expr_list:REG_DEAD (reg/v:SI 88 [ a ])
        (nil)))
(insn 9 8 0 2 (set (mem/j:QI (plus:DI (reg/v:DI 89 [ i ])
                (symbol_ref:DI ("t") [flags 0x40] )) [0 t S1 A8])
        (subreg:QI (reg:SI 91) 0)) "pr78904-6.c":18 84 {*movqi_internal}
     (expr_list:REG_DEAD (reg:SI 91)
        (expr_list:REG_DEAD (reg/v:DI 89 [ i ])
            (nil))))

simplifies to

Trying 8 -> 9:
Successfully matched this instruction:
(set (mem/j:QI (plus:DI (reg/v:DI 89 [ i ])
            (symbol_ref:DI ("t") [flags 0x40] )) [0 t S1 A8])
    (subreg:QI (zero_extract:SI (reg/v:SI 88 [ a ])
            (const_int 8 [0x8])
            (const_int 8 [0x8])) 0))

I'll open a PR for the above combine deficiency.

2016-12-29  Uros Bizjak

	PR target/78904
	* config/i386/i386.md (*extvqi): Remove insn pattern.
	(divmodqi4): Update expander to generate QImode zero-extract from AH.

testsuite/ChangeLog:

2016-12-29  Uros Bizjak

	PR target/78904
	* gcc.target/i386/pr78904-6.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Committed to mainline SVN.

Uros.

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 243963)
+++ config/i386/i386.md	(working copy)
@@ -2780,33 +2780,6 @@
    [(set_attr "type" "imovx")
     (set_attr "mode" "SI")])
 
-(define_insn "*extvqi"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=QBc,?R,m")
-        (sign_extract:QI (match_operand 1 "ext_register_operand" "Q,Q,Q")
-                         (const_int 8)
-                         (const_int 8)))]
-  ""
-{
-  switch (get_attr_type (insn))
-    {
-    case TYPE_IMOVX:
-      return "movs{bl|x}\t{%h1, %k0|%k0, %h1}";
-    default:
-      return "mov{b}\t{%h1, %0|%0, %h1}";
-    }
-}
-  [(set_attr "isa" "*,*,nox64")
-   (set (attr "type")
-     (if_then_else (and (match_operand:QI 0 "register_operand")
-                        (ior (not (match_operand:QI 0 "QIreg_operand"))
-                             (match_test "TARGET_MOVX")))
-        (const_string "imovx")
-        (const_string "imov")))
-   (set (attr "mode")
-     (if_then_else (eq_attr "type" "imovx")
-        (const_string "SI")
-        (const_string "QI")))])
-
 (define_expand "extzv"
   [(set (match_operand:SWI248 0 "register_operand")
         (zero_extract:SWI248 (match_operand:SWI248 1 "register_operand")
@@ -7586,7 +7559,8 @@
   emit_insn (gen_divmodhiqi3 (tmp0, tmp1, operands[2]));
 
   /* Extract remainder from AH.  */
-  tmp1 = gen_rtx_SIGN_EXTRACT (QImode, tmp0, GEN_INT (8), GEN_INT (8));
+  tmp1 = gen_rtx_ZERO_EXTRACT (SImode, tmp0, GEN_INT (8), GEN_INT (8));
+  tmp1 = gen_rtx_SUBREG (QImode, tmp1, 0);
   rtx_insn *insn = emit_move_insn (operands[3], tmp1);
 
   mod = gen_rtx_MOD (QImode, operands[1], operands[2]);
Index: testsuite/gcc.target/i386/pr78904-6.c
===================================================================
--- testsuite/gcc.target/i386/pr78904-6.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr78904-6.c	(working copy)
@@ -0,0 +1,21 @@
+/* PR target/78904 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+struct S1
+{
+  char pad1;
+  char val;
+  short pad2;
+};
+
+extern char t[256];
+
+void foo (struct S1 a, size_t i)
+{
+  t[i] = a.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]movb\[\t \]*%.h," } } */
[wwwdocs] Move the generic redirect for /java/ past all more specific ones
Move the generic redirect for /java/ past all more specific ones for
pages formerly under /java/...

With this change, the redirects that you can see as context in the
patch below actually work again.

Applied.

Gerald

Index: .htaccess
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/.htaccess,v
retrieving revision 1.39
diff -u -r1.39 .htaccess
--- .htaccess	4 Dec 2016 23:33:28 -0000	1.39
+++ .htaccess	29 Dec 2016 21:51:18 -0000
@@ -48,12 +48,12 @@
 Redirect permanent /onlinedocs/g77_bugs.html https://gcc.gnu.org/onlinedocs/g77/Trouble.html
 Redirect permanent /onlinedocs/g77/ https://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/
 
-Redirect permanent /java/ https://gcc.gnu.org/
 Redirect permanent /java/gcj.html https://gcc.gnu.org/
 Redirect permanent /java/libgcj.html https://gcc.gnu.org/
 Redirect permanent /java/about.html https://gcc.gnu.org/about.html
 Redirect permanent /java/FAQ.html https://gcc.gnu.org/faq.html
 Redirect permanent /java/status.html https://gcc.gnu.org/
+Redirect permanent /java/ https://gcc.gnu.org/
 
 Redirect permanent /bugs.html https://gcc.gnu.org/bugs/
 Redirect permanent /c9xstatus.html https://gcc.gnu.org/c99status.html
[wwwdocs] Move java/gcj-announce.txt to news/gcj-announce.txt
Move java/gcj-announce.txt to news/gcj-announce.txt, and since this was
a significant announcement that is probably referenced elsewhere, add a
redirect as well.

(And fix an incomplete sentence in news/javaannounce.html while we are
already there.)

Applied.

Gerald

Index: news.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/news.html,v
retrieving revision 1.153
diff -u -r1.153 news.html
--- news.html	29 Dec 2016 20:57:31 -0000	1.153
+++ news.html	29 Dec 2016 22:34:21 -0000
@@ -1704,7 +1704,7 @@
 September 6, 1998
-Cygnus donates Java front end.
+Cygnus donates Java front end.
 September 3, 1998
Index: gcc-2.95/features.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-2.95/features.html,v
retrieving revision 1.40
diff -u -r1.40 features.html
--- gcc-2.95/features.html	28 Jun 2014 07:45:10 -0000	1.40
+++ gcc-2.95/features.html	29 Dec 2016 22:34:27 -0000
@@ -34,9 +34,9 @@
 Many C++ improvements.
 https://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/News.html";>Many Fortran improvements.
-Java
+Java
 front-end has been integrated.
-runtime library
+A runtime library
 is available separately.
 ISO C99 support
 Chill front-end and runtime has been
Index: java/gcj-announce.txt
===================================================================
RCS file: java/gcj-announce.txt
diff -N java/gcj-announce.txt
--- java/gcj-announce.txt	8 Feb 2001 17:34:35 -0000	1.3
+++ /dev/null	1 Jan 1970 00:00:00 -0000
@@ -1,27 +0,0 @@
-September 6, 1998
-
-We're happy to announce public availability of a new GCC based native
-code compiler for the Java(TM) language.
-
-The new GNU based compiler, GCJ, compiles both Java source files and
-bytecode compiled Java class files to native code for a wide range of
-target architectures.
-
-GCJ is a work in progress.  We're also working on the set of runtime
-libraries needed for executing GCJ compiled code, and extensions to
-GDB, the GNU debugger, for debugging GCJ compiled code.
-
-Full details are available on the GCJ web pages:
-
-  http://soruces.redhat.com/java/
-
-Anthony Green
-Cygnus Solutions
-
-
-Java and all Java-based marks are trademarks or registered trademarks
-of Sun Microsystems, Inc. in the United States and other
-countries. The Free Software Foundation and Cygnus Solutions are
-independent of Sun Microsystems, Inc.
-
-[URL of the web page updated - Jim Kingdon, July 2000]
Index: news/gcj-announce.txt
===================================================================
RCS file: news/gcj-announce.txt
diff -N news/gcj-announce.txt
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ news/gcj-announce.txt	29 Dec 2016 22:34:27 -0000
@@ -0,0 +1,27 @@
+September 6, 1998
+
+We're happy to announce public availability of a new GCC based native
+code compiler for the Java(TM) language.
+
+The new GNU based compiler, GCJ, compiles both Java source files and
+bytecode compiled Java class files to native code for a wide range of
+target architectures.
+
+GCJ is a work in progress.  We're also working on the set of runtime
+libraries needed for executing GCJ compiled code, and extensions to
+GDB, the GNU debugger, for debugging GCJ compiled code.
+
+Full details are available on the GCJ web pages:
+
+  http://sources.redhat.com/java/
+
+Anthony Green
+Cygnus Solutions
+
+
+Java and all Java-based marks are trademarks or registered trademarks
+of Sun Microsystems, Inc. in the United States and other
+countries. The Free Software Foundation and Cygnus Solutions are
+independent of Sun Microsystems, Inc.
+
+[URL of the web page updated - Jim Kingdon, July 2000]
Index: .htaccess
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/.htaccess,v
retrieving revision 1.40
diff -u -r1.40 .htaccess
--- .htaccess	29 Dec 2016 21:53:16 -0000	1.40
+++ .htaccess	29 Dec 2016 22:41:46 -0000
@@ -49,6 +49,7 @@
 Redirect permanent /onlinedocs/g77/ https://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/
 
 Redirect permanent /java/gcj.html https://gcc.gnu.org/
+Redirect permanent /java/gcj-announce.txt https://gcc.gnu.org/news/gcj-announce.txt
 Redirect permanent /java/libgcj.html https://gcc.gnu.org/
 Redirect permanent /java/about.html https://gcc.gnu.org/about.html
 Redirect permanent /java/FAQ.html https://gcc.gnu.org/faq.html
[PATCH], PR 71977/70568/78823: Improve PowerPC code that uses SFmode in unions
This is a fix for a regression (since GCC 4.9), a code improvement for
GLIBC, and a fix for potential bugs with the recent changes to allow
small integers (32-bit integer, SImode in particular) in floating point
and vector registers.

The core of the problem is that when an SFmode (32-bit binary floating
point) value is a scalar in the floating point and vector registers, it
is stored internally as a 64-bit binary floating point value. This means
that if you look at the value using the SUBREG mechanism you might see
the wrong value. Before the recent changes to add small integer support
went in, it was less of an issue, since the only integer type allowed in
floating point and vector registers was 64-bit integers (DImode).

The regression comes in how SFmode values are moved between general
purpose and floating point/vector registers. Up through the power7, the
way you moved SFmode values from one register set to another was store
and then load. Doing back to back store and load to the same location
could cause serious performance problems on recent power systems. On the
power7 and power8, we would put a special NOP that forces the two
instructions to be in different dispatch groups, which helps somewhat.

When the power8 (ISA 2.07) came out, it had direct move instructions and
instructions to convert between scalar double precision and single
precision. I added the appropriate secondary_reload support so that if
the register allocator wanted to move a SFmode value between register
banks, it would create a temporary and do the appropriate instructions
to move the value. This worked in the GCC 4.8 time frame. Some time in
the 4.9 time frame, this broke and the register allocator would more
often generate store and load instead of the direct move sequences.
However, simple test cases continued to use the direct move
instructions. In checking power9 (ISA 3.0) code, I found that it is more
likely to use the store/load sequence than direct move.
On power8 spec runs we have seen the effect of these store/load
sequences on some benchmarks from the next generation of the Spec suite.

The optimization that the GLIBC implementers have requested (PR 71977)
was to speed up code sequences that they use in implementing the single
precision math library. They often need to extract/modify bits in
floating point values (for example, setting exponents or mantissas,
etc.).

For example from e_powf.c, you see code like this after macro expansion:

	typedef union
	{
	  float value;
	  u_int32_t word;
	} ieee_float_shape_type;

	float t1;
	int32_t is;

	/* ... */

	do {
	  ieee_float_shape_type gf_u;
	  gf_u.value = (t1);
	  (is) = gf_u.word;
	} while (0);

	do {
	  ieee_float_shape_type sf_u;
	  sf_u.word = (is&0xfffff000);
	  (t1) = sf_u.value;
	} while (0);

Originally, I just wrote a peephole2 to catch the above code, and it
worked in small test cases on the power8. But it didn't work on larger
programs or on the power9. I also wanted to fix the performance issue
that we've seen.

I also became convinced that for GCC 7, it was a ticking time bomb where
eventually somebody would write code that intermixed SImode and SFmode,
and it would get the wrong value. The main part of the patch is to not
let the compiler generate:

	(set (reg:SF) (subreg:SF (reg:SI)))

or

	(set (reg:SI) (subreg:SI (reg:SF)))

Most of the register predicates eventually call gpc_reg_operand, and it
was simple to put the check in there, and in the other predicates that
do not call gpc_reg_operand.

I created new insns to do the move between formats that allocate the
temporary needed with match_scratch.

There were places that then needed to not have the check (the
movsi/movsf expanders themselves, and the insn splitters for format
conversion insns), and I added a predicate for that.

I have built the patches on a little endian power8, a big endian power8
(64-bit only), and a big endian power7 (both 32-bit and 64-bit). There
were no regression failures.
In addition, I built spec 2006, with the fixes and without, and did a
quick run comparing the results (1 run). I am re-running the spec
results, with the code merged to today's trunk, and with 3 runs to
isolate the occasional benchmark that goes off in the weeds.

Of the 29 benchmarks in Spec 2006 CPU, 6 benchmarks had changes in the
instructions generated (perlbench, gromacs, cactusADM, namd, povray,
wrf).

In the single run I did, there were no regressions, and 2 or 3
benchmarks improved:

	namd		6%
	tonto		3%
	libquantum	6%

However, single runs of libquantum have varied as much as 10%, so
without seeing more runs, I will skip it. Namd was one of the benchmarks
that saw changes in code generation, but tonto did not have changes of
code. I suspect having the sepa
[PATCH/AARCH64] Add -mcpu=thunderx2t99 support
Hi,
  This patch adds -mcpu=thunderx2t99. Cavium has acquired the Vulcan IP
from Broadcom. I am keeping the old -mcpu=vulcan for backwards
compatibility but renaming all of the structures to be based on the new
name of the chip. In the next few weeks, I will be auditing the current
tuning and posting some changes too.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Also tested -mcpu=native on a ThunderX2 CN99xx machine.

Thanks,
Andrew

ChangeLog:
* config/aarch64/aarch64-cores.def: Add thunderx2t99.  Change vulcan
to reference thunderx2t99 for the tuning structure.
* config/aarch64/aarch64-cost-tables.h (vulcan_extra_costs): Rename to ...
(thunderx2t99_extra_costs): This.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64.c (vulcan_addrcost_table): Rename to ...
(thunderx2t99_addrcost_table): This.
(vulcan_regmove_cost): Rename to ...
(thunderx2t99_regmove_cost): This.
(vulcan_vector_cost): Rename to ...
(thunderx2t99_vector_cost): This.
(vulcan_branch_cost): Rename to ...
(thunderx2t99_branch_cost): This.
(vulcan_tunings): Rename to ...
(thunderx2t99_tunings): This and s/vulcan/thunderx2t99 .
* doc/invoke.texi (AARCH64/mtune): Add thunderx2t99.

Index: gcc/config/aarch64/aarch64-cores.def
===================================================================
--- gcc/config/aarch64/aarch64-cores.def	(revision 243968)
+++ gcc/config/aarch64/aarch64-cores.def	(working copy)
@@ -74,7 +74,8 @@ AARCH64_CORE("xgene1", xgene1,x
 
 /* V8.1 Architecture Processors.  */
 
 /* Broadcom ('B') cores. */
-AARCH64_CORE("vulcan", vulcan, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, vulcan, 0x42, 0x516, -1)
+AARCH64_CORE("thunderx2t99", thunderx2t99, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
+AARCH64_CORE("vulcan", vulcan, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
 
 /* V8 big.LITTLE implementations.
    */
Index: gcc/config/aarch64/aarch64-cost-tables.h
===================================================================
--- gcc/config/aarch64/aarch64-cost-tables.h	(revision 243968)
+++ gcc/config/aarch64/aarch64-cost-tables.h	(working copy)
@@ -127,7 +127,7 @@ const struct cpu_cost_table thunderx_ext
   }
 };
 
-const struct cpu_cost_table vulcan_extra_costs =
+const struct cpu_cost_table thunderx2t99_extra_costs =
 {
   /* ALU */
   {
Index: gcc/config/aarch64/aarch64-tune.md
===================================================================
--- gcc/config/aarch64/aarch64-tune.md	(revision 243968)
+++ gcc/config/aarch64/aarch64-tune.md	(working copy)
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,falkor,qdf24xx,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
+	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,falkor,qdf24xx,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,thunderx2t99,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	(revision 243968)
+++ gcc/config/aarch64/aarch64.c	(working copy)
@@ -268,7 +268,7 @@ static const struct cpu_addrcost_table q
   0 /* imm_offset */
 };
 
-static const struct cpu_addrcost_table vulcan_addrcost_table =
+static const struct cpu_addrcost_table thunderx2t99_addrcost_table =
 {
     {
       0, /* hi */
@@ -351,7 +351,7 @@ static const struct cpu_regmove_cost qdf
   4 /* FP2FP */
 };
 
-static const struct cpu_regmove_cost vulcan_regmove_cost =
+static const struct cpu_regmove_cost thunderx2t99_regmove_cost =
 {
   1, /* GP2GP */
   /* Avoid the use of int<->fp moves for spilling. */
@@ -450,7 +450,7 @@ static const struct cpu_vector_cost xgen
 };
 
 /* Costs for vector insn classes for Vulcan.
    */
-static const struct cpu_vector_cost vulcan_vector_cost =
+static const struct cpu_vector_cost thunderx2t99_vector_cost =
 {
   6, /* scalar_stmt_cost */
   4, /* scalar_load_cost */
@@ -482,7 +482,7 @@ static const struct cpu_branch_cost cort
 };
 
 /* Branch costs for Vulcan. */
-static const struct cpu_branch_cost vulcan_branch_cost =
+static const struct cpu_branch_cost thunderx2t99_branch_cost =
 {
   1, /* Predictable. */
   3 /* Unpredictable. */
@@ -768,13 +768,13 @@ static const struct tune_params qdf24xx_
   (AARCH64_EXTRA_TUNE_NONE)/* tune_flags. */
 };
 
-static const struct tune_params vulcan_tunings =
+static const struct tune_params thunderx2t99_tunings =
 {
-  &vulcan_extra_costs,
-  &vulcan_addrcost_table,
-  &vulcan_regmove_
[Committed] Lower iterator count on gcc.dg/atomic/c11-atomic-exec-5.c for AARCH64
Since AARCH64 does not have native 128-bit atomics, this testcase can
take a long time with the default iteration count on a "fast" multi-core
machine. This is because the thread which is incrementing the counter is
not able to acquire the mutex before the other thread has already
acquired it. I have also put the analysis in PR 59305 in case someone
wants to improve libatomic.

So this patch decreases the count down to 100, which should be enough
time spent in the atomic loop to catch problems.

Committed after a bootstrap/test on aarch64-linux-gnu (both ThunderX -
CN88xx and ThunderX 2 CN99xx).

Thanks,
Andrew Pinski

ChangeLog:
* gcc.dg/atomic/c11-atomic-exec-5.c: Lower ITER_COUNT to 100 for AARCH64.

Index: testsuite/gcc.dg/atomic/c11-atomic-exec-5.c
===================================================================
--- testsuite/gcc.dg/atomic/c11-atomic-exec-5.c	(revision 243959)
+++ testsuite/gcc.dg/atomic/c11-atomic-exec-5.c	(working copy)
@@ -24,7 +24,7 @@
 		       | FE_OVERFLOW		\
 		       | FE_UNDERFLOW)
 
-#if defined __alpha__
+#if defined __alpha__ || defined __aarch64__
   #define ITER_COUNT 100
 #else
   #define ITER_COUNT 10000
Go patch committed: Fix length of roots array
This patch by Than McIntosh fixes the length of the type of the roots
array in Gogo::register_gc_vars.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

Index: gcc/go/gofrontend/MERGE
===================================================================
--- gcc/go/gofrontend/MERGE	(revision 243899)
+++ gcc/go/gofrontend/MERGE	(working copy)
@@ -1,4 +1,4 @@
-9a89f32811e6b3a29e22dda46e9c23811f562876
+d9be5f5d7907cbc169424fe2b8532cc3919cad5b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===================================================================
--- gcc/go/gofrontend/gogo.cc	(revision 243766)
+++ gcc/go/gofrontend/gogo.cc	(working copy)
@@ -740,9 +740,9 @@ Gogo::register_gc_vars(const std::vector
 				       "__size", uint_type);
 
   Location builtin_loc = Linemap::predeclared_location();
-  Expression* length = Expression::make_integer_ul(var_gc.size(), NULL,
-                                                   builtin_loc);
-
+  unsigned roots_len = var_gc.size() + this->gc_roots_.size() + 1;
+  Expression* length = Expression::make_integer_ul(roots_len, NULL,
+                                                   builtin_loc);
   Array_type* root_array_type = Type::make_array_type(root_type, length);
   Type* ptdt = Type::make_type_descriptor_ptr_type();
   Struct_type* root_list_type =