Re: [Patch, fortran] PR69834 - Collision in derived type hashes
Hi Dominique,

Thanks for the heads up!

I was going to review Andre's patch this morning, so I will clean my
tree, apply it, confirm that it is regression free and then will
generate a compatible version of my patch for PR69834. I strongly
suspect that the core of the patch is OK and that it is the clean-up
element that is failing to apply.

Best regards

Paul

On 22 October 2016 at 01:04, Dominique d'Humières wrote:
> Dear Paul,
>
> If I did not do any mistake, this patch conflicts seriously with Andre’s one
> at https://gcc.gnu.org/ml/fortran/2016-10/msg00141.html.
>
> Cheers,
>
> Dominique

--
The difference between genius and stupidity is; genius has its limits.

Albert Einstein
Re: [Patch, fortran] PR69834 - Collision in derived type hashes
I also see

FAIL: gfortran.dg/select_type_9.f03   -O   (test for errors, line 16)
FAIL: gfortran.dg/select_type_9.f03   -O   (test for excess errors)

The errors emitted by the test have changed from

/opt/gcc/_clean/gcc/testsuite/gfortran.dg/select_type_9.f03:16:11:

   class is (t)  ! { dg-error "Double CLASS IS block" }
           1
Error: Double CLASS IS block in SELECT TYPE statement at (1)

to

/opt/gcc/_clean/gcc/testsuite/gfortran.dg/select_type_9.f03:16:11:
/opt/gcc/_clean/gcc/testsuite/gfortran.dg/select_type_9.f03:14:11:

   class is (t)
           2
/opt/gcc/_clean/gcc/testsuite/gfortran.dg/select_type_9.f03:16:11:

   class is (t)  ! { dg-error "Double CLASS IS block" }
           1
Error: CASE label at (1) overlaps with CASE label at (2)

Dominique

> On 22 Oct 2016, at 09:11, Paul Richard Thomas wrote:
>
> Hi Dominique,
>
> Thanks for the heads up!
>
> I was going to review Andre's patch this morning, so I will clean my
> tree, apply it, confirm that it is regression free and then will
> generate a compatible version of my patch for PR69834. I strongly
> suspect that the core of the patch is OK and that it is the clean-up
> element that is failing to apply.
>
> Best regards
>
> Paul
>
>
> On 22 October 2016 at 01:04, Dominique d'Humières wrote:
>> Dear Paul,
>>
>> If I did not do any mistake, this patch conflicts seriously with Andre’s one
>> at https://gcc.gnu.org/ml/fortran/2016-10/msg00141.html.
>>
>> Cheers,
>>
>> Dominique
>>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein
Re: relax rule for flexible array members in 6.x (78039 - fails to compile glibc tests)
> However, it was pointed out to me that apparently there is a policy
> or convention of not backporting to release branches bug fixes that
> cause GCC to reject code that was previously accepted, even if the
> code is invalid.

It's more of a judgment call I'd say, if the accept-invalid leads to
wrong code in all cases, then you want to plug the hole; if it's benign
in almost all cases, then it's a different discussion.

--
Eric Botcazou
Re: [Fortran, Patch, PR{43366, 57117, 61337, 61376}, v1] Assign to polymorphic objects.
Dear Andre,

For the bulk of the patch, I have no comments. However, for the
testcase alloc_comp_class_5.f03, please eliminate the commented out
lines and the TODO, as discussed on #gfortran. Add them to the
testcase for PR78053, as we agreed.

In realloc_on_assign_27.f08, you have the following lines:
+  class(t), allocatable :: x
+  class(r), allocatable :: foo ! Need this declared of copy_R is not generated.
+  type(r) :: y = r (3, 42)
+
+  x = y

Surely, if you test for the existence of the vtable and create it if
necessary for the rhs type in gfc_trans_class_assign, that would
remove the need for 'foo'?

The patch applies cleanly and regtests OK. Apart from the above nits,
OK for trunk.

Best regards

Paul

On 22 October 2016 at 12:19, Andre Vehreschild wrote:
> Hi Paul,
>
> here is the patch for pr78053 so far. It is based on the one for pr43366.
> Compilation of the also attached testcase now works. Unfortunately produces
> the patch a lot of regressions because the length of a char array is not
> stored any longer in the vtab *and* in the _len component for deferred
> length char arrays. That still has to be fixed. Given that you have
> modified a lot on how SELECT TYPE works I fear, that when I change there,
> too, we get a lot of conflicts. So when you have a version of your patch
> for pr69834 I am happy to review it and continue work on pr78053
> afterwards. I think this makes the most sense to avoid duplicate or
> colliding work.
>
> Regards,
> Andre
>
Re: Use version namespace in normal mode
On 21/10/16 21:21 +0200, François Dumont wrote:
> Hi
>
>     I configured libstdc++ to use gnu-version-namespace and there are a
> number of failures, see below. But none of them related to this patch so
> is it ok to commit ?

Yes, OK to commit - it doesn't make the test results any worse than
they were before :-)

And I agree it is wrong for the containers to not be in the versioned
namespace, so this fixes it - thanks for cleaning it up.
Re: [rs6000] Add support for signed overflow arithmetic
Hi Eric,

Thanks for the patch.  Unfortunately there is a big problem with it :-(

On Sat, Oct 22, 2016 at 01:03:33AM +0200, Eric Botcazou wrote:
> this implements support for signed overflow arithmetic on PowerPC.  It's an
> implementation for Power ISA v2.0x, i.e. it doesn't take account the new OV32
> flag introduced in v3.0.  It doesn't implement unsigned overflow arithmetic
> because my understanding is that the generic support already generates
> optimal code in most cases on PowerPC for unsigned.
>
> It introduces a new MODE_CC mode (CCVmode) which represents the OV flag of
> the XER, and the overflow arithmetic instructions are paired with a mcrxr.

mcrxr does not exist anymore.  It is not implemented in any IBM
non-embedded CPU since POWER4.  PA6T does not have it either.  In some
versions of the 2.0x ISA it does exist, but only in the optional
"embedded" category.

Linux emulates mcrxr, but that is very slow.

You could use mfxer, but that is a slow instruction as well (it is
microcoded).

> The comparisons are written in terms of UNSPECs because I used that for
> Visium and SPARC, but I can rewrite them a la x86/ARM if requested.

That may work better.  It also may be better if you expose the OV bit
as a separate reg (just like we have CA), instead of putting two machine
insns in each template.

> @@ -21863,6 +21865,14 @@ print_operand (FILE *file, rtx x, int co
>        /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise
>           output_operand.  */
>
> +    case 'C':
> +      /* X is a CR register.  Print the index number of the CR.  */
> +      if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
> +        output_operand_lossage ("invalid %%E value");

"%%C value", no space after "!".

> +      else
> +        fputs (reg_names[REGNO (x)], file);
> +      return;

Why is this needed, do you have an assembler that wants register names
but not for mcrxr?


Segher
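For readers following the thread, here is a minimal sketch of the kind of source-level signed overflow check that overflow-arithmetic insn patterns are meant to serve. It assumes GCC's `__builtin_add_overflow` (available since GCC 5); the function name `add_overflows` is made up for illustration, and nothing here is taken from the patch under review.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when a + b does not fit in an int.
   In either case *sum receives the result wrapped to the width of int,
   which is what __builtin_add_overflow is documented to store.  */
bool add_overflows(int a, int b, int *sum)
{
    return __builtin_add_overflow(a, b, sum);
}
```

How cheaply the target can test the overflow condition after the add is exactly what the discussion about mcrxr, mfxer and a separate OV register is about.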
Re: [PATCH] Also fold bmi/bmi2/tbm bextr/bextri/bzhi/pext/pdep builtins
On Fri, Oct 21, 2016 at 5:37 PM, Uros Bizjak wrote:
> On Fri, Oct 21, 2016 at 5:26 PM, Jakub Jelinek wrote:
>
>> This patch on top of the just posted patch adds folding for a couple more
>> builtins (though, hundreds or thousands of other md builtins remain unfolded
>> even though they actually could be folded for e.g. const arguments).

Just a few words regarding other unfolded builtins. x86 intrinsics
(and consequently builtins) are considered as a convenient way to emit
assembly instructions. So, the same rules as when writing assembly,
although slightly relaxed, should apply there. IMO, compiler
optimizations with intrinsics should be an exception, not the rule. As
an example, __builtin_ctz, __builtin_clz and functionally similar
target-builtins are rather messy w.r.t. "undefinedness", so I think
this fact warrants some help from the compiler. But there is no need
to handle every single builtin - only a competent person that knows
the background of these intrinsics should use them.

Uros.
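To illustrate the "undefinedness" mentioned above: the result of `__builtin_ctz` is undefined for a zero argument, so callers that cannot rule out zero usually guard the call. A minimal sketch follows; the name `ctz_or_width` is hypothetical, and returning the bit width for zero is just one possible convention.

```c
#include <limits.h>

/* Hypothetical wrapper: count trailing zero bits of x, returning the
   bit width of unsigned int when x is zero instead of relying on the
   undefined result of __builtin_ctz (0).  */
unsigned ctz_or_width(unsigned x)
{
    if (x == 0)
        return (unsigned) (sizeof x * CHAR_BIT);
    return (unsigned) __builtin_ctz(x);
}
```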
Re: [PATCH] Three patches for std::experimental::filesystem
On 21/10/16 18:01 +0100, Jonathan Wakely wrote:
>    LWG2720 implement filesystem::perms::symlink_nofollow
>
>    * include/experimental/bits/fs_fwd.h (perms::resolve_symlinks):
>    Replace with symlink_nofollow (LWG 2720).
>    * src/filesystem/ops.cc (permissions(const path&, perms, error_code&)):
>    Handle symlink_nofollow.
>    * testsuite/experimental/filesystem/operations/create_symlink.cc: New
>    test.
>    * testsuite/experimental/filesystem/operations/permissions.cc: Test
>    overload taking error_code.

GNU libc doesn't implement AT_SYMLINK_NOFOLLOW so fchmodat always
returns EOPNOTSUPP even if it's used for non-symlinks. This patch
makes us ignore symlink_nofollow if the file isn't a symlink, so we
don't get an error unless we're actually trying to change a symlink's
permissions.

Tested x86_64-linux, committed to trunk.

commit 5002d3b814ff6f72ce4e110079f5f406d0ff9898
Author: Jonathan Wakely
Date:   Sat Oct 22 11:57:36 2016 +0100

    Ignore perms::symlink_nofollow on non-symlinks

    * src/filesystem/ops.cc (permissions(const path&, perms, error_code&)):
    Ignore symlink_nofollow flag if file is not a symlink.
    * testsuite/experimental/filesystem/operations/permissions.cc: Test
    symlink_nofollow on non-symlinks.

diff --git a/libstdc++-v3/src/filesystem/ops.cc b/libstdc++-v3/src/filesystem/ops.cc
index 68343a9..2286e22 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1097,7 +1097,8 @@ fs::permissions(const path& p, perms prms)
     _GLIBCXX_THROW_OR_ABORT(filesystem_error("cannot set permissions", p, ec));
 }
 
-void fs::permissions(const path& p, perms prms, error_code& ec) noexcept
+void
+fs::permissions(const path& p, perms prms, error_code& ec) noexcept
 {
   const bool add = is_set(prms, perms::add_perms);
   const bool remove = is_set(prms, perms::remove_perms);
@@ -1110,27 +,33 @@ void fs::permissions(const path& p, perms prms, error_code& ec) noexcept
 
   prms &= perms::mask;
 
-  if (add || remove)
+  file_status st;
+  if (add || remove || nofollow)
     {
-      auto st = nofollow ? symlink_status(p, ec) : status(p, ec);
+      st = nofollow ? symlink_status(p, ec) : status(p, ec);
       if (ec)
         return;
       auto curr = st.permissions();
       if (add)
         prms |= curr;
-      else
+      else if (remove)
         prms = curr & ~prms;
     }
 
+  int err = 0;
 #if _GLIBCXX_USE_FCHMODAT
-  const int flag = nofollow ? AT_SYMLINK_NOFOLLOW : 0;
+  const int flag = (nofollow && is_symlink(st)) ? AT_SYMLINK_NOFOLLOW : 0;
   if (::fchmodat(AT_FDCWD, p.c_str(), static_cast<mode_t>(prms), flag))
+    err = errno;
 #else
-  if (nofollow)
+  if (nofollow && is_symlink(st))
     ec = std::make_error_code(std::errc::operation_not_supported);
   else if (::chmod(p.c_str(), static_cast<mode_t>(prms)))
+    err = errno;
 #endif
-    ec.assign(errno, std::generic_category());
+
+  if (err)
+    ec.assign(err, std::generic_category());
   else
     ec.clear();
 }
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/permissions.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/permissions.cc
index 839cfef..61471a3 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/permissions.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/permissions.cc
@@ -121,6 +121,26 @@ test04()
   remove(p);
 }
 
+void
+test05()
+{
+  using perms = std::experimental::filesystem::perms;
+  std::error_code ec;
+
+  __gnu_test::scoped_file f;
+  auto p = perms::owner_write;
+
+  // symlink_nofollow should not give an error for non-symlinks
+  permissions(f.path, p|perms::symlink_nofollow, ec);
+  VERIFY( !ec );
+  auto st = status(f.path);
+  VERIFY( st.permissions() == p );
+  p |= perms::owner_read;
+  permissions(f.path, p|perms::symlink_nofollow, ec);
+  st = status(f.path);
+  VERIFY( st.permissions() == p );
+}
+
 int
 main()
 {
@@ -128,4 +148,5 @@ main()
   test02();
   test03();
   test04();
+  test05();
 }
Re: [PATCH] PR fortran/78033 -- This was a REAL pain
See comments in pr78033. What are the plans to handle pr54730? Dominique
[patch, fortran] Fix PR 78021
Hello world,

this rather self-explanatory patch fixes a problem where two function
invocations with 'c' and 'c   ' (the same character followed by trailing
blanks) as arguments were considered equal.

Regression-tested.  OK for trunk and 6 and 5 branches?

Regards

	Thomas

2016-10-22  Thomas Koenig

        PR fortran/78021
        * gfc_compare_functions:  Strings with different lengths in
        argument lists compare unequal.

2016-10-22  Thomas Koenig

        PR fortran/78021
        * gfortran.dg/string_length-3.f90:  New test.

! { dg-do run }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 78021 - calls to mylen were folded after shortening the
! argument list.

PROGRAM test_o_char
  implicit none
  integer :: n
  n = mylen('c') + mylen('c   ')
  if (n /= 5) call abort
CONTAINS

  FUNCTION mylen(c)
    CHARACTER(len=*),INTENT(in) :: c
    INTEGER :: mylen
    mylen=LEN(c)
  END FUNCTION mylen
END PROGRAM test_o_char
! { dg-final { scan-tree-dump-times "__var" 0 "original" } }

Index: dependency.c
===================================================================
--- dependency.c	(Revision 240928)
+++ dependency.c	(Arbeitskopie)
@@ -226,10 +226,27 @@ gfc_dep_compare_functions (gfc_expr *e1, gfc_expr
           if ((args1->expr == NULL) ^ (args2->expr == NULL))
             return -2;
 
-          if (args1->expr != NULL && args2->expr != NULL
-              && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
-            return -2;
+          if (args1->expr != NULL && args2->expr != NULL)
+            {
+              gfc_expr *e1, *e2;
+              e1 = args1->expr;
+              e2 = args2->expr;
 
+              if (gfc_dep_compare_expr (e1, e2) != 0)
+                return -2;
+
+              /* Special case: String arguments which compare equal can have
+                 different lengths, which makes them different in calls to
+                 procedures.  */
+
+              if (e1->expr_type == EXPR_CONSTANT
+                  && e1->ts.type == BT_CHARACTER
+                  && e2->expr_type == EXPR_CONSTANT
+                  && e2->ts.type == BT_CHARACTER
+                  && e1->value.character.length != e2->value.character.length)
+                return -2;
+            }
+
           args1 = args1->next;
           args2 = args2->next;
         }
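The distinction the patch draws can be modelled outside the compiler. The sketch below is not gfortran code: `fortran_compare` and `same_call_argument` are hypothetical names, and the first function only mimics Fortran's rule that the shorter operand of a character comparison is padded with blanks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of Fortran character comparison: the shorter
   operand is treated as if padded with blanks on the right.
   Returns -1, 0 or 1.  */
int fortran_compare(const char *a, size_t la, const char *b, size_t lb)
{
    size_t n = la > lb ? la : lb;
    for (size_t i = 0; i < n; i++) {
        unsigned char ca = i < la ? (unsigned char) a[i] : ' ';
        unsigned char cb = i < lb ? (unsigned char) b[i] : ' ';
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    return 0;
}

/* Two constant strings are interchangeable as actual arguments only if
   they compare equal AND have the same length, because the callee can
   observe the length through LEN().  */
bool same_call_argument(const char *a, size_t la, const char *b, size_t lb)
{
    return la == lb && fortran_compare(a, la, b, lb) == 0;
}
```

With these definitions 'c' and 'c   ' compare equal as values but are not the same call argument, which is why merging the two mylen calls in the test case gave the wrong sum.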
Re: [Fortran, Patch, PR{43366, 57117, 61337, 61376}, v1] Assign to polymorphic objects.
Hi Paul,

thanks for the review. Committed as r241439.

The first nit has gone to the patch for pr78053 as agreed upon. The second
nit:

> +  class(r), allocatable :: foo ! Need this declared of copy_R is not
> generated.

has magically disappeared. I assume that it was necessary on an intermediate
stage of the patch only. I now have stripped the above line from the commit
and everything works fine.

Thanks again for the review.

Regards,
	Andre

On Sat, 22 Oct 2016 12:41:19 +0200
Paul Richard Thomas wrote:

> Dear Andre,
>
> For the bulk of the patch, I have no comments. However, for the
> testcase alloc_comp_class_5.f03, please eliminate the commented out
> lines and the TODO, as discussed on #gfortran. Add them to the
> testcase for for PR78053, as we agreed.
>
> In realloc_on_assign_27.f08, you have the following lines:
> +  class(t), allocatable :: x
> +  class(r), allocatable :: foo ! Need this declared of copy_R is not
> generated.
> +  type(r) :: y = r (3, 42)
> +
> +  x = y
>
> Surely, if you test for the existence of the vtable and create it if
> necessary for the rhs type in gfc_trans_class_assign, that would
> remove the need for 'foo'?
>
> The patch applies cleanly and regtests OK. Apart from the above nits,
> OK for trunk.
>
> Best regards
>
> Paul
>
> On 22 October 2016 at 12:19, Andre Vehreschild wrote:
> > Hi Paul,
> >
> > here is the patch for pr78053 so far. It is based on the one for pr43366.
> > Compilation of the also attached testcase now works. Unfortunately produces
> > the patch a lot of regressions because the length of a char array is not
> > stored any longer in the vtab *and* in the _len component for deferred
> > length char arrays. That still has to be fixed. Given that you have
> > modified a lot on how SELECT TYPE works I fear, that when I change there,
> > too, we get a lot of conflicts. So when you have a version of your patch
> > for pr69834 I am happy to review it and continue work on pr78053
> > afterwards. I think this makes the most sense to avoid duplicate or
> > colliding work.
> >
> > Regards,
> > 	Andre
> >

--
Andre Vehreschild * Email: vehre ad gmx dot de
Re: [PATCH] PR fortran/78033 -- This was a REAL pain
On Sat, Oct 22, 2016 at 02:22:47PM +0200, Dominique d'Humières wrote:
> See comments in pr78033. What are the plans to handle pr54730?
>

Not sure what you mean.  I certainly will not remove
Mikael's checkpointing in the array constructor code,
leaving array_constructor_43.f90 broken.

--
Steve
Re: [PATCH] Also fold bmi/bmi2/tbm bextr/bextri/bzhi/pext/pdep builtins
On Sat, Oct 22, 2016 at 01:46:30PM +0200, Uros Bizjak wrote:
> On Fri, Oct 21, 2016 at 5:37 PM, Uros Bizjak wrote:
> > On Fri, Oct 21, 2016 at 5:26 PM, Jakub Jelinek wrote:
> >
> >> This patch on top of the just posted patch adds folding for a couple more
> >> builtins (though, hundreds or thousands of other md builtins remain
> >> unfolded even though they actually could be folded for e.g. const
> >> arguments).
>
> Just a few words regarding other unfolded builtins. x86 intrinsics
> (and consequently builtins) are considered as a convenient way to emit
> assembly instructions. So, the same rules as when writting assembly,
> although slightly relaxed, should apply there. IMO, compiler
> optimizations with intrinsics should be an exception, not the rule. As
> an example, __builtin_ctz, __builtin_clz and functionaly similar
> target-builtins are rather messy w.r.t to "undefinedness", so I think
> this fact warrants some help from the compiler. But there is no need
> to handle every single builtin - only a competent person that knows
> the background of these intrinsics should use them.

Generally constant folding what we can is a good thing, usually people will
not use the intrinsics when they are passing constants directly, but
constants could appear there through inlining and other optimizations.
If we do constant fold the x86 intrinsics, we allow further constant folding
and optimizations down the road.

For various x86 intrinsics we do some constant folding, but only late
(during RTL optimizations), and only if the insn patterns don't contain
UNSPECs.

Besides the BMI/BMI2/TBM/LZCNT intrinsics that are already folded or I've
posted patch for, intrinsics that IMHO would be nice to be folded are e.g.
__builtin_ia32_bsr*, __builtin_ia32_ro[rl]*, maybe
__builtin_ia32_{,r}sqrtps*, __builtin_ia32_rcpps, etc.
For __builtin_ia32_addps and the like the question is why we have those
builtins at all, it would be better to just use normal vector arithmetics.
__builtin_ia32_cmp*p[sd], __builtin_ia32_{min,max}[ps][sd] etc. are also
nicely constant foldable, etc.

	Jakub
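Constant folding a target builtin boils down to evaluating the instruction's semantics at compile time when all operands are known. As an illustration only, here is a plain C model of 32-bit BEXTR as documented for BMI1 (start position in control bits 7:0, field length in bits 15:8); the name `bextr32_model` is hypothetical and this is not the folding code from the posted patch.

```c
#include <stdint.h>

/* Hypothetical reference model of 32-bit BEXTR: extract LEN bits of
   SRC starting at bit START, where START = ctrl[7:0] and
   LEN = ctrl[15:8].  Bits beyond bit 31 of the source read as zero.  */
uint32_t bextr32_model(uint32_t src, uint32_t ctrl)
{
    uint32_t start = ctrl & 0xff;
    uint32_t len = (ctrl >> 8) & 0xff;

    if (start >= 32 || len == 0)
        return 0;

    uint32_t shifted = src >> start;
    if (len >= 32)
        return shifted;              /* nothing left to mask off */

    return shifted & ((UINT32_C(1) << len) - 1);
}
```

A folder that sees constant arguments could replace the builtin call with the value such a model computes, which in turn exposes further folding after inlining.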
backport fix for c/71115 to branches?
Bug 71115 - [5/6 Regression] Missing warning: excess elements in struct initializer, was fixed on trunk but the bug is still open since the patch hasn't been backported to the affected branches. Is it okay to go ahead and backport it to 6.x and 5.x? The original patch was posted here: https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01423.html Thanks Martin
[v3 PATCH] Cross-port the latest resolution of LWG2756 and some bug-fixes to experimental::optional.
Tested on Linux-x64. Ok for trunk and the gcc-6 branch? 2016-10-22 Ville Voutilainen Cross-port the latest resolution of LWG2756 and some bug-fixes to experimental::optional. PR libstc++/77288 PR libstdc++/77727 * include/experimental/optional (_Optional_base): Remove constructors that take a _Tp. (__is_optional_impl, __is_optional): Remove. (__converts_from_optional): New. (optional(_Up&&)): Fix constraints, call base with in_place. (optional(const optional<_Up>&)): Fix constraints, use emplace. (optional(optional<_Up>&&)): Likewise. (operator=(_Up&&)): Fix constraints. (operator=(const optional<_Up>&)): Likewise. (operator=(optional<_Up>&&)): Likewise. (emplace(_Args&&...)): Constrain. (emplace(initializer_list<_Up>, _Args&&...)): Likewise. * testsuite/experimental/optional/77288.cc: New. * testsuite/experimental/optional/assignment/5.cc: Adjust. * testsuite/experimental/optional/cons/77727.cc: New. * testsuite/experimental/optional/cons/value.cc: Adjust. diff --git a/libstdc++-v3/include/experimental/optional b/libstdc++-v3/include/experimental/optional index 7191eca..a631158 100644 --- a/libstdc++-v3/include/experimental/optional +++ b/libstdc++-v3/include/experimental/optional @@ -214,12 +214,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : _Optional_base{} { } // Constructors for engaged optionals. - constexpr _Optional_base(const _Tp& __t) - : _M_payload(__t), _M_engaged(true) { } - - constexpr _Optional_base(_Tp&& __t) - : _M_payload(std::move(__t)), _M_engaged(true) { } - template constexpr explicit _Optional_base(in_place_t, _Args&&... 
__args) : _M_payload(std::forward<_Args>(__args)...), _M_engaged(true) { } @@ -356,12 +350,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr _Optional_base(nullopt_t) noexcept : _Optional_base{} { } - constexpr _Optional_base(const _Tp& __t) - : _M_payload(__t), _M_engaged(true) { } - - constexpr _Optional_base(_Tp&& __t) - : _M_payload(std::move(__t)), _M_engaged(true) { } - template constexpr explicit _Optional_base(in_place_t, _Args&&... __args) : _M_payload(std::forward<_Args>(__args)...), _M_engaged(true) { } @@ -474,19 +462,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template class optional; - template -struct __is_optional_impl : false_type -{ }; - - template - struct __is_optional_impl> : true_type -{ }; - - template -struct __is_optional -: public __is_optional_impl>> -{ }; - + template +using __converts_from_optional = + __or_&>, + is_constructible<_Tp, optional<_Up>&>, + is_constructible<_Tp, const optional<_Up>&&>, + is_constructible<_Tp, optional<_Up>&&>, + is_convertible&, _Tp>, + is_convertible&, _Tp>, + is_convertible&&, _Tp>, + is_convertible&&, _Tp>>; + + template +using __assigns_from_optional = + __or_&>, + is_assignable<_Tp&, optional<_Up>&>, + is_assignable<_Tp&, const optional<_Up>&&>, + is_assignable<_Tp&, optional<_Up>&&>>; /** * @brief Class template for optional values. @@ -522,75 +514,75 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr optional() = default; // Converting constructors for engaged optionals. 
- template >, + __not_, decay_t<_Up>>>, is_constructible<_Tp, _Up&&>, is_convertible<_Up&&, _Tp> >::value, bool> = true> constexpr optional(_Up&& __t) -: _Base(_Tp(std::forward<_Up>(__t))) { } +: _Base(in_place, std::forward<_Up>(__t)) { } - template >, - is_constructible<_Tp, _Up&&>, - __not_> - >::value, bool> = false> + __not_, decay_t<_Up>>>, + is_constructible<_Tp, _Up&&>, + __not_> + >::value, bool> = false> explicit constexpr optional(_Up&& __t) -: _Base(_Tp(std::forward<_Up>(__t))) { } +: _Base(in_place, std::forward<_Up>(__t)) { } template >, - __not_&>>, - __not_&, _Tp>>, is_constructible<_Tp, const _Up&>, - is_convertible + is_convertible, + __not_<__converts_from_optional<_Tp, _Up>> >::value, bool> = true> constexpr optional(const optional<_Up>& __t) -: _Base(__t ? optional<_Tp>(*__t) : optional<_Tp>()) { } + { + if (__t) + emplace(*__t); + } template >, - __not_&>>, - __not_&, _Tp>>,
Re: [v3 PATCH] Cross-port the latest resolution of LWG2756 and some bug-fixes to experimental::optional.
On 22 October 2016 at 19:34, Ville Voutilainen wrote:
>     Cross-port the latest resolution of LWG2756 and some
>     bug-fixes to experimental::optional.
>     PR libstc++/77288
>     PR libstdc++/77727

And yes, I'll fix that first PR reference before committing the
ChangeLog changes. :)
Re: [PATCH] Also fold bmi/bmi2/tbm bextr/bextri/bzhi/pext/pdep builtins
On Sat, 22 Oct 2016, Jakub Jelinek wrote:

> On Sat, Oct 22, 2016 at 01:46:30PM +0200, Uros Bizjak wrote:
>> On Fri, Oct 21, 2016 at 5:37 PM, Uros Bizjak wrote:
>>> On Fri, Oct 21, 2016 at 5:26 PM, Jakub Jelinek wrote:
>>>
>>>> This patch on top of the just posted patch adds folding for a couple more
>>>> builtins (though, hundreds or thousands of other md builtins remain unfolded
>>>> even though they actually could be folded for e.g. const arguments).
>>
>> Just a few words regarding other unfolded builtins. x86 intrinsics
>> (and consequently builtins) are considered as a convenient way to emit
>> assembly instructions. So, the same rules as when writting assembly,
>> although slightly relaxed, should apply there. IMO, compiler
>> optimizations with intrinsics should be an exception, not the rule. As
>> an example, __builtin_ctz, __builtin_clz and functionaly similar
>> target-builtins are rather messy w.r.t to "undefinedness", so I think
>> this fact warrants some help from the compiler. But there is no need
>> to handle every single builtin - only a competent person that knows
>> the background of these intrinsics should use them.
>
> Generally constant folding what we can is a good thing, usually people will
> not use the intrinsics when they are passing constants directly, but
> constants could appear there through inlining and other optimizations.
> If we do constant fold the x86 intrinsics, we allow further constant folding
> and optimizations down the road.

+1

> For various x86 intrinsics we do some constant folding, but only late
> (during RTL optimizations), and only if the insn patterns don't contain
> UNSPECs.
>
> Besides the BMI/BMI2/TBM/LZCNT intrinsics that are already folded or I've
> posted patch for, intrinsics that IMHO would be nice to be folded are e.g.
> __builtin_ia32_bsr*, __builtin_ia32_ro[rl]*, maybe
> __builtin_ia32_{,r}sqrtps*, __builtin_ia32_rcpps, etc.
> For __builtin_ia32_addps and the like the question is why we have those
> builtins at all, it would be better to just use normal vector arithmetics.

Note that we do use operator+ directly in *intrin.h. We only keep the
builtin __builtin_ia32_addps because ada maintainers asked us to. We could
lower them to normal vector arithmetics early in gimple, but it doesn't
seem worth touching them since they are legacy.

> __builtin_ia32_cmp*p[sd], __builtin_ia32_{min,max}[ps][sd] etc. are also
> nicely constant foldable, etc.

I think _mm_cmpeq_pd could use the vector extensions instead of
__builtin_ia32_cmpeqpd if they were ported from C++ to C, same for a few
more. Some others which don't have such a close match in the vector
extensions could still be lowered (in gimple) to vector operations, which
would allow constant folding as well as other optimizations.

--
Marc Glisse
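As a small illustration of "normal vector arithmetics": with GCC's generic vector extensions, an element-wise float addition can be written with the ordinary `+` operator instead of a target builtin. The type name `v4sf_t` and the function `add4` are made up for this sketch; it assumes a GCC-compatible compiler that supports `__attribute__((vector_size))`.

```c
/* Four packed floats, declared with GCC's generic vector extension.  */
typedef float v4sf_t __attribute__((vector_size(16)));

/* Element-wise addition written as plain arithmetic; the compiler is
   free to fold it for constant operands or to emit a packed add.  */
v4sf_t add4(v4sf_t a, v4sf_t b)
{
    return a + b;
}
```

Because the operation is visible to the middle end as ordinary arithmetic, it takes part in the usual constant folding without any target-specific folding code.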
[testsuite] UnXFAIL gcc.dg/tree-ssa/pr71347.c on SPARC
Tested on SPARC/Solaris, applied on the mainline.


2016-10-22  Eric Botcazou

	* gcc.dg/tree-ssa/pr71347.c: Remove XFAIL on SPARC.

--
Eric Botcazou

Index: gcc.dg/tree-ssa/pr71347.c
===================================================================
--- gcc.dg/tree-ssa/pr71347.c	(revision 241437)
+++ gcc.dg/tree-ssa/pr71347.c	(working copy)
@@ -14,4 +14,4 @@ void foo (void)
 }
 
 /* Load of X[i - i] can be omitted by reusing X[i] in previous iteration.  */
-/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized" { xfail { ia64-*-* arm*-*-* m68k*-*-* sparc*-*-* } } } } */
+/* { dg-final { scan-tree-dump-not ".* = MEM.*;" "optimized" { xfail { ia64-*-* arm*-*-* m68k*-*-* } } } } */
Re: [PATCH] PR fortran/78033 -- This was a REAL pain
> On 22 Oct 2016, at 16:42, Steve Kargl wrote:
>
> On Sat, Oct 22, 2016 at 02:22:47PM +0200, Dominique d'Humières wrote:
>> See comments in pr78033. What are the plans to handle pr54730?
>>
>
> Not sure what you mean.  I certainly will not remove
> Mikael's checkpointing in the array constructor code,
> leaving array_constructor_43.f90 broken.

Well, I have applied the patch at
https://gcc.gnu.org/ml/fortran/2016-10/msg00156.html and noticed the
failures of the test array_constructor_42.f90 and those in pr54730.

So, more explicit questions:

Is the patch the right one?
If yes, what will be done for the failures?

TIA

Dominique

> --
> Steve
Re: [PATCH v3] gcc/config/tilegx/tilegx.c (tilegx_function_profiler): Save r10 to stack before call mcount
On 10/21/2016 6:24 PM, Chen Gang wrote:
> On 10/20/16 06:42, Jeff Law wrote:
>> On 6/4/16 21:25, cheng...@emindsoft.com.cn wrote:
>>> From: Chen Gang
>>>
>>> r10 may also be as parameter stack pointer for the nested function, so
>>> need save it before call mcount.
>>>
>>> Also clean up code: use '!' instead of "== 0" for checking
>>> static_chain_decl and compute_total_frame_size.
>>>
>>> 2016-06-04  Chen Gang
>>>
>>> gcc/
>>>     PR target/71331
>>>     * config/tilegx/tilegx.c (tilegx_function_profiler): Save r10
>>>     to stack before call mcount.
>>>     (tilegx_can_use_return_insn_p): Clean up code.
>>
>> So if I understand the tilegx architecture correctly, you're issuing the
>> r10 save & sp adjustment as a bundle, and the restore & sp adjustment as
>> a bundle.
>>
>> The problem is the semantics of bunding on the tilegx effectively mean
>> that all source operands are read in parallel, then all outputs occur in
>> parallel.
>>
>> So if we take the bundle
>>
>> {addi sp,sp,-8 ; st sp, r10}
>>
>> The address used for the st is the value of the stack pointer before the
>> addi instruction.
>>
>> Similarly for the restore r10 bundle.  The address used for the load is
>> sp before adjustment.  Given my understanding of the tilegx bundling
>> semantics, that seems wrong.
>>
>> Jeff
>
> The comments on 1st page of "TILE-Gx Instruction Set Architecture":
>
>   Individual instructions within a bundle must comply with certain
>   register semantics. Read-after-write (RAW) dependencies are enforced
>   between instruction bundles. There is no ordering within a bundle, and
>   the numbering of pipelines or instruction slots within a bundle is
>   only used for convenience and does not imply any ordering. Within an
>   instruction bundle, it is valid to encode an output operand that is
>   the same as an input operand. Because there is explicitly no implied
>   dependency within a bundle, the semantics for this specify that the
>   input operands for all instructions in a bundle are read before any of
>   the output operands are written.
>
>   Write-after-write (WAW) semantics between two bundles are defined as:
>   the latest write over-writes earlier writes.
>
>   Within a bundle, WAW dependencies are forbidden. If more than one
>   instruction in a bundle writes to the same output operand register,
>   unpredictable results for any destination operand within that bundle
>   can occur. Also, implementations are free to signal this case as an
>   illegal instruction. There is one exception to this rule: multiple
>   instructions within a bundle may legally target the zero register.
>   Lastly, some instructions, such as instructions that implicitly write
>   the link register, implicitly write registers. If an instruction
>   implicitly writes to a register that another instruction in the same
>   bundle writes to, unpredictable results can occur for any output
>   register used by that bundle and/or an illegal instruction interrupt
>   can occur.
>
> On Page 221, ld instruction is: ld Dest, Src
>
> On Page 251, st instruction is: st SrcA, SrcB
>
> So for me:
>
>   Bundle {addi sp, sp, 8; ld r10, sp} is OK, it is RAW.
>
>   Bundle {addi sp, sp, -8; st sp, r10} is OK, too, it is RAW (not WAW --
>   both SrcA and SrcB are input operands).
>
> Please help check, if need the related document, please let me know.

As you wrote, RAW applies "between instruction bundles".  In this case
you are looking at register usage within a single bundle, and as you
wrote, "the input operands for all instructions in a bundle are read
before any of the output operands are written."  So for your two bundles
quoted above, the "sp" input operand for both instructions will have the
same value, i.e. the load/store will have the pre-adjusted "sp" value.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com
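The "read all inputs, then write all outputs" rule can be made concrete with a toy model in C. Everything below is hypothetical scaffolding (a two-register machine with a small word-addressed memory); it only mimics the semantics quoted from the ISA manual for the bundle {addi sp, sp, -8 ; st sp, r10} and is not taken from the tilegx back end.

```c
#include <stdint.h>

/* Toy machine: two registers and 64 eight-byte memory words.  */
enum { REG_SP = 0, REG_R10 = 1, NUM_REGS = 2, MEM_WORDS = 64 };

typedef struct {
    int64_t reg[NUM_REGS];
    int64_t mem[MEM_WORDS];
} toy_machine;

/* Models the bundle { addi sp, sp, -8 ; st sp, r10 }.
   Phase 1 reads every input operand of both instructions; phase 2
   performs every write.  The store therefore uses the value sp had
   BEFORE the addi, not the decremented one.  */
void bundle_addi_st(toy_machine *m)
{
    /* Phase 1: read inputs.  */
    int64_t sp_in = m->reg[REG_SP];
    int64_t r10_in = m->reg[REG_R10];

    /* Phase 2: write outputs.  */
    m->mem[sp_in / 8] = r10_in;      /* st sp, r10 with pre-adjusted sp */
    m->reg[REG_SP] = sp_in - 8;      /* addi sp, sp, -8 */
}
```

In this model the saved value lands at the old stack pointer address rather than in the newly allocated slot, which is the behaviour Jeff and Chris describe as the problem with the bundled form.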