RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi, On Mon, 16 Mar 2015 20:31:53, Hans-Peter Nilsson wrote: > > On Mon, 16 Mar 2015, Eric Botcazou wrote: >>> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit >>> architecture. I doubt that the strict alignment code makes any sense for >>> modesize> BIGGEST_ALIGNMENT. >> >> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of >> BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK. > > The context of the above statement is somewhat ambiguous: if the > statement is taken to include "no matter what is specified in > the program, including use of __attribute__ ((__aligned__ ...))" > then I have to object. (I guess Eric, you didn't actually mean > that, though.) > > The definition of BIGGEST_ALIGNMENT is (tm.texi.in): > "Biggest alignment that any data type can require on this > machine, in bits. Note that this is not the biggest alignment > that is supported, just the biggest alignment that, when > violated, may cause a fault." > > So, IMNSHO we'd better *support* a *larger* alignment, as in "if > the code specified it by explicit means" at least up to > MAX_OFILE_ALIGNMENT. But, "perfectly OK" when the code didn't > explicitly say anything else. > Absolutely. The idea if this patch, is to *require* the use of __attribute__((__aligned__(x))) for all structures that need the strict volatile bitfields semantics. At least if the ABI has BIGGEST_ALIGNMENT
Re: [PATCH][expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost
On 17/03/15 19:11, Jeff Law wrote: On 03/16/2015 04:12 AM, Kyrill Tkachov wrote: Hi all, Eyeballing the mult_by_coeff_cost function I think it has a typo/bug. It's supposed to return the cost of multiplying by a constant 'coeff'. It calculates that by taking the cost of a MULT rtx by that constant and comparing it to the cost of synthesizing that multiplication, and returning the cheapest. However, in the MULT rtx cost calculations it creates a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as I would expect. This patches fixes that in the obvious way. Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. I'm guessing this is stage 1 material at this point? Thanks, Kyrill 2015-03-13 Kyrylo Tkachov * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost calculation rather than fake_reg. I'd think stage1, unless you can point to a bug, particularly a regression. No regression that I know of. I'll queue it up for stage 1 if it's ok code-wise. Thanks, Kyrill Jeff
Re: [PATCH][expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost
On Wed, Mar 18, 2015 at 5:06 PM, Kyrill Tkachov wrote: > > On 17/03/15 19:11, Jeff Law wrote: >> >> On 03/16/2015 04:12 AM, Kyrill Tkachov wrote: >>> >>> Hi all, >>> >>> Eyeballing the mult_by_coeff_cost function I think it has a typo/bug. >>> It's supposed to return the cost of multiplying by a constant 'coeff'. >>> It calculates that by taking the cost of a MULT rtx by that constant >>> and comparing it to the cost of synthesizing that multiplication, and >>> returning >>> the cheapest. However, in the MULT rtx cost calculations it creates >>> a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as >>> I would >>> expect. This patches fixes that in the obvious way. >>> >>> Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. >>> I'm guessing this is stage 1 material at this point? >>> >>> Thanks, >>> Kyrill >>> >>> 2015-03-13 Kyrylo Tkachov >>> >>> * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost >>> calculation rather than fake_reg. >> >> I'd think stage1, unless you can point to a bug, particularly a >> regression. > > > No regression that I know of. I'll queue it up for stage 1 if it's ok > code-wise. This function just estimate the max_cost roughly, since it is only used by Tree passes. It shouldn't have much impact on generated code. Maybe some targets doesn't have proper cost function for reg * const rtl expression, also most of calls are in IVOPT, so it would be better if you run some benchmark to make sure there is no surprise. Thanks, bin > > Thanks, > Kyrill > >> Jeff >> > >
Re: [PATCH][expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost
On 18/03/15 09:36, Bin.Cheng wrote: On Wed, Mar 18, 2015 at 5:06 PM, Kyrill Tkachov wrote: On 17/03/15 19:11, Jeff Law wrote: On 03/16/2015 04:12 AM, Kyrill Tkachov wrote: Hi all, Eyeballing the mult_by_coeff_cost function I think it has a typo/bug. It's supposed to return the cost of multiplying by a constant 'coeff'. It calculates that by taking the cost of a MULT rtx by that constant and comparing it to the cost of synthesizing that multiplication, and returning the cheapest. However, in the MULT rtx cost calculations it creates a MULT rtx of two REGs rather than the a REG and the GEN_INT of coeff as I would expect. This patches fixes that in the obvious way. Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. I'm guessing this is stage 1 material at this point? Thanks, Kyrill 2015-03-13 Kyrylo Tkachov * expmed.c (mult_by_coeff_cost): Pass CONT_INT rtx to MULT cost calculation rather than fake_reg. I'd think stage1, unless you can point to a bug, particularly a regression. No regression that I know of. I'll queue it up for stage 1 if it's ok code-wise. This function just estimate the max_cost roughly, since it is only used by Tree passes. It shouldn't have much impact on generated code. Maybe some targets doesn't have proper cost function for reg * const rtl expression, also most of calls are in IVOPT, so it would be better if you run some benchmark to make sure there is no surprise. Hi Bin, Thanks for the guidance. I'll have a look. Kyrill Thanks, bin Thanks, Kyrill Jeff
Re: [PATCH] Add pre-reload splitter for low part SI/DImode extraction out of vector regs (PR target/65078)
On Tue, Mar 17, 2015 at 7:15 PM, Jakub Jelinek wrote: > Hi! > > This patch fixes a regression where since the removal of specialized > builtin from _mm_storel_epi64 we force the extraction of DImode (or SImode) > low value out of 16/32/64 byte vector registers into memory. > As the vector extraction is from a vector register with a different > element mode, the expander doesn't know it might be beneficial to subreg it > to a vector mode with the same size, but different element mode and do > vector extraction out of that. This patch adds a pre-reload splitter that > will turn it into such a vector extraction. At least for the -m32 > DImode extraction directly into memory, I think teaching RA to do that would > be much harder. Agreed. > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2015-03-17 Jakub Jelinek > > PR target/65078 > * config/i386/sse.md (movsi/movdi -> vec_extract_*_0 splitter): New. > > * gcc.target/i386/pr65078-1.c: New test. > * gcc.target/i386/pr65078-2.c: New test. > * gcc.target/i386/pr65078-3.c: New test. > * gcc.target/i386/pr65078-4.c: New test. > * gcc.target/i386/pr65078-5.c: New test. > * gcc.target/i386/pr65078-6.c: New test. OK for mainline. Thanks, Uros.
Re: [PATCH] Document -m{arch,tune}=knl (PR target/65222)
On Tue, Mar 10, 2015 at 8:19 PM, Jakub Jelinek wrote: > This patch documents knl. The PR is asking for also documenting > slm, but I believe those were intentionally removed from the documentation > as deprecated aliases, at least that is my reading of the > 2013-12-23 H.J. Lu > Tocar Ilya > ... > * doc/invoke.texi: Replace corei7, corei7-avx, core-avx-i, > core-avx2, atom, slm with nehalem, sandybridge, ivybridge, > haswell, bonnel, silvermont. Add westmere. > change, so I think documenting knl is all we want right now. > Or should we use knightslanding and make knl an undocumented alias? H.J., any preferences here? I think that knightslanding is just too long (and two words), so I'd prefer knl. > 2015-03-10 Jakub Jelinek > > PR target/65222 > * doc/invoke.texi: Add knl as x86 -march=/-mtune= CPU type. Let's wait for HJ, the patch is otherwise OK and safe for mainline. Thanks, Uros.
Re: [PATCH] PR target/65248: Copy relocation against protected symbol doesn't work
On Fri, Mar 6, 2015 at 2:42 PM, H.J. Lu wrote: > On Thu, Mar 05, 2015 at 05:31:51PM -0800, H.J. Lu wrote: >> Protected data symbol means that it can't be pre-emptied. It doesn't mean >> its address won't be external. This is true for pointer to protected >> function. With copy relocation, address of protected data defined in the >> shared library may also be external. We only know that for sure at >> run-time. TARGET_BINDS_LOCAL_P should return false on protected data >> symbol. OK for trunk? >> >> Thanks. >> >> H.J. >> --- >> PR target/65248 >> * output.h (default_binds_local_p_2): New. >> * varasm.c (default_binds_local_p_2): Renamed to ... >> (default_binds_local_p_3): This. Don't return true on protected >> data symbol if protected data may be external. >> (default_binds_local_p): Use default_binds_local_p_3. >> (default_binds_local_p_1): Likewise. >> (default_binds_local_p_2): New. >> * config/i386/i386.c (TARGET_BINDS_LOCAL_P): Replace >> darwin_binds_local_p with default_binds_local_p_2. > > Here is the updated patch with testcases. Tested on Linux/x86. OK > for trunk? > > Thanks. > > > H.J. > > gcc/ > > PR target/65248 > * output.h (default_binds_local_p_2): New. > * varasm.c (default_binds_local_p_2): Renamed to ... > (default_binds_local_p_3): This. Don't return true on protected > data symbol if protected data may be external. > (default_binds_local_p): Use default_binds_local_p_3. > (default_binds_local_p_1): Likewise. > (default_binds_local_p_2): New. > * config/i386/i386.c (TARGET_BINDS_LOCAL_P): Replace > darwin_binds_local_p with default_binds_local_p_2. > > gcc/testsuite/ > > PR target/65248 > * gcc.target/i386/pr65248-1.c: New file. > * gcc.target/i386/pr65248-2.c: Likewise. > * gcc.target/i386/pr65248-3.c: Likewise. > * gcc.target/i386/pr65248-4.c: Likewise. This patch needs global reviewer approval (I have added Jakub to CC) and Darwin maintainer approval. Uros.
Re: C++ PATCH for c++/65398 (valid constexpr rejected)
On Fri, Mar 13, 2015 at 3:40 PM, Marek Polacek wrote: > We started to reject this (IMHO valid) testcase with r214941 that did away > with > try_move_mult_to_index -- meaning that we are no longer able to fold *(&s[0] > + 1) > into s[1], while we are able to fold *(s + 1) into s[1]. > > I suppose cxx_fold_indirect_ref ought to be able to handle both cases, so I > added > some code to that effect, it should handle now at least the simple cases... > Or should that be handled in the middle end? It's "correct" for constexpr folding but not correct to hand s[1] down to the middle-end IL (both cases). Well, in the particular case with in-array-bound constant and a non-pointer base it's good enough at least. Richard. > I have also tried: > > constexpr char s[] = "abc"; > constexpr int i = 0; > constexpr char c = *(&s[0] + i); > > and that is accepted even without this patch. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > 2015-03-13 Marek Polacek > > PR c++/65398 > * constexpr.c (cxx_fold_indirect_ref): Transform *(&A[i] p+ j) into > A[i + j]. > > * g++.dg/cpp0x/pr65398.C: New test. > > diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c > index 1b5f50c..18f4d8c 100644 > --- gcc/cp/constexpr.c > +++ gcc/cp/constexpr.c > @@ -2427,6 +2427,17 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree > op0, bool *empty_base) > break; > } > } > + /* *(&A[i] p+ j) => A[i + j] */ > + else if (TREE_CODE (op00) == ARRAY_REF > + && TREE_CODE (TREE_OPERAND (op00, 1)) == INTEGER_CST > + && TREE_CODE (op01) == INTEGER_CST) > + { > + tree t = fold_convert_loc (loc, TREE_TYPE (op01), > +TREE_OPERAND (op00, 1)); > + t = size_binop_loc (loc, PLUS_EXPR, op01, t); > + return build4_loc (loc, ARRAY_REF, type, TREE_OPERAND (op00, 0), > +t, NULL_TREE, NULL_TREE); > + } > } > } >/* *(foo *)fooarrptr => (*fooarrptr)[0] */ > diff --git gcc/testsuite/g++.dg/cpp0x/pr65398.C > gcc/testsuite/g++.dg/cpp0x/pr65398.C > index e69de29..dc22d27 100644 > --- gcc/testsuite/g++.dg/cpp0x/pr65398.C > +++ gcc/testsuite/g++.dg/cpp0x/pr65398.C > @@ -0,0 +1,19 @@ > +// PR c++/65398 > +// { dg-do compile { target c++11 } } > + > +#define SA(X) static_assert((X),#X) > + > +constexpr char s[] = "abc"; > +constexpr char c1 = *(&s[0] + 0); > +constexpr char c2 = *(&s[0] + 1); > +constexpr char c3 = *(&s[1] + 0); > +constexpr char c4 = *(&s[1] + 1); > +constexpr char c5 = *(&s[2] + 0); > +constexpr char c6 = *(&s[0] + 2); > + > +SA (c1 == 'a'); > +SA (c2 == 'b'); > +SA (c3 == 'b'); > +SA (c4 == 'c'); > +SA (c5 == 'c'); > +SA (c6 == 'c'); > > Marek
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries wrote: > On 13-03-15 13:36, Richard Biener wrote: >> >> On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek wrote: >>> >>> On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) >>> >>> >>> If it runs with cfun == NULL, then supposedly the gates that are >>> dependent >>> on current function should for -fdump-passes purposes also return true >>> if cfun == NULL (well, of course do all the unconditional checks). >>> Though of course, with optimize/target attributes this is harder, as >>> different functions can use different options. >> >> >> Yes, one reason why I think -fdump-passes is just broken >> implementation-wise. >> > > Atm fdump-passes doesn't run with cfun == NULL. > > From pass_manager::dump_passes: > ... > FOR_EACH_FUNCTION (n) > if (DECL_STRUCT_FUNCTION (n->decl)) > { > node = n; > break; > } > > if (!node) > return; > > push_cfun (DECL_STRUCT_FUNCTION (node->decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Richard. > ... > > This was discussed here: > https://gcc.gnu.org/ml/gcc-patches/2011-06/msg00856.html > > Thanks, > - Tom
[PATCH, ARM, PR64208] LRA ICE Fix
Hi, This is a fix for PR64208 where LRA loops when dealing with iwmmxt_arm_movdi insn. As explain in the PR, the issue was introduced on trunk and 4.9 branch by fix of PR rtl-optimization/60969 and then workaround by r211798 (-fuse-caller-save enable for ARM). The changes in IRA cost made by PR60969, changed the register class of this insn output from GENERAL_REGS to IWMMXT_REGS, and the redundancies in the insn pattern alternatives description force LRA to reload the pseudo, which generates the same iwmmxt_arm_movdi insn, which can't be resolved, and so on ... Removing the redundancies fixes the issue, as LRA find that alternative 8 (Uy => y) matches. This issue is present in 4.9 branch, but latent on trunk (the clobbering of IP and CC information added during -fuse-caller-save patch changed the register allocation). Cross compiled and regression tested on ARM targets (but not on an IWMMXT one), is it ok for trunk and 4.9 branch ? Rq: I think that adding IP and CC clobbers to CALL_INSN_FUNCTION_USAGE, as specified by AAPCS, in 4.9 branch is something we need too, I've a patch for that if you agree on that. Thanks, Yvan 2105-03-17 Yvan Roux PR target/64208 * config/arm/iwmmxt.md ("*iwmmxt_arm_movdi"): Cleanup redundant alternatives. diff --git a/gcc/config/arm/iwmmxt.md b/gcc/config/arm/iwmmxt.md index fda3c2c..d1a60ff 100644 --- a/gcc/config/arm/iwmmxt.md +++ b/gcc/config/arm/iwmmxt.md @@ -107,8 +107,8 @@ ) (define_insn "*iwmmxt_arm_movdi" - [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, r, r, m,y,y,yr,y,yrUy,*w, r,*w,*w, *Uv") -(match_operand:DI 1 "di_operand" "rDa,Db,Dc,mi,r,y,yr,y,yrUy,y, r,*w,*w,*Uvi,*w"))] + [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, r, r, m,y,y,r, y,Uy,*w, r,*w,*w, *Uv") +(match_operand:DI 1 "di_operand" "rDa,Db,Dc,mi,r,y,r,y,Uy,y, r,*w,*w,*Uvi,*w"))] "TARGET_REALLY_IWMMXT && ( register_operand (operands[0], DImode) || register_operand (operands[1], DImode))"
Re: [PATCH, ARM, PR64208] LRA ICE Fix
Hi Yvan, On 18/03/15 10:19, Yvan Roux wrote: Hi, This is a fix for PR64208 where LRA loops when dealing with iwmmxt_arm_movdi insn. As explain in the PR, the issue was introduced on trunk and 4.9 branch by fix of PR rtl-optimization/60969 and then workaround by r211798 (-fuse-caller-save enable for ARM). The changes in IRA cost made by PR60969, changed the register class of this insn output from GENERAL_REGS to IWMMXT_REGS, and the redundancies in the insn pattern alternatives description force LRA to reload the pseudo, which generates the same iwmmxt_arm_movdi insn, which can't be resolved, and so on ... Removing the redundancies fixes the issue, as LRA find that alternative 8 (Uy => y) matches. This issue is present in 4.9 branch, but latent on trunk (the clobbering of IP and CC information added during -fuse-caller-save patch changed the register allocation). Cross compiled and regression tested on ARM targets (but not on an IWMMXT one), is it ok for trunk and 4.9 branch ? Rq: I think that adding IP and CC clobbers to CALL_INSN_FUNCTION_USAGE, as specified by AAPCS, in 4.9 branch is something we need too, I've a patch for that if you agree on that. Thanks, Yvan 2105-03-17 Yvan Roux PR target/64208 * config/arm/iwmmxt.md ("*iwmmxt_arm_movdi"): Cleanup redundant alternatives. As a general note, I think this patch should include the testcase from the PR. I can see that it's fairly self-contained. Cheers, Kyrill
Re: [patch] libstdc++/64847 add autoconf checks for pthread_rwlock_t
On 13/03/15 13:46 +, Jonathan Wakely wrote: On 12/03/15 18:49 +, Jonathan Wakely wrote: I assumed that Pthreads was enough to ensure pthread_rwlock_t but https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64847 shows that isn't true for HPUX (seems it was optional prior to POSIX 1003.1-2001). This adds an autoconf check to decide whether to use pthread_rwlock_t or the fallback implementation in terms of std::condition_variable and std::mutex. Actually the autoconf changes aren't needed, as PR 64847 is already fixed. Maybe it's better to have an explicit check for pthread_rwlock_t anyway though ... I'll have a think. Here's what I've committed, it has Torvald's fixes for retrying on EAGAIN, lengthy comments explaining why we retry, and also the autoconf macros (as having a _GLIBCXX_USE_PTHREAD_RWLOCK_T macro that can be explicitly overriden in a per-target c++config.h is probably useful). Tested x86_64-linux, committed to trunk. commit 9ffe38de54cc4f07c818a6d9119ae9ef6d8bdbda Author: Jonathan Wakely Date: Fri Mar 13 13:31:36 2015 + 2015-03-18 Jonathan Wakely Torvald Riegel * acinclude.m4 (GLIBCXX_CHECK_GTHREADS): Check for pthread_rwlock_t. * config.h.in: Regenerate. * configure: Regenerate. * include/std/shared_mutex: Check _GLIBCXX_USE_PTHREAD_RWLOCK_T. (shared_timed_mutex::_M_rwlock): Use PTHREAD_RWLOCK_INITIALIZER. (shared_timed_mutex::lock_shared()): Retry on EAGAIN. (shared_timed_mutex::try_lock_shared_until()): Retry on EAGAIN and EDEADLK. diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index 0b8c0f0..74f5a65 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -3563,6 +3563,13 @@ AC_DEFUN([GLIBCXX_CHECK_GTHREADS], [ if test x"$ac_has_gthreads" = x"yes"; then AC_DEFINE(_GLIBCXX_HAS_GTHREADS, 1, [Define if gthreads library is available.]) + +# Also check for pthread_rwlock_t for std::shared_timed_mutex in C++14 +AC_CHECK_TYPE([pthread_rwlock_t], +[AC_DEFINE([_GLIBCXX_USE_PTHREAD_RWLOCK_T], 1, +[Define if POSIX read/write locks are available in .])], +[], +[#include "gthr.h"]) fi CXXFLAGS="$ac_save_CXXFLAGS" diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex index 5dcc295..ab1b45b 100644 --- a/libstdc++-v3/include/std/shared_mutex +++ b/libstdc++-v3/include/std/shared_mutex @@ -57,10 +57,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// shared_timed_mutex class shared_timed_mutex { -#if defined(__GTHREADS_CXX0X) +#ifdef _GLIBCXX_USE_PTHREAD_RWLOCK_T typedef chrono::system_clock __clock_t; -pthread_rwlock_t _M_rwlock; +#ifdef PTHREAD_RWLOCK_INITIALIZER +pthread_rwlock_t _M_rwlock = PTHREAD_RWLOCK_INITIALIZER; + + public: +shared_timed_mutex() = default; +~shared_timed_mutex() = default; +#else +pthread_rwlock_t _M_rwlock; public: shared_timed_mutex() @@ -82,6 +89,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Errors not handled: EBUSY, EINVAL _GLIBCXX_DEBUG_ASSERT(__ret == 0); } +#endif shared_timed_mutex(const shared_timed_mutex&) = delete; shared_timed_mutex& operator=(const shared_timed_mutex&) = delete; @@ -165,12 +173,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION void lock_shared() { - int __ret = pthread_rwlock_rdlock(&_M_rwlock); + int __ret; + // We retry if we exceeded the maximum number of read locks supported by + // the POSIX implementation; this can result in busy-waiting, but this + // is okay based on the current specification of forward progress + // guarantees by the standard. + do + __ret = pthread_rwlock_rdlock(&_M_rwlock); + while (__ret == EAGAIN); if (__ret == EDEADLK) __throw_system_error(int(errc::resource_deadlock_would_occur)); - if (__ret == EAGAIN) - // Maximum number of read locks has been exceeded. - __throw_system_error(int(errc::device_or_resource_busy)); // Errors not handled: EINVAL _GLIBCXX_DEBUG_ASSERT(__ret == 0); } @@ -210,11 +222,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION static_cast(__ns.count()) }; - int __ret = pthread_rwlock_timedrdlock(&_M_rwlock, &__ts); - // If the maximum number of read locks has been exceeded, or we would - // deadlock, we just fail to acquire the lock. Unlike for lock(), - // we are not allowed to throw an exception. - if (__ret == ETIMEDOUT || __ret == EAGAIN || __ret == EDEADLK) + int __ret; + // Unlike for lock(), we are not allowed to throw an exception so if + // the maximum number of read locks has been exceeded, or we would + // deadlock, we just try to acquire the lock again (and will time out + // eventually). + // In cases where we would exceed the maximum number of read locks + // throughout the whole time until the timeout, we will fail to + // acquire the lock even if it would be logically free; however, this + // is allowed by the
Re: [PATCH, stage1] Make parloops gate more strict
On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n->decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node->decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Thanks, - Tom Fix -fdump-passes --- gcc/bb-reorder.c| 5 +++-- gcc/cprop.c | 8 +--- gcc/except.c| 10 +- gcc/gcse.c | 28 gcc/loop-init.c | 11 +++ gcc/omp-low.c | 3 ++- gcc/passes.c| 16 +--- gcc/store-motion.c | 10 ++ gcc/tree-chkp-opt.c | 14 -- gcc/tree-chkp.c | 11 ++- gcc/tree-complex.c | 3 ++- gcc/tree-eh.c | 10 -- gcc/tree-if-conv.c | 6 +- gcc/tree-into-ssa.c | 3 ++- gcc/tree-ssa-loop.c | 16 +--- gcc/tree-ssa.c | 3 ++- gcc/tree-stdarg.c | 3 ++- gcc/tree-vect-generic.c | 3 ++- 18 files changed, 95 insertions(+), 68 deletions(-) diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c index c2a3be3..b8f2a4b 100644 --- a/gcc/bb-reorder.c +++ b/gcc/bb-reorder.c @@ -2709,8 +2709,9 @@ pass_partition_blocks::gate (function *fun) /* See gate_handle_reorder_blocks. We should not partition if we are going to omit the reordering. */ && optimize_function_for_speed_p (fun) - && !DECL_COMDAT_GROUP (current_function_decl) - && !user_defined_section_attribute); + && !user_defined_section_attribute + && (fun == NULL + || !DECL_COMDAT_GROUP (current_function_decl))); } unsigned diff --git a/gcc/cprop.c b/gcc/cprop.c index c9fb2fc..bbea008 100644 --- a/gcc/cprop.c +++ b/gcc/cprop.c @@ -1961,9 +1961,11 @@ public: opt_pass * clone () { return new pass_rtl_cprop (m_ctxt); } virtual bool gate (function *fun) { - return optimize > 0 && flag_gcse - && !fun->calls_setjmp - && dbg_cnt (cprop); + return (optimize > 0 + && flag_gcse + && (fun == NULL + || (!fun->calls_setjmp + && dbg_cnt (cprop; } virtual unsigned int execute (function *) { return execute_rtl_cprop (); } diff --git a/gcc/except.c b/gcc/except.c index 833ec21..f81ade6 100644 --- a/gcc/except.c +++ b/gcc/except.c @@ -2677,14 +2677,14 @@ public: }; // class pass_convert_to_eh_region_ranges bool -pass_convert_to_eh_region_ranges::gate (function *) +pass_convert_to_eh_region_ranges::gate (function *fun) { - /* Nothing to do for SJLJ exceptions or if no regions created. */ - if (cfun->eh->region_tree == NULL) -return false; if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ) return false; - return true; + + /* Nothing to do for SJLJ exceptions or if no regions created. */ + return (fun == NULL + || fun->eh->region_tree != NULL); } } // anon namespace diff --git a/gcc/gcse.c b/gcc/gcse.c index e03b36c..80bdd5f 100644 --- a/gcc/gcse.c +++ b/gcc/gcse.c @@ -4252,10 +4252,12 @@ public: bool pass_rtl_pre::gate (function *fun) { - return optimize > 0 && flag_gcse -&& !fun->calls_setjmp -&& optimize_function_for_speed_p (fun) -&& dbg_cnt (pre); + return (optimize > 0 + && flag_gcse + && optimize_function_for_speed_p (fun) + && (fun == NULL + || (!fun->calls_setjmp + && dbg_cnt (pre; } } // anon namespace @@ -4295,15 +4297,17 @@ public: }; // class pass_rtl_hoist bool -pass_rtl_hoist::gate (function *) +pass_rtl_hoist::gate (function *fun) { - return optimize > 0 && flag_gcse -&& !cfun->calls_setjmp -/* It does not make sense to run code hoisting unless we are optimizing - for code size -- it rarely makes programs faster,
Re: [PATCH] Run DCE after if conversion
On Tue, Mar 17, 2015 at 7:29 PM, Jeff Law wrote: > On 03/17/2015 02:17 AM, Andreas Krebbel wrote: >> >> >> Just to have some numbers I did run a -j1 GCC bootstrap twice with and >> without the patch on x86_64. >> Best results for both are: >> >> clean: 21459s >> patched: 21314s >> >> There rather appears to be a trend towards reduced compile time perhaps >> due to the reduced number of >> INSNs to be processed in the RTL passes between the two ifcvt runs (loop >> optimization, combine, >> fwprop, dse,...)?! >> >> I also tried to measure the testsuite runs but the results show a big >> variance. So what I have right >> now does not qualify as a benchmark. > > And reality is it's getting harder and harder to benchmark this kind of > thing with turbo modes and such.A single run isn't sufficient unless > you've locked the box into a particular cpu frequency. For the particular patch I wonder if you really need to change all three if-conversion pass instances or if changing the one before combine (pass_rtl_ifcvt, thus rest_of_handle_if_conversion) is enough. That already runs an unconditonal (huh...) cleanup_cfg (0) at the end which could be changed so that DCE is performed (CLEANUP_EXPENSIVE, runs delete_trivially_dead_insns). At least that makes the patch smaller and its impact restricted to one of the three ifcvt passes. OTOH ifcvt performs a DCE at its start (to be not confused by dead instructions I guess), so why doesn't combine do that as well (oh, it does!?)? And maybe _that_ DCE can be removed as if_convert () already performs a DF_LR_RUN_DCE on the first pass. Richard. > jeff >> >> >
Re: [PATCH] PR target/65240, Fix Power{7,8} insn constraint issue with -O3 -ffast-math
On Wed, Mar 18, 2015 at 12:18 AM, Michael Meissner wrote: > On Thu, Mar 12, 2015 at 11:37:14AM -0400, David Edelsohn wrote: >> Please check on the performance implications of removing the special >> constant support. I know that it is late, but I think that ripping it >> out is less risky than trying to fix this, if the performance impact >> is not bad. > > Now, I haven't drilled down to exactly what is causing the performance > differences, but I've done some Spec 2006 runs comparing subversion id 221194, > with the two patches. > > The first patch is a rewrite of the code that I originally put into the > compiler to move floating point constants under -ffast-math during the first > split pass. A minor tweak would need to be done to the original patch so that > it works with -mcmodel=small or -m32 options. > > The second patch completely eliminates keeping the non-0 constant around in > RTL, and pushes it to memory during the initial RTL generation, since it is > felt that the RTL optimizations no longer need the constant in RTL to convert > division by constant into multiplication by the reciprocal. > > The benchmarcks that show a difference are. Note, I do not count benchmarks > that differ by less than 2% to be significant. Percentages more than 100% > mean > the benchmark ran faster: > > Benchmark Patch-1 Patch-2 > = === === > 401.bzip2 102.59% 103.51% > 462.libquantum 100.28% 97.52% > 483.xalancbmk97.72% 97.90% > 435.gromacs 104.48% 99.39% > 436.cactusADM 102.19% 102.90% > 470.lbm 100.39% 97.45% > Spec INT score 99.86% 99.86% > Spec FP score 100.50% 99.81% > > Patch #1 had 3 faster benchmarks and 1 slower benchmark. Patch #2 had 2 > faster > benchmarks, and 3 slower benchmarks. Did you double-check if there are any differences in generated code? Esp. the SPEC INT benchmarks look odd - they don't contain any FP code. Richard. > I tend to feel patch #2 is cleaner, though it is slightly slower. However, I > can go with patch #1 if desired. > > Patch #2 bootstrapped fine, and had no regressions in the test suite. Did > you want me to install patch #1, patch #2, or do you want more information? > > [gcc] > 2015-03-17 Michael Meissner > > PR target/65240 > * config/rs6000/predicates.md (easy_fp_constant): Remove special > -ffast-math handling that kept non-0 constants live in the RTL > until reload. Remove logic testing the number of instructions it > took to create a constant in a GPR that was never used, due to a > test for soft-float earlier. > (memory_fp_constant): Delete, no longer used. > > * config/rs6000/rs6000.md (mov_hardfloat): Remove > alternatives for loading non-0 constants into GPRs for hard > floating point that is no longer needed due to changes in > easy_fp_constant. Add support for loading 0.0 into GPRs. > (mov_hardfloat32): Likewise. > (mov_hardfloat64): Likewise. > (mov_64bit_dm): Likewise. > (movtd_64bit_nodm): Likewise. > (pre-reload move FP constant define_split): Delete define_split, > since it is no longer used. > (extenddftf2_internal): Remove GHF constraints that are not valid > for extenddftf2. > > [gcc/testsuite] > 2015-03-17 Michael Meissner > > PR target/65240 > * gcc/testsuite/g++.dg/pr65240.h: Add tests for PR 65240. > * gcc/testsuite/g++.dg/pr65240-1.C: Likewise. > * gcc/testsuite/g++.dg/pr65240-2.C: Likewise. > * gcc/testsuite/g++.dg/pr65240-3.C: Likewise. > * gcc/testsuite/g++.dg/pr65240-4.C: Likewise. > > -- > Michael Meissner, IBM > IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA > email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: Fix for PRs 36043, 58744 and 65408
On Wed, Mar 18, 2015 at 5:22 AM, Alan Modra wrote: > On Tue, Mar 17, 2015 at 01:28:41PM -0600, Jeff Law wrote: >> On 03/14/2015 07:02 AM, Alan Modra wrote: >> > PR target/65408 >> > PR target/58744 >> > PR middle-end/36043 >> > * calls.c (load_register_parameters): Don't load past end of >> > mem unless suitably aligned. >> I think this is probably a stage1 item. Richi, Jakub, Joseph, do any of you >> think we should try to push this into gcc-5? > > Some (severity) background to PR65408. The bug came from SAP HANA > (en.wikipedia.org/wiki/SAP_HANA), a crash that happens on powerpc64 > and powerpc64le. aarch64 would also be susceptible to the crash since > it also loads 16 bytes for the 12-byte struct. x86_64 only loads 12 > bytes (i386.c:construct_container generates a parallel with a DImode > and SImode load). However the underlying bug is there and hits x86_64 > too for the pr58744 and pr36043 testcases.. It's a very very very old bug though. I'd be interested in any odd code-generation difference for compiling, say, the linux kernel (you _can_ get quite ugly code generated because of this fix). I'm leaning towards waiting for stage1 and then consider a backport to 5.1. I'm sure the HAHA guys can work-around by forcing an extra temporary on the stack and passing that. Richard. > -- > Alan Modra > Australia Development Lab, IBM
Re: [PATCH, stage1] Make parloops gate more strict
On Wed, Mar 18, 2015 at 12:03 PM, Tom de Vries wrote: > On 18-03-15 11:16, Richard Biener wrote: >> >> On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries >> wrote: >>> >>> On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek wrote: > > > On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: >> >> >> Not really (I don't like -fdump-passes ...), but we need to make sure >> that -fdump-passes doesn't crash (because it runs very early and >> with cfun == NULL I think) > > > > If it runs with cfun == NULL, then supposedly the gates that are > dependent > on current function should for -fdump-passes purposes also return true > if cfun == NULL (well, of course do all the unconditional checks). > Though of course, with optimize/target attributes this is harder, as > different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. >>> >>> Atm fdump-passes doesn't run with cfun == NULL. >>> >>> From pass_manager::dump_passes: >>> ... >>>FOR_EACH_FUNCTION (n) >>> if (DECL_STRUCT_FUNCTION (n->decl)) >>>{ >>> node = n; >>> break; >>>} >>> >>>if (!node) >>> return; >>> >>>push_cfun (DECL_STRUCT_FUNCTION (node->decl)); >> >> >> Um - this now picks a random function which may be one with >> an optimize or target attribute associated to it. >> > > Indeed. > > Attached patch removes that code, and runs the gates with cfun == NULL for > -fdump-passes. It at least builds, and allows us to compile > src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. > > Should I bootstrap and reg-test, or do you see a problem with this approach? Yeah - it makes the -fdump-passes "hack" more pervasive throughout the compiler. I suppose it should instead build & push a "dummy" sturct function. Well, or simply don't care for it's brokeness. Richard. > Thanks, > - Tom > >
Re: [PATCH, ARM, PR64208] LRA ICE Fix
HI Kyrill, On 18 March 2015 at 11:24, Kyrill Tkachov wrote: > Hi Yvan, > > > On 18/03/15 10:19, Yvan Roux wrote: >> >> Hi, >> >> This is a fix for PR64208 where LRA loops when dealing with >> iwmmxt_arm_movdi insn. As explain in the PR, the issue was introduced >> on trunk and 4.9 branch by fix of PR rtl-optimization/60969 and then >> workaround by r211798 (-fuse-caller-save enable for ARM). >> >> The changes in IRA cost made by PR60969, changed the register class of >> this insn output from GENERAL_REGS to IWMMXT_REGS, and the >> redundancies in the insn pattern alternatives description force LRA to >> reload the pseudo, which generates the same iwmmxt_arm_movdi insn, >> which can't be resolved, and so on ... >> >> Removing the redundancies fixes the issue, as LRA find that >> alternative 8 (Uy => y) matches. >> >> This issue is present in 4.9 branch, but latent on trunk (the >> clobbering of IP and CC information added during -fuse-caller-save >> patch changed the register allocation). >> >> Cross compiled and regression tested on ARM targets (but not on an >> IWMMXT one), is it ok for trunk and 4.9 branch ? >> >> Rq: I think that adding IP and CC clobbers to >> CALL_INSN_FUNCTION_USAGE, as specified by AAPCS, in 4.9 branch is >> something we need too, I've a patch for that if you agree on that. >> >> Thanks, >> Yvan >> >> 2105-03-17 Yvan Roux >> >> PR target/64208 >> * config/arm/iwmmxt.md ("*iwmmxt_arm_movdi"): Cleanup redundant >> alternatives. > > > As a general note, I think this patch should include the testcase from the > PR. > I can see that it's fairly self-contained. Yes, I wanted to find one that exhibits the issue on trunk as the PR testcase doesn't, but don't find one so far. If it's fine to include a testcase that don't fail without that patch on trunk, I can include it. Cheers, Yvan
Re: [PATCH][ARM] New testcase to check parameter passing bug
Hi Honggyu, On 16/03/15 00:53, Honggyu Kim wrote: new file mode 100644 index 000..3790764 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr65358.c @@ -0,0 +1,33 @@ +/* { dg-do run */ Forgot to close the brace here after 'run' Dejagnu happily ignores that and transforms this into a compile-only testcase without that. Otherwise looks good to me. Kyrill +/* { dg-options "-O2" } */ + +struct pack +{ + int fine; + int victim; + int killer; +}; +
[PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
Hi, This patch fixes PR target/65444 by passing '-z bndplt' to linker when appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of days if no objections arise. Thanks, Ilya -- gcc/ 2015-03-18 Ilya Enkovich PR driver/65444 * config/i386/linux-common.h (MPX_SPEC): New. (CHKP_SPEC): Add MPX_SPEC. libmpx/ 2015-03-18 Ilya Enkovich PR driver/65444 * configure.ac: Add check for '-z bndplt' support by linker. Add link_mpx output variable. * libmpx.spec.in (link_mpx): New. * configure: Regenerate. diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h index 9c6560b..dd79ec6 100644 --- a/gcc/config/i386/linux-common.h +++ b/gcc/config/i386/linux-common.h @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see %:include(libmpx.spec)%(link_libmpx)" #endif +#ifndef MPX_SPEC +#define MPX_SPEC "\ + %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" +#endif + #ifndef LIBMPX_SPEC #if defined(HAVE_LD_STATIC_DYNAMIC) #define LIBMPX_SPEC "\ @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see #ifndef CHKP_SPEC #define CHKP_SPEC "\ -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC #endif diff --git a/libmpx/configure.ac b/libmpx/configure.ac index fe0d3f2..3f8b50f 100644 --- a/libmpx/configure.ac +++ b/libmpx/configure.ac @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) link_libmpx="-lpthread" +link_mpx="" +AC_MSG_CHECKING([whether ld accepts -z bndplt]) +echo "int main() {};" > conftest.c +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c 1>&AS_MESSAGE_LOG_FD]) +then +AC_MSG_RESULT([yes]) +link_mpx="$link_mpx -z bndplt" +else +AC_MSG_RESULT([no]) +fi AC_SUBST(link_libmpx) +AC_SUBST(link_mpx) AM_INIT_AUTOMAKE(foreign no-dist no-dependencies) AM_ENABLE_MULTILIB(, ..) diff --git a/libmpx/libmpx.spec.in b/libmpx/libmpx.spec.in index a265e28..34d0bdf 100644 --- a/libmpx/libmpx.spec.in +++ b/libmpx/libmpx.spec.in @@ -1,3 +1,5 @@ # This spec file is read by gcc when linking. It is used to specify the -# standard libraries we need in order to link with libcilkrts. +# standard libraries we need in order to link with libmpx. *link_libmpx: @link_libmpx@ + +*link_mpx: @link_mpx@
Re: [PATCH][expmed][cleanup] Use std::swap instead of manual swapping
On 16/03/15 10:10, Kyrill Tkachov wrote: Hi all, This patch replaces manual swapping in synth_mult with std::swap. Not much else to say about this. This code could arguably be refactored a bit but that's another story. I believe these are considered obvious at this point. I'll apply it in 24 hours unless somebody objects Committed with r221486 after re-bootstrapping on x86_64-linux-gnu. Kyrill Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. Thanks, Kyrill 2015-03-16 Kyrylo Tkachov * expmed.c (synth_mult): Use std::swap instead of manually swapping algorithms.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich wrote: > Hi, > > This patch fixes PR target/65444 by passing '-z bndplt' to linker when > appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will > commit it to trunk in a couple of days if no objections arise. > > Thanks, > Ilya > -- > gcc/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * config/i386/linux-common.h (MPX_SPEC): New. > (CHKP_SPEC): Add MPX_SPEC. > > libmpx/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * configure.ac: Add check for '-z bndplt' support > by linker. Add link_mpx output variable. > * libmpx.spec.in (link_mpx): New. > * configure: Regenerate. > > > diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h > index 9c6560b..dd79ec6 100644 > --- a/gcc/config/i386/linux-common.h > +++ b/gcc/config/i386/linux-common.h > @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see > %:include(libmpx.spec)%(link_libmpx)" > #endif > > +#ifndef MPX_SPEC > +#define MPX_SPEC "\ > + > %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" > +#endif > + > #ifndef LIBMPX_SPEC > #if defined(HAVE_LD_STATIC_DYNAMIC) > #define LIBMPX_SPEC "\ > @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see > > #ifndef CHKP_SPEC > #define CHKP_SPEC "\ > -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" > +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC > #endif > diff --git a/libmpx/configure.ac b/libmpx/configure.ac > index fe0d3f2..3f8b50f 100644 > --- a/libmpx/configure.ac > +++ b/libmpx/configure.ac > @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) > AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) > > link_libmpx="-lpthread" > +link_mpx="" > +AC_MSG_CHECKING([whether ld accepts -z bndplt]) > +echo "int main() {};" > conftest.c > +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c > 1>&AS_MESSAGE_LOG_FD]) > +then > +AC_MSG_RESULT([yes]) > +link_mpx="$link_mpx -z bndplt" > +else > +AC_MSG_RESULT([no]) > +fi > AC_SUBST(link_libmpx) > +AC_SUBST(link_mpx) > Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt to linker. If linker doesn't support it, ld will issue a warning, not error and users will know their linker is too old. When they update linker, they don't have to rebuild GCC. -- H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 15:02 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich wrote: >> Hi, >> >> This patch fixes PR target/65444 by passing '-z bndplt' to linker when >> appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will >> commit it to trunk in a couple of days if no objections arise. >> >> Thanks, >> Ilya >> -- >> gcc/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * config/i386/linux-common.h (MPX_SPEC): New. >> (CHKP_SPEC): Add MPX_SPEC. >> >> libmpx/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * configure.ac: Add check for '-z bndplt' support >> by linker. Add link_mpx output variable. >> * libmpx.spec.in (link_mpx): New. >> * configure: Regenerate. >> >> >> diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h >> index 9c6560b..dd79ec6 100644 >> --- a/gcc/config/i386/linux-common.h >> +++ b/gcc/config/i386/linux-common.h >> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >> %:include(libmpx.spec)%(link_libmpx)" >> #endif >> >> +#ifndef MPX_SPEC >> +#define MPX_SPEC "\ >> + >> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >> +#endif >> + >> #ifndef LIBMPX_SPEC >> #if defined(HAVE_LD_STATIC_DYNAMIC) >> #define LIBMPX_SPEC "\ >> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >> >> #ifndef CHKP_SPEC >> #define CHKP_SPEC "\ >> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC >> #endif >> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >> index fe0d3f2..3f8b50f 100644 >> --- a/libmpx/configure.ac >> +++ b/libmpx/configure.ac >> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) >> >> link_libmpx="-lpthread" >> +link_mpx="" >> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >> +echo "int main() {};" > conftest.c >> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c >> 1>&AS_MESSAGE_LOG_FD]) >> +then >> +AC_MSG_RESULT([yes]) >> +link_mpx="$link_mpx -z bndplt" >> +else >> +AC_MSG_RESULT([no]) >> +fi >> AC_SUBST(link_libmpx) >> +AC_SUBST(link_mpx) >> > > Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt > to linker. If linker doesn't support it, ld will issue a warning, not > error and users > will know their linker is too old. When they update linker, they don't have > to > rebuild GCC. If ld issues a warning instead of an error, then configure test passes and we pass '-z bndplt' to linker. Ilya >1 > > -- > H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich wrote: > 2015-03-18 15:02 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich >> wrote: >>> Hi, >>> >>> This patch fixes PR target/65444 by passing '-z bndplt' to linker when >>> appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will >>> commit it to trunk in a couple of days if no objections arise. >>> >>> Thanks, >>> Ilya >>> -- >>> gcc/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * config/i386/linux-common.h (MPX_SPEC): New. >>> (CHKP_SPEC): Add MPX_SPEC. >>> >>> libmpx/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * configure.ac: Add check for '-z bndplt' support >>> by linker. Add link_mpx output variable. >>> * libmpx.spec.in (link_mpx): New. >>> * configure: Regenerate. >>> >>> >>> diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h >>> index 9c6560b..dd79ec6 100644 >>> --- a/gcc/config/i386/linux-common.h >>> +++ b/gcc/config/i386/linux-common.h >>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >>> %:include(libmpx.spec)%(link_libmpx)" >>> #endif >>> >>> +#ifndef MPX_SPEC >>> +#define MPX_SPEC "\ >>> + >>> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >>> +#endif >>> + >>> #ifndef LIBMPX_SPEC >>> #if defined(HAVE_LD_STATIC_DYNAMIC) >>> #define LIBMPX_SPEC "\ >>> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >>> >>> #ifndef CHKP_SPEC >>> #define CHKP_SPEC "\ >>> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >>> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >>> MPX_SPEC >>> #endif >>> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >>> index fe0d3f2..3f8b50f 100644 >>> --- a/libmpx/configure.ac >>> +++ b/libmpx/configure.ac >>> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >>> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) >>> >>> link_libmpx="-lpthread" >>> +link_mpx="" >>> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >>> +echo "int main() {};" > conftest.c >>> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c >>> 1>&AS_MESSAGE_LOG_FD]) >>> +then >>> +AC_MSG_RESULT([yes]) >>> +link_mpx="$link_mpx -z bndplt" >>> +else >>> +AC_MSG_RESULT([no]) >>> +fi >>> AC_SUBST(link_libmpx) >>> +AC_SUBST(link_mpx) >>> >> >> Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt >> to linker. If linker doesn't support it, ld will issue a warning, not >> error and users >> will know their linker is too old. When they update linker, they don't have >> to >> rebuild GCC. > > If ld issues a warning instead of an error, then configure test passes > and we pass '-z bndplt' to linker. > Can you verify it with an older linker? The unknown XXX in -z XXX is always warned and ignored in Linux linker. If testing it on Linux always passes, it is useless. -- H.J.
Re: [PATCH] Document -m{arch,tune}=knl (PR target/65222)
On Wed, Mar 18, 2015 at 2:49 AM, Uros Bizjak wrote: > On Tue, Mar 10, 2015 at 8:19 PM, Jakub Jelinek wrote: > >> This patch documents knl. The PR is asking for also documenting >> slm, but I believe those were intentionally removed from the documentation >> as deprecated aliases, at least that is my reading of the >> 2013-12-23 H.J. Lu >> Tocar Ilya >> ... >> * doc/invoke.texi: Replace corei7, corei7-avx, core-avx-i, >> core-avx2, atom, slm with nehalem, sandybridge, ivybridge, >> haswell, bonnel, silvermont. Add westmere. >> change, so I think documenting knl is all we want right now. >> Or should we use knightslanding and make knl an undocumented alias? > > H.J., any preferences here? I think that knightslanding is just too > long (and two words), so I'd prefer knl. > >> 2015-03-10 Jakub Jelinek >> >> PR target/65222 >> * doc/invoke.texi: Add knl as x86 -march=/-mtune= CPU type. > > Let's wait for HJ, the patch is otherwise OK and safe for mainline. > It is fine with me. Thanks. -- H.J.
[PATCH] Fix PR ipa/65439
Hi. This is fix for PR65439 which extends number of expected equivalences for IPA ICF that can be different on darwin target. Ready for trunk? Thanks Martin >From 1fb6c73e6ee74a2923aaccbac9658b46005aeba5 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 18 Mar 2015 09:52:29 +0100 Subject: [PATCH] Fix PR65439. gcc/testsuite/ChangeLog: 2015-03-18 Martin Liska PR ipa/65439 * g++.dg/ipa/ipa-icf-4.C: Extend expected number of equivalences either to 6 or 7. --- gcc/testsuite/g++.dg/ipa/ipa-icf-4.C | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C index 2cd7a2e..e5d3123 100644 --- a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C +++ b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C @@ -44,5 +44,5 @@ int main() } /* { dg-final { scan-ipa-dump "\(Unified; Variable alias has been created\)|\(Symbol aliases are not supported by target\)" "icf" } } */ -/* { dg-final { scan-ipa-dump "Equal symbols: 6" "icf" } } */ +/* { dg-final { scan-ipa-dump "Equal symbols: \[67\]" "icf" } } */ /* { dg-final { cleanup-ipa-dump "icf" } } */ -- 2.1.2
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 15:08 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich wrote: >> 2015-03-18 15:02 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich >>> wrote: Hi, This patch fixes PR target/65444 by passing '-z bndplt' to linker when appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of days if no objections arise. Thanks, Ilya -- gcc/ 2015-03-18 Ilya Enkovich PR driver/65444 * config/i386/linux-common.h (MPX_SPEC): New. (CHKP_SPEC): Add MPX_SPEC. libmpx/ 2015-03-18 Ilya Enkovich PR driver/65444 * configure.ac: Add check for '-z bndplt' support by linker. Add link_mpx output variable. * libmpx.spec.in (link_mpx): New. * configure: Regenerate. diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h index 9c6560b..dd79ec6 100644 --- a/gcc/config/i386/linux-common.h +++ b/gcc/config/i386/linux-common.h @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see %:include(libmpx.spec)%(link_libmpx)" #endif +#ifndef MPX_SPEC +#define MPX_SPEC "\ + %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" +#endif + #ifndef LIBMPX_SPEC #if defined(HAVE_LD_STATIC_DYNAMIC) #define LIBMPX_SPEC "\ @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see #ifndef CHKP_SPEC #define CHKP_SPEC "\ -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC #endif diff --git a/libmpx/configure.ac b/libmpx/configure.ac index fe0d3f2..3f8b50f 100644 --- a/libmpx/configure.ac +++ b/libmpx/configure.ac @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) link_libmpx="-lpthread" +link_mpx="" +AC_MSG_CHECKING([whether ld accepts -z bndplt]) +echo "int main() {};" > conftest.c +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c 1>&AS_MESSAGE_LOG_FD]) +then +AC_MSG_RESULT([yes]) +link_mpx="$link_mpx -z bndplt" +else +AC_MSG_RESULT([no]) +fi AC_SUBST(link_libmpx) +AC_SUBST(link_mpx) >>> >>> Without -z bndplt, MPX won't work correctly. We should always pass -z >>> bndplt >>> to linker. If linker doesn't support it, ld will issue a warning, not >>> error and users >>> will know their linker is too old. When they update linker, they don't >>> have to >>> rebuild GCC. >> >> If ld issues a warning instead of an error, then configure test passes >> and we pass '-z bndplt' to linker. >> > > Can you verify it with an older linker? The unknown XXX in -z XXX is always > warned and ignored in Linux linker. If testing it on Linux always passes, > it is useless. Old ld issues a warning: ld: warning: -z bndplt ignored. But gold issues an error: ld.gold: bndplt: unknown -z option ld.gold: use the --help option for usage information Ilya > > -- > H.J.
[CHKP] Never expand instrumentation thunks
Hi, This patch disables attempts to expand instrumentation thunks which appear when we create specialized function versions. Problem was found during SPEC2006 insatrumentation with '-Ofast -flto'. Unfortunately I couldn't make a small reproducer. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-18 Ilya Enkovich * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..abc9cfe 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) tree thunk_fndecl = decl; tree a; + /* Instrumentation thunk is the same function with + a different signature. Never need to expand it. */ + if (thunk.add_pointer_bounds_args) +return false; if (!force_gimple_thunk && this_adjusting && targetm.asm_out.can_output_mi_thunk (thunk_fndecl, fixed_offset,
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich wrote: > 2015-03-18 15:08 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 15:02 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich wrote: > Hi, > > This patch fixes PR target/65444 by passing '-z bndplt' to linker when > appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will > commit it to trunk in a couple of days if no objections arise. > > Thanks, > Ilya > -- > gcc/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * config/i386/linux-common.h (MPX_SPEC): New. > (CHKP_SPEC): Add MPX_SPEC. > > libmpx/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * configure.ac: Add check for '-z bndplt' support > by linker. Add link_mpx output variable. > * libmpx.spec.in (link_mpx): New. > * configure: Regenerate. > > > diff --git a/gcc/config/i386/linux-common.h > b/gcc/config/i386/linux-common.h > index 9c6560b..dd79ec6 100644 > --- a/gcc/config/i386/linux-common.h > +++ b/gcc/config/i386/linux-common.h > @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see > %:include(libmpx.spec)%(link_libmpx)" > #endif > > +#ifndef MPX_SPEC > +#define MPX_SPEC "\ > + > %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" > +#endif > + > #ifndef LIBMPX_SPEC > #if defined(HAVE_LD_STATIC_DYNAMIC) > #define LIBMPX_SPEC "\ > @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see > > #ifndef CHKP_SPEC > #define CHKP_SPEC "\ > -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" > +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" > MPX_SPEC > #endif > diff --git a/libmpx/configure.ac b/libmpx/configure.ac > index fe0d3f2..3f8b50f 100644 > --- a/libmpx/configure.ac > +++ b/libmpx/configure.ac > @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) > AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) > > link_libmpx="-lpthread" > +link_mpx="" > +AC_MSG_CHECKING([whether ld accepts -z bndplt]) > +echo "int main() {};" > conftest.c > +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c > 1>&AS_MESSAGE_LOG_FD]) > +then > +AC_MSG_RESULT([yes]) > +link_mpx="$link_mpx -z bndplt" > +else > +AC_MSG_RESULT([no]) > +fi > AC_SUBST(link_libmpx) > +AC_SUBST(link_mpx) > Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt to linker. If linker doesn't support it, ld will issue a warning, not error and users will know their linker is too old. When they update linker, they don't have to rebuild GCC. >>> >>> If ld issues a warning instead of an error, then configure test passes >>> and we pass '-z bndplt' to linker. >>> >> >> Can you verify it with an older linker? The unknown XXX in -z XXX is always >> warned and ignored in Linux linker. If testing it on Linux always passes, >> it is useless. > > Old ld issues a warning: > > ld: warning: -z bndplt ignored. Does configure test pass? > But gold issues an error: > > ld.gold: bndplt: unknown -z option > ld.gold: use the --help option for usage information If gold is used, MPX won't work. What should we do here? Should we hardcode -fuse-ld=bfd for MPX? -- H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: > On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich wrote: >> 2015-03-18 15:08 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich >>> wrote: 2015-03-18 15:02 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich > wrote: >> Hi, >> >> This patch fixes PR target/65444 by passing '-z bndplt' to linker when >> appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will >> commit it to trunk in a couple of days if no objections arise. >> >> Thanks, >> Ilya >> -- >> gcc/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * config/i386/linux-common.h (MPX_SPEC): New. >> (CHKP_SPEC): Add MPX_SPEC. >> >> libmpx/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * configure.ac: Add check for '-z bndplt' support >> by linker. Add link_mpx output variable. >> * libmpx.spec.in (link_mpx): New. >> * configure: Regenerate. >> >> >> diff --git a/gcc/config/i386/linux-common.h >> b/gcc/config/i386/linux-common.h >> index 9c6560b..dd79ec6 100644 >> --- a/gcc/config/i386/linux-common.h >> +++ b/gcc/config/i386/linux-common.h >> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >> %:include(libmpx.spec)%(link_libmpx)" >> #endif >> >> +#ifndef MPX_SPEC >> +#define MPX_SPEC "\ >> + >> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >> +#endif >> + >> #ifndef LIBMPX_SPEC >> #if defined(HAVE_LD_STATIC_DYNAMIC) >> #define LIBMPX_SPEC "\ >> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >> >> #ifndef CHKP_SPEC >> #define CHKP_SPEC "\ >> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >> MPX_SPEC >> #endif >> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >> index fe0d3f2..3f8b50f 100644 >> --- a/libmpx/configure.ac >> +++ b/libmpx/configure.ac >> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) >> >> link_libmpx="-lpthread" >> +link_mpx="" >> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >> +echo "int main() {};" > conftest.c >> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c >> 1>&AS_MESSAGE_LOG_FD]) >> +then >> +AC_MSG_RESULT([yes]) >> +link_mpx="$link_mpx -z bndplt" >> +else >> +AC_MSG_RESULT([no]) >> +fi >> AC_SUBST(link_libmpx) >> +AC_SUBST(link_mpx) >> > > Without -z bndplt, MPX won't work correctly. We should always pass -z > bndplt > to linker. If linker doesn't support it, ld will issue a warning, not > error and users > will know their linker is too old. When they update linker, they don't > have to > rebuild GCC. If ld issues a warning instead of an error, then configure test passes and we pass '-z bndplt' to linker. >>> >>> Can you verify it with an older linker? The unknown XXX in -z XXX is always >>> warned and ignored in Linux linker. If testing it on Linux always passes, >>> it is useless. >> >> Old ld issues a warning: >> >> ld: warning: -z bndplt ignored. > > Does configure test pass? > >> But gold issues an error: >> >> ld.gold: bndplt: unknown -z option >> ld.gold: use the --help option for usage information > > If gold is used, MPX won't work. What should we do here? > Should we hardcode -fuse-ld=bfd for MPX? Is MPX disabled when the host linker is gold and gld isn't available? Richard. > -- > H.J.
[PATCH] Fix PR ipa/65439
Hi. Sorry for previous mail, which has wrongly assigned To address. Original message: This is fix for PR65439 which extends number of expected equivalences for IPA ICF that can be different on darwin target. Ready for trunk? Thanks Martin >From 1fb6c73e6ee74a2923aaccbac9658b46005aeba5 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 18 Mar 2015 09:52:29 +0100 Subject: [PATCH] Fix PR65439. gcc/testsuite/ChangeLog: 2015-03-18 Martin Liska PR ipa/65439 * g++.dg/ipa/ipa-icf-4.C: Extend expected number of equivalences either to 6 or 7. --- gcc/testsuite/g++.dg/ipa/ipa-icf-4.C | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C index 2cd7a2e..e5d3123 100644 --- a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C +++ b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C @@ -44,5 +44,5 @@ int main() } /* { dg-final { scan-ipa-dump "\(Unified; Variable alias has been created\)|\(Symbol aliases are not supported by target\)" "icf" } } */ -/* { dg-final { scan-ipa-dump "Equal symbols: 6" "icf" } } */ +/* { dg-final { scan-ipa-dump "Equal symbols: \[67\]" "icf" } } */ /* { dg-final { cleanup-ipa-dump "icf" } } */ -- 2.1.2
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 15:42 GMT+03:00 Richard Biener : > On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: >> On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 15:08 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich wrote: > 2015-03-18 15:02 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich >> wrote: >>> Hi, >>> >>> This patch fixes PR target/65444 by passing '-z bndplt' to linker when >>> appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. >>> Will commit it to trunk in a couple of days if no objections arise. >>> >>> Thanks, >>> Ilya >>> -- >>> gcc/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * config/i386/linux-common.h (MPX_SPEC): New. >>> (CHKP_SPEC): Add MPX_SPEC. >>> >>> libmpx/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * configure.ac: Add check for '-z bndplt' support >>> by linker. Add link_mpx output variable. >>> * libmpx.spec.in (link_mpx): New. >>> * configure: Regenerate. >>> >>> >>> diff --git a/gcc/config/i386/linux-common.h >>> b/gcc/config/i386/linux-common.h >>> index 9c6560b..dd79ec6 100644 >>> --- a/gcc/config/i386/linux-common.h >>> +++ b/gcc/config/i386/linux-common.h >>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >>> %:include(libmpx.spec)%(link_libmpx)" >>> #endif >>> >>> +#ifndef MPX_SPEC >>> +#define MPX_SPEC "\ >>> + >>> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >>> +#endif >>> + >>> #ifndef LIBMPX_SPEC >>> #if defined(HAVE_LD_STATIC_DYNAMIC) >>> #define LIBMPX_SPEC "\ >>> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >>> >>> #ifndef CHKP_SPEC >>> #define CHKP_SPEC "\ >>> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >>> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >>> MPX_SPEC >>> #endif >>> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >>> index fe0d3f2..3f8b50f 100644 >>> --- a/libmpx/configure.ac >>> +++ b/libmpx/configure.ac >>> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >>> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) >>> >>> link_libmpx="-lpthread" >>> +link_mpx="" >>> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >>> +echo "int main() {};" > conftest.c >>> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest >>> conftest.c 1>&AS_MESSAGE_LOG_FD]) >>> +then >>> +AC_MSG_RESULT([yes]) >>> +link_mpx="$link_mpx -z bndplt" >>> +else >>> +AC_MSG_RESULT([no]) >>> +fi >>> AC_SUBST(link_libmpx) >>> +AC_SUBST(link_mpx) >>> >> >> Without -z bndplt, MPX won't work correctly. We should always pass -z >> bndplt >> to linker. If linker doesn't support it, ld will issue a warning, not >> error and users >> will know their linker is too old. When they update linker, they don't >> have to >> rebuild GCC. > > If ld issues a warning instead of an error, then configure test passes > and we pass '-z bndplt' to linker. > Can you verify it with an older linker? The unknown XXX in -z XXX is always warned and ignored in Linux linker. If testing it on Linux always passes, it is useless. >>> >>> Old ld issues a warning: >>> >>> ld: warning: -z bndplt ignored. >> >> Does configure test pass? >> >>> But gold issues an error: >>> >>> ld.gold: bndplt: unknown -z option >>> ld.gold: use the --help option for usage information >> >> If gold is used, MPX won't work. What should we do here? >> Should we hardcode -fuse-ld=bfd for MPX? > > Is MPX disabled when the host linker is gold and gld isn't available? No. You may use MPX with gold and old ld but you would loose passed bounds when make a call via plt. Ilya > > Richard. > >> -- >> H.J.
[PATCH] Fix PR ipa/65432
Hello. Following patch wraps symtab_node::{asm_}name with xstrdup_for_dump. Ready for trunk? Thanks, Martin >From 06d7667b7e2be23e21b3ea6599ebb2303074b310 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 18 Mar 2015 13:59:49 +0100 Subject: [PATCH] Fix PR ipa/65432 gcc/ChangeLog: 2015-03-18 Martin Liska PR ipa/65432 * ipa-icf.c (sem_item_optimizer::read_section): Wrap symtab_node::name and symtab_node::asm_name with xstrdup_for_dump. * ipa-icf.h: Likewise. --- gcc/ipa-icf.c | 3 ++- gcc/ipa-icf.h | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c index 25b8306..476076d 100644 --- a/gcc/ipa-icf.c +++ b/gcc/ipa-icf.c @@ -1995,7 +1995,8 @@ sem_item_optimizer::read_section (lto_file_decl_data *file_data, gcc_assert (node->definition); if (dump_file) - fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", node->asm_name (), + fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", + xstrdup_for_dump (node->asm_name ()), (void *) node->decl, node->order); if (is_a (node)) diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h index c51bb4a..6d03758 100644 --- a/gcc/ipa-icf.h +++ b/gcc/ipa-icf.h @@ -174,13 +174,13 @@ public: /* Gets symbol name of the item. */ const char *name (void) { -return node->name (); +return xstrdup_for_dump (node->name ()); } /* Gets assembler name of the item. */ const char *asm_name (void) { -return node->asm_name (); +return xstrdup_for_dump (node->asm_name ()); } /* Fast equality function based on knowledge known in WPA. */ -- 2.1.2
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 6:24 AM, Ilya Enkovich wrote: > 2015-03-18 15:42 GMT+03:00 Richard Biener : >> On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: >>> On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich >>> wrote: 2015-03-18 15:08 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich > wrote: >> 2015-03-18 15:02 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich >>> wrote: Hi, This patch fixes PR target/65444 by passing '-z bndplt' to linker when appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of days if no objections arise. Thanks, Ilya -- gcc/ 2015-03-18 Ilya Enkovich PR driver/65444 * config/i386/linux-common.h (MPX_SPEC): New. (CHKP_SPEC): Add MPX_SPEC. libmpx/ 2015-03-18 Ilya Enkovich PR driver/65444 * configure.ac: Add check for '-z bndplt' support by linker. Add link_mpx output variable. * libmpx.spec.in (link_mpx): New. * configure: Regenerate. diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h index 9c6560b..dd79ec6 100644 --- a/gcc/config/i386/linux-common.h +++ b/gcc/config/i386/linux-common.h @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see %:include(libmpx.spec)%(link_libmpx)" #endif +#ifndef MPX_SPEC +#define MPX_SPEC "\ + %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" +#endif + #ifndef LIBMPX_SPEC #if defined(HAVE_LD_STATIC_DYNAMIC) #define LIBMPX_SPEC "\ @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see #ifndef CHKP_SPEC #define CHKP_SPEC "\ -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC #endif diff --git a/libmpx/configure.ac b/libmpx/configure.ac index fe0d3f2..3f8b50f 100644 --- a/libmpx/configure.ac +++ b/libmpx/configure.ac @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) link_libmpx="-lpthread" +link_mpx="" +AC_MSG_CHECKING([whether ld accepts -z bndplt]) +echo "int main() {};" > conftest.c +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c 1>&AS_MESSAGE_LOG_FD]) +then +AC_MSG_RESULT([yes]) +link_mpx="$link_mpx -z bndplt" +else +AC_MSG_RESULT([no]) +fi AC_SUBST(link_libmpx) +AC_SUBST(link_mpx) >>> >>> Without -z bndplt, MPX won't work correctly. We should always pass -z >>> bndplt >>> to linker. If linker doesn't support it, ld will issue a warning, not >>> error and users >>> will know their linker is too old. When they update linker, they don't >>> have to >>> rebuild GCC. >> >> If ld issues a warning instead of an error, then configure test passes >> and we pass '-z bndplt' to linker. >> > > Can you verify it with an older linker? The unknown XXX in -z XXX is > always > warned and ignored in Linux linker. If testing it on Linux always passes, > it is useless. Old ld issues a warning: ld: warning: -z bndplt ignored. >>> >>> Does configure test pass? >>> But gold issues an error: ld.gold: bndplt: unknown -z option ld.gold: use the --help option for usage information >>> >>> If gold is used, MPX won't work. What should we do here? >>> Should we hardcode -fuse-ld=bfd for MPX? >> >> Is MPX disabled when the host linker is gold and gld isn't available? > > No. You may use MPX with gold and old ld but you would loose passed > bounds when make a call via plt. > If gold is default linker, the configure test will fail and we never pass -z bndplt to linker even if ld.bfd is available and ld.gold is fixed later. I'd rather always pass -z bndplt to ld. -- H.J.
Re: [PATCH] Fix PR ipa/65432
On Wed, Mar 18, 2015 at 2:24 PM, Martin Liška wrote: > Hello. > > Following patch wraps symtab_node::{asm_}name with xstrdup_for_dump. > > Ready for trunk? /* Gets symbol name of the item. */ const char *name (void) { -return node->name (); +return xstrdup_for_dump (node->name ()); shouldn't the methods be called dump_name () then? And why's node->name () not already dup-ing the string? That said, I wonder where we use ->name / ->asm_name. And why that's different for ICF. The patch would be more obvious if all fixes were to dumping sites. Richard. > Thanks, > Martin
[PATCH] Shrink data-ref
This removes dead code/data from tree-data-ref.h. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2015-03-18 Richard Biener * tree-data-ref.h (struct access_matrix): Remove. (AM_LOOP_NEST, AM_NB_INDUCTION_VARS, AM_PARAMETERS, AM_MATRIX, AM_NB_PARAMETERS, AM_CONST_COLUMN_INDEX, AM_NB_COLUMNS, AM_GET_SUBSCRIPT_ACCESS_VECTOR, AM_GET_ACCESS_MATRIX_ELEMENT): Likewise. (am_vector_index_for_loop): Likewise. (struct data_reference): Remove access_matrix member. (DR_ACCESS_MATRIX): Remove. (lambda_vector_new): Add comment. (lambda_matrix_new): Use XOBNEWVEC. Index: gcc/tree-data-ref.h === --- gcc/tree-data-ref.h (revision 221411) +++ gcc/tree-data-ref.h (working copy) @@ -100,66 +100,7 @@ typedef int *lambda_vector; all vectors are the same length). */ typedef lambda_vector *lambda_matrix; -/* Each vector of the access matrix represents a linear access - function for a subscript. First elements correspond to the - leftmost indices, ie. for a[i][j] the first vector corresponds to - the subscript in "i". The elements of a vector are relative to - the loop nests in which the data reference is considered, - i.e. the vector is relative to the SCoP that provides the context - in which this data reference occurs. - - For example, in - - | loop_1 - |loop_2 - | a[i+3][2*j+n-1] - - if "i" varies in loop_1 and "j" varies in loop_2, the access - matrix with respect to the loop nest {loop_1, loop_2} is: - - | loop_1 loop_2 param_n cst - | 1 00 3 - | 0 21 -1 - - whereas the access matrix with respect to loop_2 considers "i" as - a parameter: - - | loop_2 param_i param_n cst - | 0 1 0 3 - | 2 0 1 -1 -*/ -struct access_matrix -{ - vec loop_nest; - int nb_induction_vars; - vec parameters; - vec *matrix; -}; - -#define AM_LOOP_NEST(M) (M)->loop_nest -#define AM_NB_INDUCTION_VARS(M) (M)->nb_induction_vars -#define AM_PARAMETERS(M) (M)->parameters -#define AM_MATRIX(M) (M)->matrix -#define AM_NB_PARAMETERS(M) (AM_PARAMETERS (M)).length () -#define AM_CONST_COLUMN_INDEX(M) (AM_NB_INDUCTION_VARS (M) + AM_NB_PARAMETERS (M)) -#define AM_NB_COLUMNS(M) (AM_NB_INDUCTION_VARS (M) + AM_NB_PARAMETERS (M) + 1) -#define AM_GET_SUBSCRIPT_ACCESS_VECTOR(M, I) AM_MATRIX (M)[I] -#define AM_GET_ACCESS_MATRIX_ELEMENT(M, I, J) AM_GET_SUBSCRIPT_ACCESS_VECTOR (M, I)[J] - -/* Return the column in the access matrix of LOOP_NUM. */ - -static inline int -am_vector_index_for_loop (struct access_matrix *access_matrix, int loop_num) -{ - int i; - loop_p l; - for (i = 0; AM_LOOP_NEST (access_matrix).iterate (i, &l); i++) -if (l->num == loop_num) - return i; - - gcc_unreachable (); -} struct data_reference { @@ -183,9 +124,6 @@ struct data_reference /* Alias information for the data reference. */ struct dr_alias alias; - - /* Matrix representation for the data access functions. */ - struct access_matrix *access_matrix; }; #define DR_STMT(DR)(DR)->stmt @@ -202,7 +140,6 @@ struct data_reference #define DR_STEP(DR)(DR)->innermost.step #define DR_PTR_INFO(DR)(DR)->alias.ptr_info #define DR_ALIGNED_TO(DR) (DR)->innermost.aligned_to -#define DR_ACCESS_MATRIX(DR) (DR)->access_matrix typedef struct data_reference *data_reference_p; @@ -560,6 +497,7 @@ lambda_vector_gcd (lambda_vector vector, static inline lambda_vector lambda_vector_new (int size) { + /* ??? We shouldn't abuse the GC allocator here. */ return ggc_cleared_vec_alloc (size); } @@ -611,11 +549,10 @@ lambda_matrix_new (int m, int n, struct lambda_matrix mat; int i; - mat = (lambda_matrix) obstack_alloc (lambda_obstack, - sizeof (lambda_vector *) * m); + mat = XOBNEWVEC (lambda_obstack, lambda_vector, m); for (i = 0; i < m; i++) -mat[i] = lambda_vector_new (n); +mat[i] = XOBNEWVEC (lambda_obstack, int, n); return mat; }
Re: [PATCH] Fix PR ipa/65432
On Wed, Mar 18, 2015 at 02:24:26PM +0100, Martin Liška wrote: > >From 06d7667b7e2be23e21b3ea6599ebb2303074b310 Mon Sep 17 00:00:00 2001 > From: mliska > Date: Wed, 18 Mar 2015 13:59:49 +0100 > Subject: [PATCH] Fix PR ipa/65432 > > gcc/ChangeLog: > > 2015-03-18 Martin Liska > > PR ipa/65432 > * ipa-icf.c (sem_item_optimizer::read_section): Wrap symtab_node::name > and > symtab_node::asm_name with xstrdup_for_dump. > * ipa-icf.h: Likewise. > --- > gcc/ipa-icf.c | 3 ++- > gcc/ipa-icf.h | 4 ++-- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c > index 25b8306..476076d 100644 > --- a/gcc/ipa-icf.c > +++ b/gcc/ipa-icf.c > @@ -1995,7 +1995,8 @@ sem_item_optimizer::read_section (lto_file_decl_data > *file_data, >gcc_assert (node->definition); > >if (dump_file) > - fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", > node->asm_name (), > + fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", > + xstrdup_for_dump (node->asm_name ()), >(void *) node->decl, node->order); This doesn't make much sense (unless say operator * has side effects or similar mess I hope will never show up in GCC). There is just one call among the arguments, so I fail to see what could clobber it. xstrdup_for_dump is for the case where you have two or more function calls in the *printf arguments that can produce transient strings. It seems cgraph_node::get_create has similar bug though (using xstrdup_for_dump when it shouldn't). > --- a/gcc/ipa-icf.h > +++ b/gcc/ipa-icf.h > @@ -174,13 +174,13 @@ public: >/* Gets symbol name of the item. */ >const char *name (void) >{ > -return node->name (); > +return xstrdup_for_dump (node->name ()); >} > >/* Gets assembler name of the item. */ >const char *asm_name (void) >{ > -return node->asm_name (); > +return xstrdup_for_dump (node->asm_name ()); >} > >/* Fast equality function based on knowledge known in WPA. */ But for this reason I must say I don't like this change either, if you use sem_item::name or asm_name just once (happens in various places), there is no need to preserve it any way. IMNSHO you should instead stick it in the calls that have more than one %s format specifiers (and only those that can have more than one transient strings). That is probably just the call in sem_function::equals (4x), sem_variable::equals (ditto), sem_item_optimizer::merge_classes (twice 2x) and that's it. Jakub
[PATCH] Fix align/misalign info in DR_PTR_INFO on vectorizer created SSA_NAMEs (PR tree-optimization/65450)
Hi! As mentioned in the PR, DR_PTR_INFO is for the base of the DR, thus the points-to into in it of course applies to any new SSA_NAME pointers derived from the DR, but the alignment info might not. vect_create_addr_base_for_vector_ref seems to be handling it right, but vect_create_data_ref_ptr didn't, this patch moves the code to handle it from vect_create_addr_base_for_vector_ref into a new helper function (except for the offset/byte_offset case) and uses the new helper in vect_create_data_ref_ptr. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-03-18 Jakub Jelinek PR tree-optimization/65450 * tree-vect-data-refs.c (vect_duplicate_ssa_name_ptr_info): New function. (vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr): Use it instead of duplicate_ssa_name_ptr_info. * gfortran.dg/pr65450.f90: New test. --- gcc/tree-vect-data-refs.c.jj2015-03-10 07:27:43.0 +0100 +++ gcc/tree-vect-data-refs.c 2015-03-18 09:54:11.060172717 +0100 @@ -3845,6 +3845,20 @@ vect_get_new_vect_var (tree type, enum v return new_vect_var; } +/* Duplicate ptr info and set alignment/misaligment on NAME from DR. */ + +static void +vect_duplicate_ssa_name_ptr_info (tree name, data_reference *dr, + stmt_vec_info stmt_info) +{ + duplicate_ssa_name_ptr_info (name, DR_PTR_INFO (dr)); + unsigned int align = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmt_info)); + int misalign = DR_MISALIGNMENT (dr); + if (misalign == -1) +mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name)); + else +set_ptr_info_alignment (SSA_NAME_PTR_INFO (name), align, misalign); +} /* Function vect_create_addr_base_for_vector_ref. @@ -3964,13 +3978,9 @@ vect_create_addr_base_for_vector_ref (gi if (DR_PTR_INFO (dr) && TREE_CODE (addr_base) == SSA_NAME) { - duplicate_ssa_name_ptr_info (addr_base, DR_PTR_INFO (dr)); - unsigned int align = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmt_info)); - int misalign = DR_MISALIGNMENT (dr); - if (offset || byte_offset || (misalign == -1)) + vect_duplicate_ssa_name_ptr_info (addr_base, dr, stmt_info); + if (offset || byte_offset) mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr_base)); - else - set_ptr_info_alignment (SSA_NAME_PTR_INFO (addr_base), align, misalign); } if (dump_enabled_p ()) @@ -4210,7 +4220,7 @@ vect_create_data_ref_ptr (gimple stmt, t aggr_ptr_init = make_ssa_name (aggr_ptr, vec_stmt); /* Copy the points-to information if it exists. */ if (DR_PTR_INFO (dr)) - duplicate_ssa_name_ptr_info (aggr_ptr_init, DR_PTR_INFO (dr)); + vect_duplicate_ssa_name_ptr_info (aggr_ptr_init, dr, stmt_info); gimple_assign_set_lhs (vec_stmt, aggr_ptr_init); if (pe) { @@ -4253,8 +4263,8 @@ vect_create_data_ref_ptr (gimple stmt, t /* Copy the points-to information if it exists. */ if (DR_PTR_INFO (dr)) { - duplicate_ssa_name_ptr_info (indx_before_incr, DR_PTR_INFO (dr)); - duplicate_ssa_name_ptr_info (indx_after_incr, DR_PTR_INFO (dr)); + vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info); + vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info); } if (ptr_incr) *ptr_incr = incr; @@ -4283,8 +4293,8 @@ vect_create_data_ref_ptr (gimple stmt, t /* Copy the points-to information if it exists. */ if (DR_PTR_INFO (dr)) { - duplicate_ssa_name_ptr_info (indx_before_incr, DR_PTR_INFO (dr)); - duplicate_ssa_name_ptr_info (indx_after_incr, DR_PTR_INFO (dr)); + vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info); + vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info); } if (ptr_incr) *ptr_incr = incr; --- gcc/testsuite/gfortran.dg/pr65450.f90.jj2015-03-18 11:41:04.478543474 +0100 +++ gcc/testsuite/gfortran.dg/pr65450.f90 2015-03-18 11:40:09.0 +0100 @@ -0,0 +1,35 @@ +! PR tree-optimization/65450 +! { dg-do run } +! { dg-additional-options "-mtune=amdfam10" { target x86_64-*-* i?86-*-* } } + +program pr65450 + integer :: n, m, o, i, k + double precision :: u(500,60,3), h(500,60,3) + double precision :: v(500,60) + u = 0 + h = 0 + o = 1 + m = 2 + n = 3 + do k = 1, 50 +v = foo (u(:,:,m)) +u(2:499,1:60,n) = u(2:499,1:60,o)+16.d0 +h(1:500,2:59,n) = h(1:500,2:59,o)-4.d0*v(1:500,2:59)-32.0d0 +i = o +o = m +m = n +n = i + end do + if (abs (v(17, 23) + h(17, 23, 2) + 768.0d0) > 0.5d0) call abort +contains + function foo(a) +double precision :: a(:,:) +double precision :: foo(size(a,dim=1),size(a,dim=2)) +integer :: i, j +i = size(a,dim=1) +j = size(a,dim=2) +foo(2:i-1,1:j) = a(3:i,1:j)-a(1:i-2,1:j) +foo(1,1:j) = 2*(a(2,1:j)-a(1,1:j)) +foo(i,1:j) = 2*(a(i,1:j)-a(i-1,1:j)) +
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 16:31 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 6:24 AM, Ilya Enkovich wrote: >> 2015-03-18 15:42 GMT+03:00 Richard Biener : >>> On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich wrote: > 2015-03-18 15:08 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 15:02 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich wrote: > Hi, > > This patch fixes PR target/65444 by passing '-z bndplt' to linker > when appropriate. Bootstrapped and tested on > x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of > days if no objections arise. > > Thanks, > Ilya > -- > gcc/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * config/i386/linux-common.h (MPX_SPEC): New. > (CHKP_SPEC): Add MPX_SPEC. > > libmpx/ > > 2015-03-18 Ilya Enkovich > > PR driver/65444 > * configure.ac: Add check for '-z bndplt' support > by linker. Add link_mpx output variable. > * libmpx.spec.in (link_mpx): New. > * configure: Regenerate. > > > diff --git a/gcc/config/i386/linux-common.h > b/gcc/config/i386/linux-common.h > index 9c6560b..dd79ec6 100644 > --- a/gcc/config/i386/linux-common.h > +++ b/gcc/config/i386/linux-common.h > @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see > %:include(libmpx.spec)%(link_libmpx)" > #endif > > +#ifndef MPX_SPEC > +#define MPX_SPEC "\ > + > %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" > +#endif > + > #ifndef LIBMPX_SPEC > #if defined(HAVE_LD_STATIC_DYNAMIC) > #define LIBMPX_SPEC "\ > @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see > > #ifndef CHKP_SPEC > #define CHKP_SPEC "\ > -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" > +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" > MPX_SPEC > #endif > diff --git a/libmpx/configure.ac b/libmpx/configure.ac > index fe0d3f2..3f8b50f 100644 > --- a/libmpx/configure.ac > +++ b/libmpx/configure.ac > @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) > AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = > "xyes"]) > > link_libmpx="-lpthread" > +link_mpx="" > +AC_MSG_CHECKING([whether ld accepts -z bndplt]) > +echo "int main() {};" > conftest.c > +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest > conftest.c 1>&AS_MESSAGE_LOG_FD]) > +then > +AC_MSG_RESULT([yes]) > +link_mpx="$link_mpx -z bndplt" > +else > +AC_MSG_RESULT([no]) > +fi > AC_SUBST(link_libmpx) > +AC_SUBST(link_mpx) > Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt to linker. If linker doesn't support it, ld will issue a warning, not error and users will know their linker is too old. When they update linker, they don't have to rebuild GCC. >>> >>> If ld issues a warning instead of an error, then configure test passes >>> and we pass '-z bndplt' to linker. >>> >> >> Can you verify it with an older linker? The unknown XXX in -z XXX is >> always >> warned and ignored in Linux linker. If testing it on Linux always >> passes, >> it is useless. > > Old ld issues a warning: > > ld: warning: -z bndplt ignored. Does configure test pass? > But gold issues an error: > > ld.gold: bndplt: unknown -z option > ld.gold: use the --help option for usage information If gold is used, MPX won't work. What should we do here? Should we hardcode -fuse-ld=bfd for MPX? >>> >>> Is MPX disabled when the host linker is gold and gld isn't available? >> >> No. You may use MPX with gold and old ld but you would loose passed >> bounds when make a call via plt. >> > > If gold is default linker, the configure test will fail and we never pass > -z bndplt to linker even if ld.bfd is available and ld.gold is fixed later. > I'd rather always pass -z bndplt to ld. If gold is used and it doesn't support '-z bndplt' then it doesn't mean user can't use MPX. Ilya > > -- > H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 6:41 AM, Ilya Enkovich wrote: > 2015-03-18 16:31 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 6:24 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 15:42 GMT+03:00 Richard Biener : On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: > On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich > wrote: >> 2015-03-18 15:08 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich >>> wrote: 2015-03-18 15:02 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich > wrote: >> Hi, >> >> This patch fixes PR target/65444 by passing '-z bndplt' to linker >> when appropriate. Bootstrapped and tested on >> x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of >> days if no objections arise. >> >> Thanks, >> Ilya >> -- >> gcc/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * config/i386/linux-common.h (MPX_SPEC): New. >> (CHKP_SPEC): Add MPX_SPEC. >> >> libmpx/ >> >> 2015-03-18 Ilya Enkovich >> >> PR driver/65444 >> * configure.ac: Add check for '-z bndplt' support >> by linker. Add link_mpx output variable. >> * libmpx.spec.in (link_mpx): New. >> * configure: Regenerate. >> >> >> diff --git a/gcc/config/i386/linux-common.h >> b/gcc/config/i386/linux-common.h >> index 9c6560b..dd79ec6 100644 >> --- a/gcc/config/i386/linux-common.h >> +++ b/gcc/config/i386/linux-common.h >> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see >> %:include(libmpx.spec)%(link_libmpx)" >> #endif >> >> +#ifndef MPX_SPEC >> +#define MPX_SPEC "\ >> + >> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >> +#endif >> + >> #ifndef LIBMPX_SPEC >> #if defined(HAVE_LD_STATIC_DYNAMIC) >> #define LIBMPX_SPEC "\ >> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >> >> #ifndef CHKP_SPEC >> #define CHKP_SPEC "\ >> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >> MPX_SPEC >> #endif >> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >> index fe0d3f2..3f8b50f 100644 >> --- a/libmpx/configure.ac >> +++ b/libmpx/configure.ac >> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = >> "xyes"]) >> >> link_libmpx="-lpthread" >> +link_mpx="" >> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >> +echo "int main() {};" > conftest.c >> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest >> conftest.c 1>&AS_MESSAGE_LOG_FD]) >> +then >> +AC_MSG_RESULT([yes]) >> +link_mpx="$link_mpx -z bndplt" >> +else >> +AC_MSG_RESULT([no]) >> +fi >> AC_SUBST(link_libmpx) >> +AC_SUBST(link_mpx) >> > > Without -z bndplt, MPX won't work correctly. We should always pass > -z bndplt > to linker. If linker doesn't support it, ld will issue a warning, not > error and users > will know their linker is too old. When they update linker, they > don't have to > rebuild GCC. If ld issues a warning instead of an error, then configure test passes and we pass '-z bndplt' to linker. >>> >>> Can you verify it with an older linker? The unknown XXX in -z XXX is >>> always >>> warned and ignored in Linux linker. If testing it on Linux always >>> passes, >>> it is useless. >> >> Old ld issues a warning: >> >> ld: warning: -z bndplt ignored. > > Does configure test pass? > >> But gold issues an error: >> >> ld.gold: bndplt: unknown -z option >> ld.gold: use the --help option for usage information > > If gold is used, MPX won't work. What should we do here? > Should we hardcode -fuse-ld=bfd for MPX? Is MPX disabled when the host linker is gold and gld isn't available? >>> >>> No. You may use MPX with gold and old ld but you would loose passed >>> bounds when make a call via plt. >>> >> >> If gold is default linker, the configure test will fail and we never pass >> -z bndplt to linker even if ld.bfd is available and ld.gold is fixed later. >> I'd rather always pass -z bndplt to ld. > > If gold is used and it doesn't support '-z bndplt' then it does
Re: [PATCH] Fix align/misalign info in DR_PTR_INFO on vectorizer created SSA_NAMEs (PR tree-optimization/65450)
On Wed, 18 Mar 2015, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, DR_PTR_INFO is for the base of the DR, thus the > points-to into in it of course applies to any new SSA_NAME pointers derived > from the DR, but the alignment info might not. > vect_create_addr_base_for_vector_ref seems to be handling it right, but > vect_create_data_ref_ptr didn't, this patch moves the code to handle it > from vect_create_addr_base_for_vector_ref into a new helper function > (except for the offset/byte_offset case) and uses the new helper in > vect_create_data_ref_ptr. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. > 2015-03-18 Jakub Jelinek > > PR tree-optimization/65450 > * tree-vect-data-refs.c (vect_duplicate_ssa_name_ptr_info): New > function. > (vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr): Use > it instead of duplicate_ssa_name_ptr_info. > > * gfortran.dg/pr65450.f90: New test. > > --- gcc/tree-vect-data-refs.c.jj 2015-03-10 07:27:43.0 +0100 > +++ gcc/tree-vect-data-refs.c 2015-03-18 09:54:11.060172717 +0100 > @@ -3845,6 +3845,20 @@ vect_get_new_vect_var (tree type, enum v >return new_vect_var; > } > > +/* Duplicate ptr info and set alignment/misaligment on NAME from DR. */ > + > +static void > +vect_duplicate_ssa_name_ptr_info (tree name, data_reference *dr, > + stmt_vec_info stmt_info) > +{ > + duplicate_ssa_name_ptr_info (name, DR_PTR_INFO (dr)); > + unsigned int align = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmt_info)); > + int misalign = DR_MISALIGNMENT (dr); > + if (misalign == -1) > +mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name)); > + else > +set_ptr_info_alignment (SSA_NAME_PTR_INFO (name), align, misalign); > +} > > /* Function vect_create_addr_base_for_vector_ref. > > @@ -3964,13 +3978,9 @@ vect_create_addr_base_for_vector_ref (gi >if (DR_PTR_INFO (dr) >&& TREE_CODE (addr_base) == SSA_NAME) > { > - duplicate_ssa_name_ptr_info (addr_base, DR_PTR_INFO (dr)); > - unsigned int align = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmt_info)); > - int misalign = DR_MISALIGNMENT (dr); > - if (offset || byte_offset || (misalign == -1)) > + vect_duplicate_ssa_name_ptr_info (addr_base, dr, stmt_info); > + if (offset || byte_offset) > mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr_base)); > - else > - set_ptr_info_alignment (SSA_NAME_PTR_INFO (addr_base), align, misalign); > } > >if (dump_enabled_p ()) > @@ -4210,7 +4220,7 @@ vect_create_data_ref_ptr (gimple stmt, t >aggr_ptr_init = make_ssa_name (aggr_ptr, vec_stmt); >/* Copy the points-to information if it exists. */ >if (DR_PTR_INFO (dr)) > - duplicate_ssa_name_ptr_info (aggr_ptr_init, DR_PTR_INFO (dr)); > + vect_duplicate_ssa_name_ptr_info (aggr_ptr_init, dr, stmt_info); >gimple_assign_set_lhs (vec_stmt, aggr_ptr_init); >if (pe) > { > @@ -4253,8 +4263,8 @@ vect_create_data_ref_ptr (gimple stmt, t >/* Copy the points-to information if it exists. */ >if (DR_PTR_INFO (dr)) > { > - duplicate_ssa_name_ptr_info (indx_before_incr, DR_PTR_INFO (dr)); > - duplicate_ssa_name_ptr_info (indx_after_incr, DR_PTR_INFO (dr)); > + vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info); > + vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info); > } >if (ptr_incr) > *ptr_incr = incr; > @@ -4283,8 +4293,8 @@ vect_create_data_ref_ptr (gimple stmt, t >/* Copy the points-to information if it exists. */ >if (DR_PTR_INFO (dr)) > { > - duplicate_ssa_name_ptr_info (indx_before_incr, DR_PTR_INFO (dr)); > - duplicate_ssa_name_ptr_info (indx_after_incr, DR_PTR_INFO (dr)); > + vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info); > + vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info); > } >if (ptr_incr) > *ptr_incr = incr; > --- gcc/testsuite/gfortran.dg/pr65450.f90.jj 2015-03-18 11:41:04.478543474 > +0100 > +++ gcc/testsuite/gfortran.dg/pr65450.f90 2015-03-18 11:40:09.0 > +0100 > @@ -0,0 +1,35 @@ > +! PR tree-optimization/65450 > +! { dg-do run } > +! { dg-additional-options "-mtune=amdfam10" { target x86_64-*-* i?86-*-* } } > + > +program pr65450 > + integer :: n, m, o, i, k > + double precision :: u(500,60,3), h(500,60,3) > + double precision :: v(500,60) > + u = 0 > + h = 0 > + o = 1 > + m = 2 > + n = 3 > + do k = 1, 50 > +v = foo (u(:,:,m)) > +u(2:499,1:60,n) = u(2:499,1:60,o)+16.d0 > +h(1:500,2:59,n) = h(1:500,2:59,o)-4.d0*v(1:500,2:59)-32.0d0 > +i = o > +o = m > +m = n > +n = i > + end do > + if (abs (v(17, 23) + h(17, 23, 2) + 768.0d0) > 0.5d0) call abort > +contains > + function foo(a) > +double precision :: a(:,:) > +
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 16:52 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 6:41 AM, Ilya Enkovich wrote: >> 2015-03-18 16:31 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 6:24 AM, Ilya Enkovich >>> wrote: 2015-03-18 15:42 GMT+03:00 Richard Biener : > On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: >> On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 15:08 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich wrote: > 2015-03-18 15:02 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich >> wrote: >>> Hi, >>> >>> This patch fixes PR target/65444 by passing '-z bndplt' to linker >>> when appropriate. Bootstrapped and tested on >>> x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of >>> days if no objections arise. >>> >>> Thanks, >>> Ilya >>> -- >>> gcc/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * config/i386/linux-common.h (MPX_SPEC): New. >>> (CHKP_SPEC): Add MPX_SPEC. >>> >>> libmpx/ >>> >>> 2015-03-18 Ilya Enkovich >>> >>> PR driver/65444 >>> * configure.ac: Add check for '-z bndplt' support >>> by linker. Add link_mpx output variable. >>> * libmpx.spec.in (link_mpx): New. >>> * configure: Regenerate. >>> >>> >>> diff --git a/gcc/config/i386/linux-common.h >>> b/gcc/config/i386/linux-common.h >>> index 9c6560b..dd79ec6 100644 >>> --- a/gcc/config/i386/linux-common.h >>> +++ b/gcc/config/i386/linux-common.h >>> @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not >>> see >>> %:include(libmpx.spec)%(link_libmpx)" >>> #endif >>> >>> +#ifndef MPX_SPEC >>> +#define MPX_SPEC "\ >>> + >>> %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" >>> +#endif >>> + >>> #ifndef LIBMPX_SPEC >>> #if defined(HAVE_LD_STATIC_DYNAMIC) >>> #define LIBMPX_SPEC "\ >>> @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see >>> >>> #ifndef CHKP_SPEC >>> #define CHKP_SPEC "\ >>> -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" >>> +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC >>> "}}" MPX_SPEC >>> #endif >>> diff --git a/libmpx/configure.ac b/libmpx/configure.ac >>> index fe0d3f2..3f8b50f 100644 >>> --- a/libmpx/configure.ac >>> +++ b/libmpx/configure.ac >>> @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) >>> AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = >>> "xyes"]) >>> >>> link_libmpx="-lpthread" >>> +link_mpx="" >>> +AC_MSG_CHECKING([whether ld accepts -z bndplt]) >>> +echo "int main() {};" > conftest.c >>> +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest >>> conftest.c 1>&AS_MESSAGE_LOG_FD]) >>> +then >>> +AC_MSG_RESULT([yes]) >>> +link_mpx="$link_mpx -z bndplt" >>> +else >>> +AC_MSG_RESULT([no]) >>> +fi >>> AC_SUBST(link_libmpx) >>> +AC_SUBST(link_mpx) >>> >> >> Without -z bndplt, MPX won't work correctly. We should always pass >> -z bndplt >> to linker. If linker doesn't support it, ld will issue a warning, >> not >> error and users >> will know their linker is too old. When they update linker, they >> don't have to >> rebuild GCC. > > If ld issues a warning instead of an error, then configure test passes > and we pass '-z bndplt' to linker. > Can you verify it with an older linker? The unknown XXX in -z XXX is always warned and ignored in Linux linker. If testing it on Linux always passes, it is useless. >>> >>> Old ld issues a warning: >>> >>> ld: warning: -z bndplt ignored. >> >> Does configure test pass? >> >>> But gold issues an error: >>> >>> ld.gold: bndplt: unknown -z option >>> ld.gold: use the --help option for usage information >> >> If gold is used, MPX won't work. What should we do here? >> Should we hardcode -fuse-ld=bfd for MPX? > > Is MPX disabled when the host linker is gold and gld isn't available? No. You may use MPX with gold and old ld but you would loose passed bounds when make a call via plt. >>> >>> If gold is default linker, the configure test will fail and we never
Re: [PATCH] Fix PR ipa/65432
TOn 03/18/2015 02:33 PM, Richard Biener wrote: On Wed, Mar 18, 2015 at 2:24 PM, Martin Liška wrote: Hello. Following patch wraps symtab_node::{asm_}name with xstrdup_for_dump. Ready for trunk? /* Gets symbol name of the item. */ const char *name (void) { -return node->name (); +return xstrdup_for_dump (node->name ()); shouldn't the methods be called dump_name () then? And why's node->name () not already dup-ing the string? That said, I wonder where we use ->name / ->asm_name. And why that's different for ICF. The patch would be more obvious if all fixes were to dumping sites. Richard. Hi. I decided to remove sem_item::{asm_}name as it really just calls symtab_node::{asm_}name. And the patch uses xstrdup_for_dump just dumping sites with more than one {asm_}name calls. Martin Thanks, Martin >From 5d2e883524ba23155d80cd4ad6d58ba5c73a1f90 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 18 Mar 2015 13:59:49 +0100 Subject: [PATCH] Fix PR ipa/65432 gcc/ChangeLog: 2015-03-18 Martin Liska * cgraph.c (cgraph_node::get_create): Remove unnecessary xstrdup_for_dump wrapper. * ipa-icf.c (sem_item::dump): Use symtab_node::name instead of sem_item::name. (sem_function::equals): Wrap symtab_node::name and symtab_node::asm_name with xstrdup_for_dump. (sem_variable::equals): Likewise. (sem_item_optimizer::read_section): Use symtab_node::name instead of sem_item::name. (sem_item_optimizer::parse_funcs_and_vars): Likewise. (sem_item_optimizer::merge_classes): Wrap symtab_node::name and symtab_node::asm_name with xstrdup_for_dump. (congruence_class::dump): Use symtab_node::name instead of sem_item::name. * ipa-icf.h (symtab_node::name): Remove. (symtab_node::asm_name): Likewise. --- gcc/cgraph.c | 5 ++--- gcc/ipa-icf.c | 35 +++ gcc/ipa-icf.h | 12 3 files changed, 25 insertions(+), 27 deletions(-) diff --git a/gcc/cgraph.c b/gcc/cgraph.c index ede58bf..e2958c4 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -553,12 +553,11 @@ cgraph_node::get_create (tree decl) if (dump_file) fprintf (dump_file, "Introduced new external node " "(%s/%i) and turned into root of the clone tree.\n", - xstrdup_for_dump (node->name ()), node->order); + node->name (), node->order); } else if (dump_file) fprintf (dump_file, "Introduced new external node " - "(%s/%i).\n", xstrdup_for_dump (node->name ()), - node->order); + "(%s/%i).\n", node->name (), node->order); return node; } diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c index 25b8306..f68d23c 100644 --- a/gcc/ipa-icf.c +++ b/gcc/ipa-icf.c @@ -239,12 +239,12 @@ sem_item::dump (void) if (dump_file) { fprintf (dump_file, "[%s] %s (%u) (tree:%p)\n", type == FUNC ? "func" : "var", - name(), node->order, (void *) node->decl); + node->name(), node->order, (void *) node->decl); fprintf (dump_file, " hash: %u\n", get_hash ()); fprintf (dump_file, " references: "); for (unsigned i = 0; i < refs.length (); i++) - fprintf (dump_file, "%s%s ", refs[i]->name (), + fprintf (dump_file, "%s%s ", refs[i]->node->name (), i < refs.length() - 1 ? "," : ""); fprintf (dump_file, "\n"); @@ -575,8 +575,13 @@ sem_function::equals (sem_item *item, if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Equals called for:%s:%s (%u:%u) (%s:%s) with result: %s\n\n", - name(), item->name (), node->order, item->node->order, asm_name (), - item->asm_name (), eq ? "true" : "false"); + xstrdup_for_dump (node->name()), + xstrdup_for_dump (item->node->name ()), + node->order, + item->node->order, + xstrdup_for_dump (node->asm_name ()), + xstrdup_for_dump (item->node->asm_name ()), + eq ? "true" : "false"); return eq; } @@ -1522,8 +1527,11 @@ sem_variable::equals (sem_item *item, if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Equals called for vars:%s:%s (%u:%u) (%s:%s) with result: %s\n\n", - name(), item->name (), node->order, item->node->order, asm_name (), - item->asm_name (), ret ? "true" : "false"); + xstrdup_for_dump (node->name()), + xstrdup_for_dump (item->node->name ()), + node->order, item->node->order, + xstrdup_for_dump (node->asm_name ()), + xstrdup_for_dump (item->node->asm_name ()), ret ? "true" : "false"); return ret; } @@ -1995,8 +2003,8 @@ sem_item_optimizer::read_section (lto_file_decl_data *file_data, gcc_assert (node->definition); if (dump_file) - fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", node->asm_name (), - (void *) node->decl, node->order); + fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", + node->asm_name (), (void *) node->decl, node->order); if (is_a (node)) { @@ -2259,7 +2267,7 @@ sem_item_optimizer::parse_funcs_and_vars (void) m_symtab_node_map.put (cnode, f);
Patch ping
Hi! I'd like to ping following patch: PR65238 - P1 - http://gcc.gnu.org/ml/gcc-patches/2015-03/msg00647.html - fix __has_{,cpp_}attribute() with -traditional-cpp Thanks. Jakub
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 04:59:05PM +0300, Ilya Enkovich wrote: > Which is a weird thing to do just to have a warning instead of an > error. You don't guarantee MPX PLT generation by always passing '-z > bndplt' but remove an opportunity to use gold at all. With current > check you may use any linker and manually provide additional options > if you want to. Yeah, I agree, the configure check is a reasonable thing to do. Jakub
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
Do we really want to quote to this level? This message has 11 levels of quotes, the most I have ever seen. If everyone does this, the whole thread is in every message and that seems unnecessary. I don't know if there are gcc guidelines on this??? On 3/18/2015 9:59 AM, Ilya Enkovich wrote: 2015-03-18 16:52 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 6:41 AM, Ilya Enkovich wrote: 2015-03-18 16:31 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 6:24 AM, Ilya Enkovich wrote: 2015-03-18 15:42 GMT+03:00 Richard Biener : On Wed, Mar 18, 2015 at 1:25 PM, H.J. Lu wrote: On Wed, Mar 18, 2015 at 5:13 AM, Ilya Enkovich wrote: 2015-03-18 15:08 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 5:05 AM, Ilya Enkovich wrote: 2015-03-18 15:02 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 4:56 AM, Ilya Enkovich wrote: Hi, This patch fixes PR target/65444 by passing '-z bndplt' to linker when appropriate. Bootstrapped and tested on x86_64-unknown-linux-gnu. Will commit it to trunk in a couple of days if no objections arise. Thanks, Ilya -- gcc/ 2015-03-18 Ilya Enkovich PR driver/65444 * config/i386/linux-common.h (MPX_SPEC): New. (CHKP_SPEC): Add MPX_SPEC. libmpx/ 2015-03-18 Ilya Enkovich PR driver/65444 * configure.ac: Add check for '-z bndplt' support by linker. Add link_mpx output variable. * libmpx.spec.in (link_mpx): New. * configure: Regenerate. diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h index 9c6560b..dd79ec6 100644 --- a/gcc/config/i386/linux-common.h +++ b/gcc/config/i386/linux-common.h @@ -59,6 +59,11 @@ along with GCC; see the file COPYING3. If not see %:include(libmpx.spec)%(link_libmpx)" #endif +#ifndef MPX_SPEC +#define MPX_SPEC "\ + %{mmpx:%{fcheck-pointer-bounds:%{!static:%:include(libmpx.spec)%(link_mpx)}}}" +#endif + #ifndef LIBMPX_SPEC #if defined(HAVE_LD_STATIC_DYNAMIC) #define LIBMPX_SPEC "\ @@ -89,5 +94,5 @@ along with GCC; see the file COPYING3. If not see #ifndef CHKP_SPEC #define CHKP_SPEC "\ -%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" +%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC #endif diff --git a/libmpx/configure.ac b/libmpx/configure.ac index fe0d3f2..3f8b50f 100644 --- a/libmpx/configure.ac +++ b/libmpx/configure.ac @@ -40,7 +40,18 @@ AC_MSG_RESULT($LIBMPX_SUPPORTED) AM_CONDITIONAL(LIBMPX_SUPPORTED, [test "x$LIBMPX_SUPPORTED" = "xyes"]) link_libmpx="-lpthread" +link_mpx="" +AC_MSG_CHECKING([whether ld accepts -z bndplt]) +echo "int main() {};" > conftest.c +if AC_TRY_COMMAND([${CC} ${CFLAGS} -Wl,-z,bndplt -o conftest conftest.c 1>&AS_MESSAGE_LOG_FD]) +then +AC_MSG_RESULT([yes]) +link_mpx="$link_mpx -z bndplt" +else +AC_MSG_RESULT([no]) +fi AC_SUBST(link_libmpx) +AC_SUBST(link_mpx) Without -z bndplt, MPX won't work correctly. We should always pass -z bndplt to linker. If linker doesn't support it, ld will issue a warning, not error and users will know their linker is too old. When they update linker, they don't have to rebuild GCC. If ld issues a warning instead of an error, then configure test passes and we pass '-z bndplt' to linker. Can you verify it with an older linker? The unknown XXX in -z XXX is always warned and ignored in Linux linker. If testing it on Linux always passes, it is useless. Old ld issues a warning: ld: warning: -z bndplt ignored. Does configure test pass? But gold issues an error: ld.gold: bndplt: unknown -z option ld.gold: use the --help option for usage information If gold is used, MPX won't work. What should we do here? Should we hardcode -fuse-ld=bfd for MPX? Is MPX disabled when the host linker is gold and gld isn't available? No. You may use MPX with gold and old ld but you would loose passed bounds when make a call via plt. If gold is default linker, the configure test will fail and we never pass -z bndplt to linker even if ld.bfd is available and ld.gold is fixed later. I'd rather always pass -z bndplt to ld. If gold is used and it doesn't support '-z bndplt' then it doesn't mean user can't use MPX. They can use -fuse-ld=bfd to select bfd linker if gold fails to generate proper MPX binary. Which is a weird thing to do just to have a warning instead of an error. You don't guarantee MPX PLT generation by always passing '-z bndplt' but remove an opportunity to use gold at all. With current check you may use any linker and manually provide additional options if you want to. Ilya -- H.J.
Re: [PATCH] Fix PR ipa/65432
On Wed, Mar 18, 2015 at 02:59:10PM +0100, Martin Liška wrote: > >From 5d2e883524ba23155d80cd4ad6d58ba5c73a1f90 Mon Sep 17 00:00:00 2001 > From: mliska > Date: Wed, 18 Mar 2015 13:59:49 +0100 > Subject: [PATCH] Fix PR ipa/65432 > > gcc/ChangeLog: > > 2015-03-18 Martin Liska > Missing PR ipa/65432 here... > * cgraph.c (cgraph_node::get_create): Remove unnecessary > xstrdup_for_dump wrapper. > * ipa-icf.c (sem_item::dump): Use symtab_node::name instead of > sem_item::name. > (sem_function::equals): Wrap symtab_node::name and symtab_node::asm_name > with xstrdup_for_dump. > (sem_variable::equals): Likewise. > (sem_item_optimizer::read_section): Use symtab_node::name instead of > sem_item::name. > (sem_item_optimizer::parse_funcs_and_vars): Likewise. > (sem_item_optimizer::merge_classes): Wrap symtab_node::name and > symtab_node::asm_name with xstrdup_for_dump. > (congruence_class::dump): Use symtab_node::name instead of > sem_item::name. > * ipa-icf.h (symtab_node::name): Remove. > (symtab_node::asm_name): Likewise. Otherwise LGTM. Jakub
Re: [PATCH] Fix PR ipa/65432
On 03/18/2015 03:05 PM, Jakub Jelinek wrote: On Wed, Mar 18, 2015 at 02:59:10PM +0100, Martin Liška wrote: >From 5d2e883524ba23155d80cd4ad6d58ba5c73a1f90 Mon Sep 17 00:00:00 2001 From: mliska Date: Wed, 18 Mar 2015 13:59:49 +0100 Subject: [PATCH] Fix PR ipa/65432 gcc/ChangeLog: 2015-03-18 Martin Liska Missing PR ipa/65432 here... * cgraph.c (cgraph_node::get_create): Remove unnecessary xstrdup_for_dump wrapper. * ipa-icf.c (sem_item::dump): Use symtab_node::name instead of sem_item::name. (sem_function::equals): Wrap symtab_node::name and symtab_node::asm_name with xstrdup_for_dump. (sem_variable::equals): Likewise. (sem_item_optimizer::read_section): Use symtab_node::name instead of sem_item::name. (sem_item_optimizer::parse_funcs_and_vars): Likewise. (sem_item_optimizer::merge_classes): Wrap symtab_node::name and symtab_node::asm_name with xstrdup_for_dump. (congruence_class::dump): Use symtab_node::name instead of sem_item::name. * ipa-icf.h (symtab_node::name): Remove. (symtab_node::asm_name): Likewise. Otherwise LGTM. Jakub Thank you. Adding missing comment and going to install the patch. Martin
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 7:02 AM, Jakub Jelinek wrote: > On Wed, Mar 18, 2015 at 04:59:05PM +0300, Ilya Enkovich wrote: >> Which is a weird thing to do just to have a warning instead of an >> error. You don't guarantee MPX PLT generation by always passing '-z >> bndplt' but remove an opportunity to use gold at all. With current >> check you may use any linker and manually provide additional options >> if you want to. > > Yeah, I agree, the configure check is a reasonable thing to do. > We should either always pass -z bndplt to linker or disable MPX. -- H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On 2015.03.18 at 10:03 -0400, Robert Dewar wrote: > Do we really want to quote to this level? This message has 11 levels of > quotes, the most I have ever seen. If everyone does this, the whole > thread is in every message and that seems unnecessary. I don't know if > there are gcc guidelines on this??? The only guideline I know of is that top-posts are to be avoided. You could use local tools to handle this situation. I use t-prot with mutt for example. It automatically shrinks the quote block, e.g.: [---=| Quote block shrunk by t-prot: 114 lines snipped |=---] ... last few lines of message ... -- Markus
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 7:31 AM, H.J. Lu wrote: > On Wed, Mar 18, 2015 at 7:02 AM, Jakub Jelinek wrote: >> On Wed, Mar 18, 2015 at 04:59:05PM +0300, Ilya Enkovich wrote: >>> Which is a weird thing to do just to have a warning instead of an >>> error. You don't guarantee MPX PLT generation by always passing '-z >>> bndplt' but remove an opportunity to use gold at all. With current >>> check you may use any linker and manually provide additional options >>> if you want to. >> >> Yeah, I agree, the configure check is a reasonable thing to do. >> > > We should either always pass -z bndplt to linker or disable > MPX. > MPX is a security feature. Knowing leaving a door open is a bad idea. -- H.J.
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On 16/03/15 18:26, Jakub Jelinek wrote: On Mon, Mar 16, 2015 at 07:15:59PM +0100, Richard Biener wrote: On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek wrote: On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow? Yeah, I've been thinking about teaching LIM not to do that if it is BLKmode. But then found how many other spots create COND_EXPRs and thought I just can't catch all of them. But guess I can change LIM incrementally too. I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. Hi Jakub, The testcase ICEs on arm-none-eabi at -Os (only): 0x7e6f11 process_insert_insn $TOP/gcc/gcse.c:2174 0x7e8a80 insert_insn_end_basic_block $TOP/gcc/gcse.c:2196 0x7eab01 hoist_code $TOP/gcc/gcse.c:3492 0x7eab01 one_code_hoisting_pass $TOP/gcc/gcse.c:3722 0x7eab01 execute_rtl_hoist $TOP/gcc/gcse.c:4212 0x7eab01 execute $TOP/gcc/gcse.c:4293 Thanks, Kyrill Jakub
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On Wed, Mar 18, 2015 at 03:10:44PM +, Kyrill Tkachov wrote: > >I've also been thinking about lowering it to control flow, but only if it > >couldn't be done in say two halves comparison as in the testcase. I suppose > >doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, > >but if it is more than that (say 4 or 8+) it might be already better to just > >expand it as a PHI. But, as we don't create basic blocks in > >tree-vect-generic.c right now, I thought it might be too much for stage4. > The testcase ICEs on arm-none-eabi at -Os (only): > 0x7e6f11 process_insert_insn > $TOP/gcc/gcse.c:2174 > 0x7e8a80 insert_insn_end_basic_block > $TOP/gcc/gcse.c:2196 > 0x7eab01 hoist_code > $TOP/gcc/gcse.c:3492 > 0x7eab01 one_code_hoisting_pass > $TOP/gcc/gcse.c:3722 > 0x7eab01 execute_rtl_hoist > $TOP/gcc/gcse.c:4212 > 0x7eab01 execute > $TOP/gcc/gcse.c:4293 Must be some unrelated latent issue, either in arm backend or in gcse. I think the lowering I've done could be very easily just present in user code, so shouldn't be that hard to adjust the testcase so that it ICEs already before the commit. Jakub
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On 18/03/15 15:14, Jakub Jelinek wrote: On Wed, Mar 18, 2015 at 03:10:44PM +, Kyrill Tkachov wrote: I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. The testcase ICEs on arm-none-eabi at -Os (only): 0x7e6f11 process_insert_insn $TOP/gcc/gcse.c:2174 0x7e8a80 insert_insn_end_basic_block $TOP/gcc/gcse.c:2196 0x7eab01 hoist_code $TOP/gcc/gcse.c:3492 0x7eab01 one_code_hoisting_pass $TOP/gcc/gcse.c:3722 0x7eab01 execute_rtl_hoist $TOP/gcc/gcse.c:4212 0x7eab01 execute $TOP/gcc/gcse.c:4293 Must be some unrelated latent issue, either in arm backend or in gcse. I think the lowering I've done could be very easily just present in user code, so shouldn't be that hard to adjust the testcase so that it ICEs already before the commit. Yep, I'm seeing the same ICE in the testcsase before the patch. Kyrill Jakub
Re: [debug-early] equate new DIE with DW_AT_specificationto a previous declaration
On 03/17/2015 07:12 PM, Jason Merrill wrote: On 03/17/2015 03:58 PM, Aldy Hernandez wrote: The problem is that, for -fno-implicit-templates, the decl is now DECL_EXTERNAL, which means we never equate this new "DIE with DW_AT_specification" to the DECL. That is, we never fall through here: else if (!DECL_EXTERNAL (decl)) { HOST_WIDE_INT cfa_fb_offset; struct function *fun = DECL_STRUCT_FUNCTION (decl); if (!old_die || !get_AT (old_die, DW_AT_inline)) equate_decl_number_to_die (decl, subr_die); However, when we call gen_subprogram_die() the third time through the outlining_inline_function hook (late debug), we again try to add a DW_AT_specification to the DIE cached from the first time around, but this time we ICE because we're not supposed to have multiple DW_AT_specification's pointing to the same DIE (the old original DIE). Why are we outlining a DECL_EXTERNAL function? SRA is analyzing Object::Method() and noticing that `this' is never used, so it's trying to rewrite the call to avoid passing `this' (by creating a clone). SRA has no restrictions on whether a function is DECL_EXTERNAL. For that matter, the SRA pass is called on all functions that have a gimple body, irregardless of DECL_EXTERNAL, courtesy of the pass manager: if (node->has_gimple_body_p ()) callback (DECL_STRUCT_FUNCTION (node->decl), data); ...and since Object::Method() has a gimple body even though it is marked DECL_EXTERNAL...we get the call into dwarf2out_abstract_decl. Incidentally, /* If we have no location information, this must be a partially generated DIE from early dwarf generation. Fall through and generate it. */ Why aren't we checking dumped_early here? Good point. I'll add an assert. Aldy
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On 18/03/15 15:20, Kyrill Tkachov wrote: On 18/03/15 15:14, Jakub Jelinek wrote: On Wed, Mar 18, 2015 at 03:10:44PM +, Kyrill Tkachov wrote: I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. The testcase ICEs on arm-none-eabi at -Os (only): 0x7e6f11 process_insert_insn $TOP/gcc/gcse.c:2174 0x7e8a80 insert_insn_end_basic_block $TOP/gcc/gcse.c:2196 0x7eab01 hoist_code $TOP/gcc/gcse.c:3492 0x7eab01 one_code_hoisting_pass $TOP/gcc/gcse.c:3722 0x7eab01 execute_rtl_hoist $TOP/gcc/gcse.c:4212 0x7eab01 execute $TOP/gcc/gcse.c:4293 Must be some unrelated latent issue, either in arm backend or in gcse. I think the lowering I've done could be very easily just present in user code, so shouldn't be that hard to adjust the testcase so that it ICEs already before the commit. Yep, I'm seeing the same ICE in the testcsase before the patch. With some help from James G, I've got a potential fix to gcse.c in testing. Kyrill Kyrill Jakub
Re: [PATCH][RFC] Fix PR63155
Following this patch (r221318), we're seeing what appears to be a miscompile of glibc on AArch64. This causes quite a bunch of tests to fail, segfaults etc., if LD_LIBRARY_PATH leads to a libc.so.6 built with that patch vs without (same glibc sources). We are still working on a reduced testcase, but this is proving difficult... --Alan Richard Biener wrote: On Mon, 9 Mar 2015, Richard Biener wrote: On Fri, 6 Mar 2015, Jeff Law wrote: On 03/06/15 06:16, Richard Biener wrote: This fixes PR63155 and reduces the memory usage at -O0 from reported 10GB (couldn't verify/update on my small box) to 350MB (still worse compared to 4.8 which needs only 50MB). It fixes this by no longer computing live info or building a conflict graph for coalescing of SSA names flowing over abnormal edges (which needs to succeed). Of course this also removes verification that this coalescing is valid (but computing this has quadratic cost). With this it turns ICEs into miscompiles. We could restrict verifying that we can perform abnormal coalescing to ENABLE_CHECKING (and I've wanted a verifier pass that can verify this multiple times to be able to catch what breaks it and not having to work back from out-of-SSA ICEing...). So any opinion on this patch welcome. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Ok for trunk? ;) Thanks, Richard. 2015-03-06 Richard Biener PR middle-end/63155 * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. (coalesce_partitions): Split out abnormal coalescing to ... (perform_abnormal_coalescing): ... this function. (coalesce_ssa_name): Perform abnormal coalescing without computing live/conflict. I'd personally like to keep the checking when ENABLE_CHECKING. I haven't followed this discussion real closely, but I wonder if some kind of blocking approach would work without blowing up the memory consumption. There's no inherent reason why we have to coalesce everything at the same time. We can use a blocking factor and do coalescing on some N number of SSA_NAMEs at a time. Yes, that's possible at (quite?) some compile-time cost. Note that we can't really guarantee that the resulting live/conflict problems shrink significantly enough without sorting the coalesces in a different way (not after important coalesces but after their basevars). I suspect we can select an N that ultimately degenerates into the current "do everything together" for the common case and only has to iterate over blocks of SSA_NAMEs in extreme cases. I've attached a patch to the PR that adds such a number N after which we simply stop coalescing. Of course that doesn't work for abnormals where we _must_ coalesce. Thus this patch ... As said, it's simple to keep the checking with ENABLE_CHECKING (I wonder if we should make some of the checking we do a runtime choice rather than a compile-time one...). I'll update the patch. Ok, like the following which adds a verify_ssa_coalescing () function (which could in theory be called from IL verification like verify_ssa) and calls it when ENABLE_CHECKING is defined. Bootstrap & regtest running on x86_64-unknown-linux-gnu. It didn't look appropriate for this stage to implement virtual operand verification. Ok this way? Thanks, Richard. 2015-03-06 Richard Biener PR middle-end/63155 * tree-ssa-coalesce.h (verify_ssa_coalescing): Declare. * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. (coalesce_partitions): Call verify_ssa_coalescing if ENABLE_CHECKING. Split out abnormal coalescing to ... (perform_abnormal_coalescing): ... this function. (coalesce_ssa_name): Perform abnormal coalescing without computing live/conflict. (verify_ssa_coalescing_worker): New function. (verify_ssa_coalescing): Likewise. Index: gcc/tree-ssa-coalesce.c === *** gcc/tree-ssa-coalesce.c (revision 221278) --- gcc/tree-ssa-coalesce.c (working copy) *** create_outofssa_var_map (coalesce_list_p *** 1121,1128 /* Attempt to coalesce ssa versions X and Y together using the partition !mapping in MAP and checking conflicts in GRAPH. Output any debug info to !DEBUG, if it is nun-NULL. */ static inline bool attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, --- 1121,1128 /* Attempt to coalesce ssa versions X and Y together using the partition !mapping in MAP and checking conflicts in GRAPH if not NULL. !Output any debug info to DEBUG, if it is nun-NULL. */ static inline bool attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, *** attempt_coalesce (var_map map, ssa_confl *** 1154,1160 fprintf (debug, " [map: %d, %d] ", p1, p2); ! if (!ssa_conflicts_test_p (graph, p1, p2)) { var1 = partition_to_var (map, p1); var2 = partition_to_var (map, p2); --- 11
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 17:42 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 7:31 AM, H.J. Lu wrote: >> On Wed, Mar 18, 2015 at 7:02 AM, Jakub Jelinek wrote: >>> >>> Yeah, I agree, the configure check is a reasonable thing to do. >>> >> >> We should either always pass -z bndplt to linker or disable >> MPX. >> > > MPX is a security feature. Knowing leaving a door open is a > bad idea. Instrumented binary used with legacy libraries is a supported usage model. Each user determines his own level of security. Ilya > > -- > H.J.
Re: [PATCH, ARM, PR64208] LRA ICE Fix
On 18 March 2015 at 12:42, Yvan Roux wrote: > HI Kyrill, > > On 18 March 2015 at 11:24, Kyrill Tkachov wrote: >> Hi Yvan, >> >> >> On 18/03/15 10:19, Yvan Roux wrote: >>> >>> Hi, >>> >>> This is a fix for PR64208 where LRA loops when dealing with >>> iwmmxt_arm_movdi insn. As explain in the PR, the issue was introduced >>> on trunk and 4.9 branch by fix of PR rtl-optimization/60969 and then >>> workaround by r211798 (-fuse-caller-save enable for ARM). >>> >>> The changes in IRA cost made by PR60969, changed the register class of >>> this insn output from GENERAL_REGS to IWMMXT_REGS, and the >>> redundancies in the insn pattern alternatives description force LRA to >>> reload the pseudo, which generates the same iwmmxt_arm_movdi insn, >>> which can't be resolved, and so on ... >>> >>> Removing the redundancies fixes the issue, as LRA find that >>> alternative 8 (Uy => y) matches. >>> >>> This issue is present in 4.9 branch, but latent on trunk (the >>> clobbering of IP and CC information added during -fuse-caller-save >>> patch changed the register allocation). >>> >>> Cross compiled and regression tested on ARM targets (but not on an >>> IWMMXT one), is it ok for trunk and 4.9 branch ? >>> >>> Rq: I think that adding IP and CC clobbers to >>> CALL_INSN_FUNCTION_USAGE, as specified by AAPCS, in 4.9 branch is >>> something we need too, I've a patch for that if you agree on that. >>> >>> Thanks, >>> Yvan >>> >>> 2105-03-17 Yvan Roux >>> >>> PR target/64208 >>> * config/arm/iwmmxt.md ("*iwmmxt_arm_movdi"): Cleanup redundant >>> alternatives. >> >> >> As a general note, I think this patch should include the testcase from the >> PR. >> I can see that it's fairly self-contained. > > Yes, I wanted to find one that exhibits the issue on trunk as the PR > testcase doesn't, but don't find one so far. If it's fine to include > a testcase that don't fail without that patch on trunk, I can include > it. Here is the patch with the testcase. Cheers, Yvan gcc/ 2105-03-18 Yvan Roux PR target/64208 * config/arm/iwmmxt.md ("*iwmmxt_arm_movdi"): Cleanup redundant alternatives. gcc/testsuite/ 2105-03-18 Yvan Roux PR target/64208 * gcc.target/arm/pr64208.c: New test. diff --git a/gcc/config/arm/iwmmxt.md b/gcc/config/arm/iwmmxt.md index fda3c2c..d1a60ff 100644 --- a/gcc/config/arm/iwmmxt.md +++ b/gcc/config/arm/iwmmxt.md @@ -107,8 +107,8 @@ ) (define_insn "*iwmmxt_arm_movdi" - [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, r, r, m,y,y,yr,y,yrUy,*w, r,*w,*w, *Uv") -(match_operand:DI 1 "di_operand" "rDa,Db,Dc,mi,r,y,yr,y,yrUy,y, r,*w,*w,*Uvi,*w"))] + [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, r, r, m,y,y,r, y,Uy,*w, r,*w,*w, *Uv") +(match_operand:DI 1 "di_operand" "rDa,Db,Dc,mi,r,y,r,y,Uy,y, r,*w,*w,*Uvi,*w"))] "TARGET_REALLY_IWMMXT && ( register_operand (operands[0], DImode) || register_operand (operands[1], DImode))" diff --git a/gcc/testsuite/gcc.target/arm/pr64208.c b/gcc/testsuite/gcc.target/arm/pr64208.c new file mode 100644 index 000..96fd56d --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr64208.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mcpu=*" } { "-mcpu=iwmmxt" } } */ +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mabi=*" } { "-mabi=iwmmxt" } } */ +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-march=*" } { "-march=iwmmxt" } } */ +/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-require-effective-target arm_iwmmxt_ok } */ +/* { dg-options "-O1 -mcpu=iwmmxt" } */ + +long long x6(void); +void x7(long long, long long); +void x8(long long); + +int x0; +long long *x1; + +void x2(void) { + long long *x3 = x1; + while (x1) { +long long x4 = x0, x5 = x6(); +x7(x4, x5); +x8(x5); +*x3 = 0; + } +}
Re: PR 13631 Problems in messages
On 03/12/14 14:54 +, Jonathan Wakely wrote: On 02/12/14 23:58 +0100, François Dumont wrote: DR libstdc++/13631 s/DR/PR/ * include/bits/codecvt.h (codecvt): friend class std::messages. (codecvt): friend class std::messages. * config/locale/gnu/messages_member.h (messages::do_open): Specialized. (messages::do_close): Likewise. (messages::do_open): Likewise. (messages::do_close): Likewise. * config/locale/gnu/messages_member.cc: (messages::do_open): Implement. Use bind_textdomain_codeset based on codecvt._M_c_locale_codecvt code set. Use internal cache to keep opened domain name with locale information. (messages::do_open): Likewise with codecvt. (messages::do_close): Implement. Clean cache information. (messages::do_close): Likewise. (get_glibc_msg): New. Use dgettext rather than gettext using cached domain name associated to catalog id. (messages::do_get): Use latter. (messages::do_get): Likewise and use also cached locale codecvt facet to convert wchar_t default value to char and the result back to wchar_t. * testsuite/22_locale/messages/13631.cc: New. * testsuite/22_locale/messages/members/char/2.cc: Use also fr_FR locale for charset conversion to get the expected accented character. Tested under Linux x86_64. Ok to commit ? Yes, thanks for fixing this longstanding bug! (The use of std::string might have to change when we have two different versions of std::string, as we probably only want one Catalogs object to be shared by both ABIs, but I will deal with that as part of the std::string patch) The new get_glibc_msg function doesn't compile with old versions of glibc, _M_name_messages is not defined (the code is also not exception safe, but that was not introduced by the recent changes). Tested x86_64-linux, committed to trunk. commit eb85799122e4aaecfc5a231a3f012b5dbbf88116 Author: Jonathan Wakely Date: Wed Mar 18 15:56:12 2015 + PR libstdc++/13631 * config/locale/gnu/messages_members.cc (get_glibc_msg): Fix fallback implementation for old glibc. Fix whitespace. diff --git a/libstdc++-v3/config/locale/gnu/messages_members.cc b/libstdc++-v3/config/locale/gnu/messages_members.cc index f115d5f..2e6122d 100644 --- a/libstdc++-v3/config/locale/gnu/messages_members.cc +++ b/libstdc++-v3/config/locale/gnu/messages_members.cc @@ -34,6 +34,8 @@ #include #include #include +#include // std::free +#include // ::strdup #include #include @@ -139,28 +141,28 @@ namespace } const char* - get_glibc_msg(__c_locale __attribute__((unused)) __locale_messages, + get_glibc_msg(__c_locale __locale_messages __attribute__((unused)), + const char* __name_messages __attribute__((unused)), const char* __domainname, const char* __dfault) { #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2) std::__c_locale __old = __uselocale(__locale_messages); -const char* __msg = - const_cast(dgettext(__domainname, __dfault)); - __uselocale(__old); -#else - char* __old = setlocale(LC_ALL, 0); - const size_t __len = strlen(__old) + 1; - char* __sav = new char[__len]; - memcpy(__sav, __old, __len); - setlocale(LC_ALL, _M_name_messages); const char* __msg = dgettext(__domainname, __dfault); - setlocale(LC_ALL, __sav); - delete [] __sav; -#endif - +__uselocale(__old); return __msg; -} +#else +if (char* __sav = strdup(setlocale(LC_ALL, 0))) + { + setlocale(LC_ALL, __name_messages); + const char* __msg = dgettext(__domainname, __dfault); + setlocale(LC_ALL, __sav); + free(__sav); + return __msg; + } +return __dfault; +#endif + } } namespace std _GLIBCXX_VISIBILITY(default) @@ -172,14 +174,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typename messages::catalog messages::do_open(const basic_string& __s, const locale& __l) const - { -typedef codecvt __codecvt_t; -const __codecvt_t& __codecvt = use_facet<__codecvt_t>(__l); +{ + typedef codecvt __codecvt_t; + const __codecvt_t& __codecvt = use_facet<__codecvt_t>(__l); -bind_textdomain_codeset(__s.c_str(), - __nl_langinfo_l(CODESET, __codecvt._M_c_locale_codecvt)); -return get_catalogs()._M_add(__s, __l); - } + bind_textdomain_codeset(__s.c_str(), + __nl_langinfo_l(CODESET, __codecvt._M_c_locale_codecvt)); + return get_catalogs()._M_add(__s, __l); +} template<> void @@ -199,7 +201,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION if (!__cat_info) return __dfault; - return get_glibc_msg(_M_c_locale_messages, + return get_glibc_msg(_M_c_locale_messages, _M_name_messages, __cat_info->_M_domain.c_str(), __dfault.c_str()); } @@ -209,15 +211,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typename messages::catalog messages::do_open(const basic_string& __s, const locale& __l) const - { -typedef codecvt __codecvt_t; -const __codecvt_t& __codecvt = use_facet<__codec
Re: PR 13631 Problems in messages
On 18/03/15 16:17 +, Jonathan Wakely wrote: The new get_glibc_msg function doesn't compile with old versions of glibc, _M_name_messages is not defined (the code is also not exception safe, but that was not introduced by the recent changes). We can probably drop that old code completely, given that we document that glibc 2.3 is required: https://gcc.gnu.org/onlinedocs/libstdc++/manual/setup.html#manual.intro.setup.prereq Let's look into that for GCC 6.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 9:14 AM, Ilya Enkovich wrote: > 2015-03-18 17:42 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 7:31 AM, H.J. Lu wrote: >>> On Wed, Mar 18, 2015 at 7:02 AM, Jakub Jelinek wrote: Yeah, I agree, the configure check is a reasonable thing to do. >>> >>> We should either always pass -z bndplt to linker or disable >>> MPX. >>> >> >> MPX is a security feature. Knowing leaving a door open is a >> bad idea. > > Instrumented binary used with legacy libraries is a supported usage > model. Each user determines his own level of security. > It doesn't mean we should leave a door open. Are we supposed to detect this with MPX: [hjl@skylakeclient bug-1]$ cat x.c #include int main () { char buf[10]; memset(buf, 'a', 11); return 0; } [hjl@skylakeclient bug-1]$ I believe we should, not maybe. We shouldn't silent fail it when linker doesn't support -z bndplt. -- H.J.
Re: [PATCH, stage1] Make parloops gate more strict
On 18-03-15 12:18, Richard Biener wrote: On Wed, Mar 18, 2015 at 12:03 PM, Tom de Vries wrote: On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n->decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node->decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Yeah - it makes the -fdump-passes "hack" more pervasive throughout the compiler. I suppose it should instead build & push a "dummy" sturct function. Like this? Well, or simply don't care for it's brokeness. I'm afraid leaving it broken just means we'll keep coming back to it. So I'd prefer either fixing or removing. Thanks, - Tom Fix fdump-passes --- gcc/function.c | 37 - gcc/function.h | 2 ++ gcc/passes.c| 17 ++--- gcc/tree-chkp.c | 6 -- 4 files changed, 40 insertions(+), 22 deletions(-) diff --git a/gcc/function.c b/gcc/function.c index 2c3d142..9ddbad8 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -4862,6 +4862,29 @@ prepare_function_start (void) frame_pointer_needed = 0; } +void +push_dummy_function (bool with_decl) +{ + tree fn_decl, fn_type, fn_result_decl; + + gcc_assert (!in_dummy_function); + in_dummy_function = true; + + if (with_decl) +{ + fn_type = build_function_type_list (void_type_node, NULL_TREE); + fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE, + fn_type); + fn_result_decl = build_decl (UNKNOWN_LOCATION, RESULT_DECL, + NULL_TREE, void_type_node); + DECL_RESULT (fn_decl) = fn_result_decl; +} + else +fn_decl = NULL_TREE; + + push_struct_function (fn_decl); +} + /* Initialize the rtl expansion mechanism so that we can do simple things like generate sequences. This is used to provide a context during global initialization of some passes. You must call expand_dummy_function_end @@ -4870,9 +4893,7 @@ prepare_function_start (void) void init_dummy_function_start (void) { - gcc_assert (!in_dummy_function); - in_dummy_function = true; - push_struct_function (NULL_TREE); + push_dummy_function (false); prepare_function_start (); } @@ -5144,6 +5165,13 @@ expand_function_start (tree subr) stack_check_probe_note = emit_note (NOTE_INSN_DELETED); } +void +pop_dummy_function (void) +{ + pop_cfun (); + in_dummy_function = false; +} + /* Undo the effects of init_dummy_function_start. */ void expand_dummy_function_end (void) @@ -5159,8 +5187,7 @@ expand_dummy_function_end (void) free_after_parsing (cfun); free_after_compilation (cfun); - pop_cfun (); - in_dummy_function = false; + pop_dummy_function (); } /* Helper for diddle_return_value. */ diff --git a/gcc/function.h b/gcc/function.h index b89c5ae..349f80c 100644 --- a/gcc/function.h +++ b/gcc/function.h @@ -899,6 +899,8 @@ extern int get_next_funcdef_no (void); extern int get_last_funcdef_no (void); extern void allocate_struct_function (tree, bool); extern void push_struct_function (tree fndecl); +extern void push_dummy_function (bool); +extern void pop_dummy_function (void); extern void init_dummy_function_start (void); extern void init_function_start (tree); extern void stack_protect_epilogue (void); diff --git a/gcc/passes.c b/gcc/passes.c index 23a90d9..bca8dbb 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -944,32 +944,19 @@ dump_passes (void) void pass_manager::dump_passes () const { - struct cgraph_node *n, *node = NULL; + push_dummy_function (true); create_pass_tab (); - FOR_EACH_FUNCTION (n) -if (DECL_STRUCT_FUNCTION (n->decl)) - { - node = n; - break; - } - - if (!node) -return; - - push_
[debug-early] Cache DW_TAG_label's if appropriate
I'm running the gdb testsuite with the debug-early branch compiler, and fixing regressions. gdb.base/label.c was causing gdb to ICE because we were generating multiple DW_TAG_label's for the same DECL. Consequently, I have fixed gen_label_die to handle multiple calls with the same DECL. By the way Jason, is there any reason we previously called equate_decl_number_to_die() only for abstract instances?: if (DECL_ABSTRACT_P (decl)) equate_decl_number_to_die (decl, lbl_die); else { insn = DECL_RTL_IF_SET (decl); I was unable to find the original reasoning, because the original patch setting this was from the 1990s, and it came en-masse when dwarf2out.c was contributed. I don't see anything else in dwarf2out.c doing a lookup_decl_die() on a LABEL_DECL, so I don't even know why the equate_decl_number_to_die() is even there. Anyways, in the immortal words of Shaggy... "wasn't me", so I'm leaving this bit untouched. Tested on x86-64 Linux with the gdb regression suite. Committed to branch. Aldy commit 8fea2b5c37026b3121afc0a98630542bb9a733dd Author: Aldy Hernandez Date: Wed Mar 18 09:55:38 2015 -0700 Cache DW_TAG_label's if appropriate. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 30c6cc6..92f4903 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -19725,14 +19725,20 @@ static void gen_label_die (tree decl, dw_die_ref context_die) { tree origin = decl_ultimate_origin (decl); - dw_die_ref lbl_die = new_die (DW_TAG_label, context_die, decl); + dw_die_ref lbl_die = lookup_decl_die (decl); rtx insn; char label[MAX_ARTIFICIAL_LABEL_BYTES]; - if (origin != NULL) -add_abstract_origin_attribute (lbl_die, origin); - else -add_name_and_src_coords_attributes (lbl_die, decl); + if (!lbl_die) +{ + lbl_die = new_die (DW_TAG_label, context_die, decl); + equate_decl_number_to_die (decl, lbl_die); + + if (origin != NULL) + add_abstract_origin_attribute (lbl_die, origin); + else + add_name_and_src_coords_attributes (lbl_die, decl); +} if (DECL_ABSTRACT_P (decl)) equate_decl_number_to_die (decl, lbl_die);
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 19:45 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 9:14 AM, Ilya Enkovich wrote: >> 2015-03-18 17:42 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 7:31 AM, H.J. Lu wrote: >>> >>> MPX is a security feature. Knowing leaving a door open is a >>> bad idea. >> >> Instrumented binary used with legacy libraries is a supported usage >> model. Each user determines his own level of security. >> > > It doesn't mean we should leave a door open. Are we supposed to > detect this with MPX: > > [hjl@skylakeclient bug-1]$ cat x.c > #include > > int > main () > { > char buf[10]; > memset(buf, 'a', 11); > return 0; > } > [hjl@skylakeclient bug-1]$ > > I believe we should, not maybe. We shouldn't silent fail it > when linker doesn't support -z bndplt. It depends on compiler flags and libraries used and is up to user to decide. User may be warned during libmpx configuration. Ilya > > -- > H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 10:13 AM, Ilya Enkovich wrote: > 2015-03-18 19:45 GMT+03:00 H.J. Lu : >> On Wed, Mar 18, 2015 at 9:14 AM, Ilya Enkovich >> wrote: >>> 2015-03-18 17:42 GMT+03:00 H.J. Lu : On Wed, Mar 18, 2015 at 7:31 AM, H.J. Lu wrote: MPX is a security feature. Knowing leaving a door open is a bad idea. >>> >>> Instrumented binary used with legacy libraries is a supported usage >>> model. Each user determines his own level of security. >>> >> >> It doesn't mean we should leave a door open. Are we supposed to >> detect this with MPX: >> >> [hjl@skylakeclient bug-1]$ cat x.c >> #include >> >> int >> main () >> { >> char buf[10]; >> memset(buf, 'a', 11); >> return 0; >> } >> [hjl@skylakeclient bug-1]$ >> >> I believe we should, not maybe. We shouldn't silent fail it >> when linker doesn't support -z bndplt. > > It depends on compiler flags and libraries used and is up to user to > decide. User may be warned during libmpx configuration. > What is "USER"? The one who build GCC may not be same person who uses GCC. -- H.J.
[PATCH][2/3][PR65458] Mark omp thread functions as parallelized
Hi, this patch fixes PR65458. The patch marks omp thread functions as parallelized, which means the parloops pass no longer attempts to modify that function. Bootstrapped and reg-tested on x86_64. OK for stage4 trunk? Thanks, - Tom Mark omp thread functions as parallelized 2015-03-18 Tom de Vries PR tree-optimization/65458 * omp-low.c: Add include of tree-parloops.h. (expand_omp_taskreg): Call mark_parallelized_function for child_fn. * tree-parloops.c (mark_parallelized_function): New function. Factor out of .. (create_loop_fn): ... here. * tree-parloops.h (mark_parallelized_function): Declare. --- gcc/omp-low.c | 2 ++ gcc/tree-parloops.c | 14 ++ gcc/tree-parloops.h | 1 + 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 48d73cb..c5c0ccf 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -108,6 +108,7 @@ along with GCC; see the file COPYING3. If not see #include "context.h" #include "lto-section-names.h" #include "gomp-constants.h" +#include "tree-parloops.h" /* Lowering of OMP parallel and workshare constructs proceeds in two @@ -5370,6 +5371,7 @@ expand_omp_taskreg (struct omp_region *region) entry_stmt = last_stmt (region->entry); child_fn = gimple_omp_taskreg_child_fn (entry_stmt); child_cfun = DECL_STRUCT_FUNCTION (child_fn); + mark_parallelized_function (child_fn); entry_bb = region->entry; exit_bb = region->exit; diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index a584460..7405258 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -1439,6 +1439,16 @@ parallelized_function_p (tree fn) return bitmap_bit_p (parallelized_functions, DECL_UID (fn)); } +void +mark_parallelized_function (tree fndecl) +{ + if (!parallelized_functions) +parallelized_functions = BITMAP_GGC_ALLOC (); + + bitmap_set_bit (parallelized_functions, DECL_UID (fndecl)); +} + + /* Creates and returns an empty function that will receive the body of a parallelized loop. */ @@ -1459,10 +1469,6 @@ create_loop_fn (location_t loc) type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); decl = build_decl (loc, FUNCTION_DECL, name, type); - if (!parallelized_functions) -parallelized_functions = BITMAP_GGC_ALLOC (); - bitmap_set_bit (parallelized_functions, DECL_UID (decl)); - TREE_STATIC (decl) = 1; TREE_USED (decl) = 1; DECL_ARTIFICIAL (decl) = 1; diff --git a/gcc/tree-parloops.h b/gcc/tree-parloops.h index d71f0a4..a742755 100644 --- a/gcc/tree-parloops.h +++ b/gcc/tree-parloops.h @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see #define GCC_TREE_PARLOOPS_H extern bool parallelized_function_p (tree); +extern void mark_parallelized_function (tree); #endif /* GCC_TREE_PARLOOPS_H */ -- 1.9.1
[PATCH][3/3][PR65460] Mark offloaded functions as parallelized
Hi, this patch fixes PR65460. The patch marks offloaded functions as parallelized, which means the parloops pass no longer attempts to modify that function. Bootstrapped and reg-tested on x86_64. OK for stage4 trunk? Thanks, - Tom Mark offloaded functions as parallelized 2015-03-18 Tom de Vries PR tree-optimization/65460 * omp-low.c (expand_omp_target): Call mark_parallelized_function for child_fn. --- gcc/omp-low.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index c5c0ccf..e7ceee2 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -8801,6 +8801,7 @@ expand_omp_target (struct omp_region *region) { child_fn = gimple_omp_target_child_fn (entry_stmt); child_cfun = DECL_STRUCT_FUNCTION (child_fn); + mark_parallelized_function (child_fn); } /* Supported by expand_omp_taskreg, but not here. */ -- 1.9.1
[PATCH][1/3] Make parallelize_loops static
Hi, this patch makes parallelize_loops static. I think the fact that it was extern was a left-over from the times that the parloops pass was not located in tree-parloops.c. bootstrapped and reg-tested on x86_64. OK for stage4 trunk? Thanks, - Tom Make parallelize_loops static 2015-03-18 Tom de Vries * tree-parloops.c (parallelize_loops): Make static. * tree-parloops.h (parallelize_loops): Remove extern declaration. --- gcc/tree-parloops.c | 2 +- gcc/tree-parloops.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index fbb9eeb..a584460 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2153,7 +2153,7 @@ try_create_reduction_list (loop_p loop, primitives. Returns true if some loop was parallelized, false otherwise. */ -bool +static bool parallelize_loops (void) { unsigned n_threads = flag_tree_parallelize_loops; diff --git a/gcc/tree-parloops.h b/gcc/tree-parloops.h index 3256446..d71f0a4 100644 --- a/gcc/tree-parloops.h +++ b/gcc/tree-parloops.h @@ -21,6 +21,5 @@ along with GCC; see the file COPYING3. If not see #define GCC_TREE_PARLOOPS_H extern bool parallelized_function_p (tree); -extern bool parallelize_loops (void); #endif /* GCC_TREE_PARLOOPS_H */ -- 1.9.1
Re: [PATCH][2/3][PR65458] Mark omp thread functions as parallelized
On Wed, Mar 18, 2015 at 06:21:51PM +0100, Tom de Vries wrote: > this patch fixes PR65458. > > The patch marks omp thread functions as parallelized, which means the > parloops pass no longer attempts to modify that function. > > Bootstrapped and reg-tested on x86_64. > > OK for stage4 trunk? This will certainly not work with LTO. IMHO you instead want some cgraph_node flag, and make sure you stream it out and in for LTO. Jakub
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 20:14 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 10:13 AM, Ilya Enkovich > wrote: >> 2015-03-18 19:45 GMT+03:00 H.J. Lu : >>> On Wed, Mar 18, 2015 at 9:14 AM, Ilya Enkovich >>> wrote: Instrumented binary used with legacy libraries is a supported usage model. Each user determines his own level of security. >>> >>> It doesn't mean we should leave a door open. Are we supposed to >>> detect this with MPX: >>> >>> [hjl@skylakeclient bug-1]$ cat x.c >>> #include >>> >>> int >>> main () >>> { >>> char buf[10]; >>> memset(buf, 'a', 11); >>> return 0; >>> } >>> [hjl@skylakeclient bug-1]$ >>> >>> I believe we should, not maybe. We shouldn't silent fail it >>> when linker doesn't support -z bndplt. >> >> It depends on compiler flags and libraries used and is up to user to >> decide. User may be warned during libmpx configuration. >> > > What is "USER"? The one who build GCC may not be same > person who uses GCC. > The person who build GCC determines its default behavior. User either uses it with default settings or overwrites it with own flags. You may warn the person who build GCC that his config has no '-z bndplt' by default. Ilya > > -- > H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Mar 18, 2015 at 10:34 AM, Ilya Enkovich It doesn't mean we should leave a door open. Are we supposed to detect this with MPX: [hjl@skylakeclient bug-1]$ cat x.c #include int main () { char buf[10]; memset(buf, 'a', 11); return 0; } [hjl@skylakeclient bug-1]$ I believe we should, not maybe. We shouldn't silent fail it when linker doesn't support -z bndplt. >>> >>> It depends on compiler flags and libraries used and is up to user to >>> decide. User may be warned during libmpx configuration. >>> >> >> What is "USER"? The one who build GCC may not be same >> person who uses GCC. >> > > The person who build GCC determines its default behavior. User either > uses it with default settings or overwrites it with own flags. You may > warn the person who build GCC that his config has no '-z bndplt' by > default. > Person who use GCC have no idea about it. That is why we should always pass -z bndplt to ld if MPX is enabled. Otherwise, Person who use GCC may falsely believe him/her are covered by MPX. -- H.J.
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
2015-03-18 20:39 GMT+03:00 H.J. Lu : > On Wed, Mar 18, 2015 at 10:34 AM, Ilya Enkovich >>> >>> What is "USER"? The one who build GCC may not be same >>> person who uses GCC. >>> >> >> The person who build GCC determines its default behavior. User either >> uses it with default settings or overwrites it with own flags. You may >> warn the person who build GCC that his config has no '-z bndplt' by >> default. >> > > Person who use GCC have no idea about it. That is why we > should always pass -z bndplt to ld if MPX is enabled. Otherwise, > Person who use GCC may falsely believe him/her are covered by > MPX. This person should be more careful because there are other ways to use MPX including model with no bndplt. Ilya > > -- > H.J.
Re: [PATCH][1/3] Make parallelize_loops static
On March 18, 2015 6:21:34 PM GMT+01:00, Tom de Vries wrote: >Hi, > >this patch makes parallelize_loops static. I think the fact that it was >extern >was a left-over from the times that the parloops pass was not located >in >tree-parloops.c. > >bootstrapped and reg-tested on x86_64. > >OK for stage4 trunk? OK. Thanks, Richard. >Thanks, >- Tom
[patch] c++/65046 Use abi-tag to fix building with -Werror=abi-tag
This adds the abi_tag attribute in a few places it's missing, so that -Werror=abi-tag builds will work after Jason's upcoming patches to make functions inherit the tag from their return type. For the __sso_string type I disable the warning, because the fact it uses std::__cxx11::string internally is not visible to users. There are still some missing tags in the Debug Mode string and list which I'll fix asap. Preparing this patch reminded me that we currently have two copies of the Catalog_info and Catalogs code in the unnamed namespace in config/locale/gnu/messages_members.cc, one using the old string and one using the new. We should really alter the code to not use std::string so that the catalogs can be shared by both versions of the messages facets. Tested x86_64-linux, as a standard build, and also with Jason's WIP patches and --enable-cxx-flags=-Werror=abi-tag. Committed to trunk. commit b6a949d6488a8ee72f182e7d859c830b45bf71a0 Author: Jonathan Wakely Date: Tue Mar 17 16:45:32 2015 + PR c++/65046 * config/locale/gnu/messages_members.cc (Catalog_info, Catalogs, get_catalogs): Add abi-tag. * include/ext/codecvt_specializations.h (encoding_state, encoding_char_traits): Likewise. * src/c++11/cxx11-ios_failure.cc (io_error_category): Likewise. * src/c++11/cxx11-shim_facets.cc (__any_string::operator basic_string, numpunct_shim, collate_shim, time_get_shim, moneypunct_shim, money_get_shim, money_put_shim, messages_shim): Likewise. * src/c++11/future.cc (future_error_category::message): Likewise. * src/c++11/system_error.cc (generic_error_category::message, system_error_category::message): Likewise. (__sso_string): Disable -Wabi-tag warnings. diff --git a/libstdc++-v3/config/locale/gnu/messages_members.cc b/libstdc++-v3/config/locale/gnu/messages_members.cc index 2e6122d..c90499e 100644 --- a/libstdc++-v3/config/locale/gnu/messages_members.cc +++ b/libstdc++-v3/config/locale/gnu/messages_members.cc @@ -46,8 +46,8 @@ namespace typedef messages_base::catalog catalog; - struct Catalog_info -{ + struct _GLIBCXX_DEFAULT_ABI_TAG Catalog_info + { Catalog_info(catalog __id, const string& __domain, locale __loc) : _M_id(__id), _M_domain(__domain), _M_locale(__loc) { } @@ -57,7 +57,7 @@ namespace locale _M_locale; }; - class Catalogs + class _GLIBCXX_DEFAULT_ABI_TAG Catalogs { public: Catalogs() : _M_catalog_counter(0) { } @@ -133,6 +133,7 @@ namespace std::vector _M_infos; }; + _GLIBCXX_DEFAULT_ABI_TAG Catalogs& get_catalogs() { diff --git a/libstdc++-v3/include/ext/codecvt_specializations.h b/libstdc++-v3/include/ext/codecvt_specializations.h index a24adfc..d9f6630 100644 --- a/libstdc++-v3/include/ext/codecvt_specializations.h +++ b/libstdc++-v3/include/ext/codecvt_specializations.h @@ -47,7 +47,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // This includes conversions and comparisons between various character // sets. This object encapsulates data that may need to be shared between // char_traits, codecvt and ctype. - class encoding_state + class _GLIBCXX_DEFAULT_ABI_TAG encoding_state { public: // Types: @@ -207,7 +207,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // associated fpos for the position type, all other // bits equivalent to the required char_traits instantiations. template -struct encoding_char_traits : public std::char_traits<_CharT> +struct _GLIBCXX_DEFAULT_ABI_TAG encoding_char_traits +: public std::char_traits<_CharT> { typedef encoding_statestate_type; typedef typename std::fpos pos_type; diff --git a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc index e1c8d4e..b0a7c46 100644 --- a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc +++ b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc @@ -41,6 +41,7 @@ namespace name() const noexcept { return "iostream"; } +_GLIBCXX_DEFAULT_ABI_TAG virtual std::string message(int __ec) const { std::string __msg; diff --git a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc index 82bdf6f..a32b9f0 100644 --- a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc +++ b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc @@ -147,6 +147,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // The returned object will match the caller's string ABI, even when the // stored string doesn't. template + _GLIBCXX_DEFAULT_ABI_TAG operator basic_string() const { if (!_M_dtor) @@ -226,7 +227,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION namespace // unnamed { template - struct numpunct_shim : std::numpunct<_CharT>, facet::__shim + struct _GLIBCXX_DEFAULT_ABI_TAG numpunct_shim + : std::numpunct<_CharT>, facet::__shim { typedef typename numpunct<_CharT>::__cache_type __cache_type; @@ -250,7 +252,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
Re: [PATCH][RFC] Fix PR63155
On March 18, 2015 4:59:30 PM GMT+01:00, Alan Lawrence wrote: >Following this patch (r221318), we're seeing what appears to be a >miscompile of >glibc on AArch64. This causes quite a bunch of tests to fail, segfaults >etc., if >LD_LIBRARY_PATH leads to a libc.so.6 built with that patch vs without >(same >glibc sources). We are still working on a reduced testcase, but this is >proving >difficult... The patch was supposed to end up in the very same code generation. Thus it's enough to identify the problematic source file and send me preprocessed source and cc1 invocation so I can investigate with a cross. Thanks, Richard. >--Alan > >Richard Biener wrote: >> On Mon, 9 Mar 2015, Richard Biener wrote: >> >>> On Fri, 6 Mar 2015, Jeff Law wrote: >>> On 03/06/15 06:16, Richard Biener wrote: > This fixes PR63155 and reduces the memory usage at -O0 from >reported > 10GB (couldn't verify/update on my small box) to 350MB (still >worse > compared to 4.8 which needs only 50MB). > > It fixes this by no longer computing live info or building a >conflict > graph for coalescing of SSA names flowing over abnormal edges > (which needs to succeed). > > Of course this also removes verification that this coalescing is >valid > (but computing this has quadratic cost). With this it turns > ICEs into miscompiles. > > We could restrict verifying that we can perform abnormal >coalescing > to ENABLE_CHECKING (and I've wanted a verifier pass that can >verify > this multiple times to be able to catch what breaks it and not >having > to work back from out-of-SSA ICEing...). > > So any opinion on this patch welcome. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Ok for trunk? ;) > > Thanks, > Richard. > > 2015-03-06 Richard Biener > > PR middle-end/63155 > * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being >NULL. > (coalesce_partitions): Split out abnormal coalescing to ... > (perform_abnormal_coalescing): ... this function. > (coalesce_ssa_name): Perform abnormal coalescing without >computing > live/conflict. I'd personally like to keep the checking when ENABLE_CHECKING. I haven't followed this discussion real closely, but I wonder if >some kind of blocking approach would work without blowing up the memory >consumption. There's no inherent reason why we have to coalesce everything at >the same time. We can use a blocking factor and do coalescing on some N >number of SSA_NAMEs at a time. >>> Yes, that's possible at (quite?) some compile-time cost. Note that >we >>> can't really guarantee that the resulting live/conflict problems >shrink >>> significantly enough without sorting the coalesces in a different >way >>> (not after important coalesces but after their basevars). >>> I suspect we can select an N that ultimately degenerates into the >current "do everything together" for the common case and only has to iterate >over blocks of SSA_NAMEs in extreme cases. >>> I've attached a patch to the PR that adds such a number N after >which we >>> simply stop coalescing. Of course that doesn't work for abnormals >where >>> we _must_ coalesce. Thus this patch ... >>> >>> As said, it's simple to keep the checking with ENABLE_CHECKING (I >wonder >>> if we should make some of the checking we do a runtime choice rather >>> than a compile-time one...). I'll update the patch. >> >> Ok, like the following which adds a verify_ssa_coalescing () function >> (which could in theory be called from IL verification like >verify_ssa) >> and calls it when ENABLE_CHECKING is defined. >> >> Bootstrap & regtest running on x86_64-unknown-linux-gnu. >> >> It didn't look appropriate for this stage to implement virtual >operand >> verification. >> >> Ok this way? >> >> Thanks, >> Richard. >> >> 2015-03-06 Richard Biener >> >> PR middle-end/63155 >> * tree-ssa-coalesce.h (verify_ssa_coalescing): Declare. >> * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being >NULL. >> (coalesce_partitions): Call verify_ssa_coalescing if >ENABLE_CHECKING. >> Split out abnormal coalescing to ... >> (perform_abnormal_coalescing): ... this function. >> (coalesce_ssa_name): Perform abnormal coalescing without >computing >> live/conflict. >> (verify_ssa_coalescing_worker): New function. >> (verify_ssa_coalescing): Likewise. >> >> Index: gcc/tree-ssa-coalesce.c >> === >> *** gcc/tree-ssa-coalesce.c (revision 221278) >> --- gcc/tree-ssa-coalesce.c (working copy) >> *** create_outofssa_var_map (coalesce_list_p >> *** 1121,1128 >> >> >> /* Attempt to coalesce ssa versions X and Y together using the >partition >> !mapping in MAP and checking conflicts in GR
Re: [PATCH] PR target/65248: Copy relocation against protected symbol doesn't work
On Mar 18, 2015, at 2:55 AM, Uros Bizjak wrote: > >> Here is the updated patch with testcases. Tested on Linux/x86. OK >> for trunk? > This patch needs global reviewer approval (I have added Jakub to CC) > and Darwin maintainer approval. So, my concern would be this, does the bug also impact darwin, and does the bug fix also fix darwin? If no and the change doesn’t change code-gen for darwin (which I think it does not), then the darwin bits are ok. I did a quick check of the test case on darwin, no protected variables (we ignore the protected request). When compiled, the program works (returns 0).
[PATCH, rtl-optimization]: Fix PR60851, ICE: in extract_constrain_insn_cached, at recog.c
Hello! Attached patch activates "safety net" recognition of pseudo regs for LRA-enabled targets. LRA is able to spill pseudos to memory to satisfy memory-only constraints, so constrain_operands should recognize this functionality. This is actually the same approach as how constants are handled a couple of lines above. the changed section. 2015-03-18 Uros Bizjak PR rtl-optimization/60851 * recog.c (constrain_operands): Accept a pseudo register before reload for LRA enabled targets. testsuite/ChangeLog: 2015-03-18 Uros Bizjak PR rtl-optimization/60851 * gcc.target/i386/pr60851.c: New test. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. OK for mainline and 4.9 branch? Uros. Index: recog.c === --- recog.c (revision 221493) +++ recog.c (working copy) @@ -2775,6 +2775,10 @@ constrain_operands (int strict, alternative_mask a /* Before reload, accept what reload can turn into mem. */ || (strict < 0 && CONSTANT_P (op)) + /* Before reload, accept a pseudo, + since LRA can turn it into mem. */ + || (targetm.lra_p () && strict < 0 && REG_P (op) + && REGNO (op) >= FIRST_PSEUDO_REGISTER) /* During reload, accept a pseudo */ || (reload_in_progress && REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER))) Index: testsuite/gcc.target/i386/pr60851.c === --- testsuite/gcc.target/i386/pr60851.c (revision 0) +++ testsuite/gcc.target/i386/pr60851.c (working copy) @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -flive-range-shrinkage -mtune=bdver4 -mdispatch-scheduler" } */ + +long double ld (char c) +{ + return c; +}
Re: [PATCH] PR target/65248: Copy relocation against protected symbol doesn't work
On Wed, Mar 18, 2015 at 11:58 AM, Mike Stump wrote: > On Mar 18, 2015, at 2:55 AM, Uros Bizjak wrote: >> >>> Here is the updated patch with testcases. Tested on Linux/x86. OK >>> for trunk? > >> This patch needs global reviewer approval (I have added Jakub to CC) >> and Darwin maintainer approval. > > So, my concern would be this, does the bug also impact darwin, and does the > bug fix also fix darwin? This bug doesn't impact darwin. > If no and the change doesn’t change code-gen for darwin (which I think it > does not), then the darwin bits are ok. > > I did a quick check of the test case on darwin, no protected variables (we > ignore the protected request). When compiled, the program works (returns 0). It is expected. This bug only affects targets which support protected visibility and use copy relocation. -- H.J.
Re: [PATCH] Fix PR ipa/65439
On 03/18/2015 02:06 PM, Martin Liška wrote: Hi. Sorry for previous mail, which has wrongly assigned To address. Original message: This is fix for PR65439 which extends number of expected equivalences for IPA ICF that can be different on darwin target. Ready for trunk? Thanks Martin I've just written to Honza and he approved the patch, having really poor internet connection. I'm going to install the patch. Martin
Re: [PATCH][RFC] Fix PR63155
On Wed, Mar 18, 2015 at 8:59 AM, Alan Lawrence wrote: > Following this patch (r221318), we're seeing what appears to be a miscompile > of glibc on AArch64. This causes quite a bunch of tests to fail, segfaults > etc., if LD_LIBRARY_PATH leads to a libc.so.6 built with that patch vs > without (same glibc sources). We are still working on a reduced testcase, > but this is proving difficult... I was just debugging the same issue. From what I can tell vfprintf is being miscompiled. An example code which shows the issue but not self-contained yet as I thought it was related to varargs but I suspect it is more related to the computed gotos inside vprintf in glibc. #include #include int g(int a, va_list ap) { int b; b = va_arg(ap, int); if (b != SECOND) __builtin_abort (); if (a != FIRST) __builtin_abort (); printf("first: %d.\n", a); printf("second: %d.\n", b); } int f(int a, ...) { int b; va_list ap; va_start(ap, a); g(a, ap); va_end (ap); return 0; } int main(void) { return f(2, FIRST, SECOND); } --- CUT --- With the new glibc I get: first: 4194304. second: 1. But with the old one I get : first: 16. second: 32. Thanks, Andrew Pinski > > --Alan > > > Richard Biener wrote: >> >> On Mon, 9 Mar 2015, Richard Biener wrote: >> >>> On Fri, 6 Mar 2015, Jeff Law wrote: >>> On 03/06/15 06:16, Richard Biener wrote: > > This fixes PR63155 and reduces the memory usage at -O0 from reported > 10GB (couldn't verify/update on my small box) to 350MB (still worse > compared to 4.8 which needs only 50MB). > > It fixes this by no longer computing live info or building a conflict > graph for coalescing of SSA names flowing over abnormal edges > (which needs to succeed). > > Of course this also removes verification that this coalescing is valid > (but computing this has quadratic cost). With this it turns > ICEs into miscompiles. > > We could restrict verifying that we can perform abnormal coalescing > to ENABLE_CHECKING (and I've wanted a verifier pass that can verify > this multiple times to be able to catch what breaks it and not having > to work back from out-of-SSA ICEing...). > > So any opinion on this patch welcome. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Ok for trunk? ;) > > Thanks, > Richard. > > 2015-03-06 Richard Biener > > PR middle-end/63155 > * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. > (coalesce_partitions): Split out abnormal coalescing to ... > (perform_abnormal_coalescing): ... this function. > (coalesce_ssa_name): Perform abnormal coalescing without computing > live/conflict. I'd personally like to keep the checking when ENABLE_CHECKING. I haven't followed this discussion real closely, but I wonder if some kind of blocking approach would work without blowing up the memory consumption. There's no inherent reason why we have to coalesce everything at the same time. We can use a blocking factor and do coalescing on some N number of SSA_NAMEs at a time. >>> >>> Yes, that's possible at (quite?) some compile-time cost. Note that we >>> can't really guarantee that the resulting live/conflict problems shrink >>> significantly enough without sorting the coalesces in a different way >>> (not after important coalesces but after their basevars). >>> I suspect we can select an N that ultimately degenerates into the current "do everything together" for the common case and only has to iterate over blocks of SSA_NAMEs in extreme cases. >>> >>> I've attached a patch to the PR that adds such a number N after which we >>> simply stop coalescing. Of course that doesn't work for abnormals where >>> we _must_ coalesce. Thus this patch ... >>> >>> As said, it's simple to keep the checking with ENABLE_CHECKING (I wonder >>> if we should make some of the checking we do a runtime choice rather >>> than a compile-time one...). I'll update the patch. >> >> >> Ok, like the following which adds a verify_ssa_coalescing () function >> (which could in theory be called from IL verification like verify_ssa) >> and calls it when ENABLE_CHECKING is defined. >> >> Bootstrap & regtest running on x86_64-unknown-linux-gnu. >> >> It didn't look appropriate for this stage to implement virtual operand >> verification. >> >> Ok this way? >> >> Thanks, >> Richard. >> >> 2015-03-06 Richard Biener >> >> PR middle-end/63155 >> * tree-ssa-coalesce.h (verify_ssa_coalescing): Declare. >> * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. >> (coalesce_partitions): Call verify_ssa_coalescing if >> ENABLE_CHECKING. >> Split out abnormal coalescing to ... >> (perform_abnormal_coalescing): ... this function. >> (coalesce_ssa_name): Perform abnormal coalescing
Re: [debug-early] Cache DW_TAG_label's if appropriate
On 03/18/2015 01:03 PM, Aldy Hernandez wrote: By the way Jason, is there any reason we previously called equate_decl_number_to_die() only for abstract instances?: So that multiple concrete instances can find the abstract instance. Why isn't DECL_ABSTRACT_P set the first time you get here for the label in question? I don't see anything else in dwarf2out.c doing a lookup_decl_die() on a LABEL_DECL, so I don't even know why the equate_decl_number_to_die() is even there. add_abstract_origin_attribute uses lookup_decl_die. Jason
[PATCH] Add testcase for PR64265
Hi! Back when I've added the EH support for tsan (aka TSAN_FUNC_EXIT), we didn't have the barrier_wait hack to make tsan threads reliable, so I didn't commit a testcase. This one fails with that patch reverted and succeeds with the current trunk, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-03-18 Jakub Jelinek PR sanitizer/64265 * g++.dg/tsan/pr64265.C: New test. --- gcc/testsuite/g++.dg/tsan/pr64265.C.jj 2015-03-18 17:47:28.871772124 +0100 +++ gcc/testsuite/g++.dg/tsan/pr64265.C 2015-03-18 18:00:39.342922371 +0100 @@ -0,0 +1,54 @@ +// PR sanitizer/64265 +// { dg-shouldfail "tsan" } +// { dg-additional-options "-fno-omit-frame-pointer -ldl" } + +#include +#include "tsan_barrier.h" + +static pthread_barrier_t barrier; +int v; + +__attribute__((noinline, noclone)) int +foo (int x) +{ + if (x < 99) +throw x; + barrier_wait (&barrier); + v++; + return x; +} + +__attribute__((noinline, noclone)) void +bar (void) +{ + for (int i = 0; i < 100; i++) +try + { + foo (i); + } +catch (int) + { + } +} + +__attribute__((noinline, noclone)) void * +tf (void *) +{ + bar (); + return NULL; +} + +int +main () +{ + pthread_t th; + barrier_init (&barrier, 2); + if (pthread_create (&th, NULL, tf, NULL)) +return 0; + v++; + barrier_wait (&barrier); + pthread_join (th, NULL); + return 0; +} + +// { dg-output "WARNING: ThreadSanitizer: data race.*#2 _?(tf|_Z2tfPv)" } Jakub
Fix PR 65177: diamonds are not valid execution threads for jump threading
Hi, the attached patch fixes PR 65177 in which the code generator of FSM jump thread would create a diamond on the copied path: see https://gcc.gnu.org/PR65177#c18 for a detailed description. The patch is renaming SEME into jump_thread as the notion of SEME is more general than the special case that we are interested in, in the case of jump-threading: an execution thread to be jump-threaded has the property that each node on the path has exactly one predecessor, disallowing any other control flow like diamonds or back-edge loops within the SEME region. The patch passes regstrap on x86-64-linux, and fixes the make check of hmmer. Ok to commit? Thanks, Sebastian >From 8c82e8b8c7d864c009bb7a116faf4acf64954704 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Tue, 17 Mar 2015 20:28:19 +0100 Subject: [PATCH] fix 65177 PR tree-optimization/65177 * cfghooks.c (bb_in_bbs): New. (copy_bbs): Add parameter make_jump_thread. * cfghooks.h (copy_bbs): Update definition. * cfgloopmanip.c (duplicate_loop_to_header_edge): Update use of copy_bbs. * trans-mem.c (ipa_uninstrument_transaction): Same. * tree-cfg.c (gimple_duplicate_sese_region): Same. (gimple_duplicate_sese_tail): Same. * tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Same. * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread. --- gcc/cfghooks.c | 39 +-- gcc/cfghooks.h |2 +- gcc/cfgloopmanip.c |2 +- gcc/trans-mem.c |2 +- gcc/tree-cfg.c |4 ++-- gcc/tree-ssa-threadupdate.c | 31 --- gcc/tree-vect-loop-manip.c |2 +- 7 files changed, 51 insertions(+), 31 deletions(-) diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c index abeab8c..1a9e2f9 100644 --- a/gcc/cfghooks.c +++ b/gcc/cfghooks.c @@ -1300,6 +1300,18 @@ end: return ret; } +/* Return true when BB is one of the first N items in BBS. */ + +static inline bool +bb_in_bbs (basic_block bb, basic_block *bbs, int n) +{ + for (int i = 0; i < n; i++) +if (bb == bbs[i]) + return true; + + return false; +} + /* Duplicates N basic blocks stored in array BBS. Newly created basic blocks are placed into array NEW_BBS in the same order. Edges from basic blocks in BBS are also duplicated and copies of those that lead into BBS are @@ -1321,12 +1333,16 @@ end: also in the same order. Newly created basic blocks are put after the basic block AFTER in the - instruction stream, and the order of the blocks in BBS array is preserved. */ + instruction stream, and the order of the blocks in BBS array is preserved. + + When MAKE_JUMP_THREAD is true, only redirect edges to consecutive elements of + BBS. */ void copy_bbs (basic_block *bbs, unsigned n, basic_block *new_bbs, edge *edges, unsigned num_edges, edge *new_edges, - struct loop *base, basic_block after, bool update_dominance) + struct loop *base, basic_block after, bool update_dominance, + bool make_jump_thread) { unsigned i, j; basic_block bb, new_bb, dom_bb; @@ -1385,6 +1401,25 @@ copy_bbs (basic_block *bbs, unsigned n, basic_block *new_bbs, if (!(e->dest->flags & BB_DUPLICATED)) continue; + + if (make_jump_thread) + { + /* When creating a jump-thread, we only redirect edges to + consecutive basic blocks. */ + if (i + 1 < n) + { + if (e->dest != bbs[i + 1]) + continue; + } + else + { + /* Do not jump back into the jump-thread path from the last + BB. */ + if (bb_in_bbs (e->dest, bbs, n - 1)) + continue; + } + } + redirect_edge_and_branch_force (e, get_bb_copy (e->dest)); } } diff --git a/gcc/cfghooks.h b/gcc/cfghooks.h index 4a1340e..9a17180 100644 --- a/gcc/cfghooks.h +++ b/gcc/cfghooks.h @@ -238,7 +238,7 @@ extern void lv_add_condition_to_bb (basic_block, basic_block, basic_block, extern bool can_copy_bbs_p (basic_block *, unsigned); extern void copy_bbs (basic_block *, unsigned, basic_block *, edge *, unsigned, edge *, struct loop *, - basic_block, bool); + basic_block, bool, bool); void account_profile_record (struct profile_record *, int); diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c index 45cc85d..e987b6b 100644 --- a/gcc/cfgloopmanip.c +++ b/gcc/cfgloopmanip.c @@ -1337,7 +1337,7 @@ duplicate_loop_to_header_edge (struct loop *loop, edge e, /* Copy bbs. */ copy_bbs (bbs, n, new_bbs, spec_edges, 2, new_spec_edges, loop, - place_after, true); + place_after, true, false); place_after = new_spec_edges[SE_LATCH]->src; if (flags & DLTHE_RECORD_COPY_NUMBER) diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c index 078c2da..eb88735 100644 --- a/gcc/trans-mem.c +++ b/gcc/trans-mem.c @@ -4162,7 +4162,7 @@ ipa_uninstrument_transaction (struct tm_region *region, basic_block *new_bbs = XNEWVEC (basic_block, n); copy_bbs (queue.address (), n, new_bbs, NULL, 0, NUL
[PATCH] Teach ipa-split about TSAN_FUNC_EXIT (PR sanitizer/65400)
Hi! TSAN_FUNC_EXIT internal function is special, because we drop it during inlining, so for fnsplit we need to be careful about it, otherwise we can end up with unbalanced pairs of tsan entry/exit marker functions. This patch gives up unless it is one of the two most common cases with -fsanitize=thread - either TSAN_FUNC_EXIT in the return bb, which is handled as accepting it as return bb despite the TSAN_FUNC_EXIT call - duplicating the TSAN_FUNC_EXIT is exactly the right thing to do in that case, we want it in the *.part.* function and also in the caller, if the former is later inlined into the latter again, the TSAN_FUNC_EXIT of the former is dropped. And another case that we handle is if there is TSAN_FUNC_EXIT on the predecessors of return_bb - we can handle that case easily too by duplicating TSAN_FUNC_EXIT after the *.part.* call. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-03-18 Jakub Jelinek PR sanitizer/65400 * ipa-split.c (find_return_bb): Allow TSAN_FUNC_EXIT internal call in the return bb. (find_split_points): Add RETURN_BB argument, don't call find_return_bb. (split_function): Likewise. Add ADD_TSAN_FUNC_EXIT argument, if true append TSAN_FUNC_EXIT internal call after the call to the split off function. (execute_split_functions): Call find_return_bb here. Don't optimize if TSAN_FUNC_EXIT is found in unexpected places. Adjust find_split_points and split_function calls. * c-c++-common/tsan/pr65400-1.c: New test. * c-c++-common/tsan/pr65400-2.c: New test. --- gcc/ipa-split.c.jj 2015-03-17 18:10:11.0 +0100 +++ gcc/ipa-split.c 2015-03-18 17:12:45.017773858 +0100 @@ -769,7 +769,8 @@ consider_split (struct split_point *curr of the form: = tmp_var; return - but return_bb can not be more complex than this. + but return_bb can not be more complex than this (except for + -fsanitize=thread we allow TSAN_FUNC_EXIT () internal call in there). If nothing is found, return the exit block. When there are multiple RETURN statement, chose one with return value, @@ -814,6 +815,13 @@ find_return_bb (void) found_return = true; retval = gimple_return_retval (return_stmt); } + /* For -fsanitize=thread, allow also TSAN_FUNC_EXIT () in the return +bb. */ + else if ((flag_sanitize & SANITIZE_THREAD) + && is_gimple_call (stmt) + && gimple_call_internal_p (stmt) + && gimple_call_internal_fn (stmt) == IFN_TSAN_FUNC_EXIT) + ; else break; } @@ -1074,12 +1082,11 @@ typedef struct the component used by consider_split. */ static void -find_split_points (int overall_time, int overall_size) +find_split_points (basic_block return_bb, int overall_time, int overall_size) { stack_entry first; vec stack = vNULL; basic_block bb; - basic_block return_bb = find_return_bb (); struct split_point current; current.header_time = overall_time; @@ -1236,19 +1243,20 @@ insert_bndret_call_after (tree retbnd, t gimple_call_set_lhs (bndret, retbnd); gsi_insert_after (gsi, bndret, GSI_CONTINUE_LINKING); } + /* Split function at SPLIT_POINT. */ static void -split_function (struct split_point *split_point) +split_function (basic_block return_bb, struct split_point *split_point, + bool add_tsan_func_exit) { vec args_to_pass = vNULL; bitmap args_to_skip; tree parm; int num = 0; cgraph_node *node, *cur_node = cgraph_node::get (current_function_decl); - basic_block return_bb = find_return_bb (); basic_block call_bb; - gcall *call; + gcall *call, *tsan_func_exit_call = NULL; edge e; edge_iterator ei; tree retval = NULL, real_retval = NULL, retbnd = NULL; @@ -1534,11 +1542,18 @@ split_function (struct split_point *spli || DECL_BY_REFERENCE (DECL_RESULT (current_function_decl gimple_call_set_return_slot_opt (call, true); + if (add_tsan_func_exit) +tsan_func_exit_call = gimple_build_call_internal (IFN_TSAN_FUNC_EXIT, 0); + /* Update return value. This is bit tricky. When we do not return, do nothing. When we return we might need to update return_bb or produce a new return statement. */ if (!split_part_return_p) -gsi_insert_after (&gsi, call, GSI_NEW_STMT); +{ + gsi_insert_after (&gsi, call, GSI_NEW_STMT); + if (tsan_func_exit_call) + gsi_insert_after (&gsi, tsan_func_exit_call, GSI_NEW_STMT); +} else { e = make_edge (call_bb, return_bb, @@ -1642,6 +1657,8 @@ split_function (struct split_point *spli } else gsi_insert_after (&gsi, call, GSI_NEW_STMT); + if (tsan_func_exit_call) + gsi_insert_after (&gsi, tsan_func_exit_call, GSI_NEW_STMT); } /* We don't use return block (there is either no return in function or multiple o
Re: [debug-early] Cache DW_TAG_label's if appropriate
On 03/18/2015 01:59 PM, Jason Merrill wrote: On 03/18/2015 01:03 PM, Aldy Hernandez wrote: By the way Jason, is there any reason we previously called equate_decl_number_to_die() only for abstract instances?: So that multiple concrete instances can find the abstract instance. Ah, I see. Why isn't DECL_ABSTRACT_P set the first time you get here for the label in question? Oh, that part was working fine. The testcase that prompted this whole thing had nothing to do with DECL_ABSTRACT_P. I was just curious if I could remove the call to equate_decl_number_to_die, but I see it's needed. I don't see anything else in dwarf2out.c doing a lookup_decl_die() on a LABEL_DECL, so I don't even know why the equate_decl_number_to_die() is even there. add_abstract_origin_attribute uses lookup_decl_die. Thanks. Aldy
[PATCH v2] New testcase to check parameter passing bug
Hi, I have modified the test-case to check parameter passing bug based on the comments from Kyrill Tkachov, Christophe Lyon, and Segher Boessenkool as follows: 1. move from "gcc.target/arm" to "gcc.dg" 2. change "dg-do compile" to "dg-do run" Please let me know if there's still something to fix more. Thanks for your comment. Honggyu --- gcc/testsuite/ChangeLog|4 gcc/testsuite/gcc.dg/pr65358.c | 33 + 2 files changed, 37 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr65358.c diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 77d24a1..218f908 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,7 @@ +2015-03-19 Honggyu Kim + + * gcc.dg/pr65358.c: New test. + 2015-03-18 Paolo Carlini PR c++/59816 diff --git a/gcc/testsuite/gcc.dg/pr65358.c b/gcc/testsuite/gcc.dg/pr65358.c new file mode 100644 index 000..ba89fd4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr65358.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +struct pack +{ + int fine; + int victim; + int killer; +}; + +int __attribute__ ((__noinline__, __noclone__)) +bar (int a, int b, struct pack p) +{ + if (a != 20 || b != 30) +__builtin_abort (); + if (p.fine != 40 || p.victim != 50 || p.killer != 60) +__builtin_abort (); + return 0; +} + +int __attribute__ ((__noinline__, __noclone__)) +foo (int arg1, int arg2, int arg3, struct pack p) +{ + return bar (arg2, arg3, p); +} + +int main (void) +{ + struct pack p = { 40, 50, 60 }; + + (void) foo (10, 20, 30, p); + return 0; +} -- 1.7.9.5
Re: [PATCH v2] New testcase to check parameter passing bug
On Thu, Mar 19, 2015 at 10:40:44AM +0900, Honggyu Kim wrote: > --- > gcc/testsuite/ChangeLog|4 > gcc/testsuite/gcc.dg/pr65358.c | 33 + > 2 files changed, 37 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/pr65358.c > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 77d24a1..218f908 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > I'm kind of new to gcc community and I'm used to send a patch throught git send-email. If you have other preferable way, please let me know. I will try to follow the format and style more like other gcc patches. Thanks you all. Honggyu
Re: [PATCH] Fix PR ipa/65439
> Hi. > > Sorry for previous mail, which has wrongly assigned To address. Original > message: > > > This is fix for PR65439 which extends number of expected equivalences for IPA > ICF that > can be different on darwin target. > > Ready for trunk? OK Honza > Thanks > Martin > >From 1fb6c73e6ee74a2923aaccbac9658b46005aeba5 Mon Sep 17 00:00:00 2001 > From: mliska > Date: Wed, 18 Mar 2015 09:52:29 +0100 > Subject: [PATCH] Fix PR65439. > > gcc/testsuite/ChangeLog: > > 2015-03-18 Martin Liska > > PR ipa/65439 > * g++.dg/ipa/ipa-icf-4.C: Extend expected number of > equivalences either to 6 or 7. > --- > gcc/testsuite/g++.dg/ipa/ipa-icf-4.C | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C > b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C > index 2cd7a2e..e5d3123 100644 > --- a/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C > +++ b/gcc/testsuite/g++.dg/ipa/ipa-icf-4.C > @@ -44,5 +44,5 @@ int main() > } > > /* { dg-final { scan-ipa-dump "\(Unified; Variable alias has been > created\)|\(Symbol aliases are not supported by target\)" "icf" } } */ > -/* { dg-final { scan-ipa-dump "Equal symbols: 6" "icf" } } */ > +/* { dg-final { scan-ipa-dump "Equal symbols: \[67\]" "icf" } } */ > /* { dg-final { cleanup-ipa-dump "icf" } } */ > -- > 2.1.2 >
[rl78] fix ICE with far operands to and/ior
Committed. * config/rl78/rl78-virt.md (andqi3_virt): Allow far operands. (iorqi3_virt): Likewise. Index: config/rl78/rl78-virt.md === --- config/rl78/rl78-virt.md(revision 221505) +++ config/rl78/rl78-virt.md(working copy) @@ -128,23 +128,23 @@ "rl78_virt_insns_ok () && !TARGET_G10" "v.mulu\t%0, %2" [(set_attr "valloc" "umul")] ) (define_insn "*andqi3_virt" - [(set (match_operand:QI 0 "rl78_nonfar_nonimm_operand" "=vm") - (and:QI (match_operand:QI 1 "rl78_nonfar_operand" "vim") + [(set (match_operand:QI 0 "rl78_nonimmediate_operand" "=vm") + (and:QI (match_operand:QI 1 "rl78_general_operand" "vim") (match_operand:QI 2 "rl78_general_operand" "vim"))) ] "rl78_virt_insns_ok ()" "v.and\t%0, %1, %2" ) (define_insn "*iorqi3_virt" - [(set (match_operand:QI 0 "rl78_nonfar_nonimm_operand" "=vm") - (ior:QI (match_operand:QI 1 "rl78_nonfar_operand" "vim") + [(set (match_operand:QI 0 "rl78_nonimmediate_operand" "=vm") + (ior:QI (match_operand:QI 1 "rl78_general_operand" "vim") (match_operand:QI 2 "rl78_general_operand" "vim"))) ] "rl78_virt_insns_ok ()" "v.or\t%0, %1, %2" )
[PINGv2][PATCH] ASan on unaligned accesses
On 03/04/2015 11:07 AM, Andrew Pinski wrote: On Wed, Mar 4, 2015 at 12:00 AM, Marat Zakirov wrote: Hi all! Here is the patch which forces ASan to work on memory access without proper alignment. it's useful because some programs like linux kernel often cheat with alignment which may cause false negatives. This patch needs additional support for proper work on unaligned accesses in global data and heap. It will be implemented in libsanitizer by separate patch. --Marat gcc/ChangeLog: 2015-02-25 Marat Zakirov * asan.c (asan_emit_stack_protection): Support for misalign accesses. (asan_expand_check_ifn): Likewise. * params.def: New option asan-catch-misaligned. * params.h: New param ASAN_CATCH_MISALIGNED. Since this parameter can only be true or false, I think it should be a normal option. Also you did not add documentation of the param. Thanks, Andrew Fixed. gcc/ChangeLog: 2015-03-12 Marat Zakirov * asan.c (asan_emit_stack_protection): Support for misalign accesses. (asan_expand_check_ifn): Likewise. * common.opt: New flag -fasan-catch-misaligned. * doc/invoke.texi: New flag description. * opts.c (finish_options): Add check for new flag. (common_handle_option): Switch on flag if SANITIZE_KERNEL_ADDRESS. gcc/testsuite/ChangeLog: 2015-03-12 Marat Zakirov * c-c++-common/asan/misalign-catch.c: New test. diff --git a/gcc/asan.c b/gcc/asan.c index 9e4a629..80bf2e8 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -1050,7 +1050,6 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, rtx_code_label *lab; rtx_insn *insns; char buf[30]; - unsigned char shadow_bytes[4]; HOST_WIDE_INT base_offset = offsets[length - 1]; HOST_WIDE_INT base_align_bias = 0, offset, prev_offset; HOST_WIDE_INT asan_frame_size = offsets[0] - base_offset; @@ -1193,11 +1192,37 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, if (STRICT_ALIGNMENT) set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode))); prev_offset = base_offset; + + vec shadow_mems; + vec shadow_bytes; + + shadow_mems.create(0); + shadow_bytes.create(0); + for (l = length; l; l -= 2) { if (l == 2) cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT; offset = offsets[l - 1]; + if (l != length && flag_asan_catch_misaligned) + { + HOST_WIDE_INT aoff + = base_offset + ((offset - base_offset) + & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1)) + - ASAN_RED_ZONE_SIZE; + if (aoff > prev_offset) + { + shadow_mem = adjust_address (shadow_mem, VOIDmode, + (aoff - prev_offset) + >> ASAN_SHADOW_SHIFT); + prev_offset = aoff; + shadow_bytes.safe_push (0); + shadow_bytes.safe_push (0); + shadow_bytes.safe_push (0); + shadow_bytes.safe_push (0); + shadow_mems.safe_push (shadow_mem); + } + } if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1)) { int i; @@ -1212,13 +1237,13 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, if (aoff < offset) { if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1) - shadow_bytes[i] = 0; + shadow_bytes.safe_push (0); else - shadow_bytes[i] = offset - aoff; + shadow_bytes.safe_push (offset - aoff); } else - shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL; - emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes)); + shadow_bytes.safe_push (ASAN_STACK_MAGIC_PARTIAL); + shadow_mems.safe_push(shadow_mem); offset = aoff; } while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE) @@ -1227,12 +1252,21 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, (offset - prev_offset) >> ASAN_SHADOW_SHIFT); prev_offset = offset; - memset (shadow_bytes, cur_shadow_byte, 4); - emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes)); + shadow_bytes.safe_push (cur_shadow_byte); + shadow_bytes.safe_push (cur_shadow_byte); + shadow_bytes.safe_push (cur_shadow_byte); + shadow_bytes.safe_push (cur_shadow_byte); + shadow_mems.safe_push(shadow_mem); offset += ASAN_RED_ZONE_SIZE; } cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE; } + for (unsigned i = 0; flag_asan_catch_misaligned && i < shadow_bytes.length () - 1; i++) +if (shadow_bytes[i] == 0 && shadow_bytes[i + 1] > 0) + shadow_bytes[i] = 8 + (shadow_bytes[i + 1] > 7 ? 0 : shadow_bytes[i + 1]); + for (unsigned i = 0; i < shadow_mems.length (); i++) +emit_move_insn (shadow_mems[i], asan_shadow_cst (&shadow_bytes[i * 4])); + do_pending_stack_adjust (); /* Construct epilogue sequence. */ @@ -1285,34 +1319,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, if (STRICT_ALIGNMENT) set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode))); - prev_offset = base_offset; - last_offset = base_offset; - last_size = 0; - for (l = length; l; l -= 2) -{ - offset = ba