Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading
On Sat, May 6, 2017 at 5:03 PM, Jeff Law wrote: > This is the 2nd of 3-5 patches to address pr78496. > > Jump threading will examine ASSERT_EXPRs as it walks the IL and record > information from those ASSERT_EXPRs into the available expression and > const/copies tables where they're later used to simplify conditionals. > > We're missing a couple things though. > > First an ASSERT_EXPR with an EQ test creates an equality we can enter into > the const/copy tables. We were failing to do that. This is most > interesting when the RHS of the condition in the ASSERT_EXPR is a constant, > obviously. This has a secondary benefit of doing less work to get better > optimization. > > Second, some ASSERT_EXPRs may start off as a relational test such as x <= 0, > but after range extraction and propagation they could be simplified into an > equality comparison. We already do this with conditionals and generalizing > that code to handle ASSERT_EXPRs is pretty easy. This gives us more chances > to extract simple equivalences from the condition in ASSERT_EXPRs. > > This patch fixes those 2 problems. I don't think it directly helps pr78496 > right now as we're unable to pick up the newly exposed jump threads until > VRP2. But that's a short term limitation that I've already addressed > locally :-) > > Bootstrapped, regression tested and installed on the trunk. > > jeff > > > ps. An astute observer might note that improving the effectiveness of VRP > jump threading seems counterproductive since I've stated I want to remove > VRP jump threading. These improvements don't significantly change how I was > planning to do that or the infrastructure we're going to rely upon to make > that change. All that changes is where we get the information from -- > ASSERT_EXPRs vs querying a new API that generates the same information as > needed. That API is already there. It's called register_edge_assert_for (it might need a wrapper to be most useful and also needs to be exported / moved from VRP to somewhere else). Both VRP and EVRP use it to "compute" ASSERT_EXPRs. So I'm not sure if changing VRP with your patches is a good thing when you could have used the new API in the first place ... Richard. > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index b0c253b09ae..0f78f2a2ed1 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,12 @@ > +2017-05-06 Jeff Law > + > + PR tree-optimization/78496 > + * tree-vrp.c (simplify_assert_expr_using_ranges): New function. > + (simplify_stmt_using_ranges): Call it. > + (vrp_dom_walker::before_dom_children): Extract equivalences > + from an ASSERT_EXPR with an equality comparison against a > + constant. > + > 2017-05-06 Richard Sandiford > > * lra-constraints.c (lra_copy_reg_equiv): New function. > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index fca5b87e798..42782a6d17c 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,9 @@ > +2017-05-06 Jeff Law > + > + PR tree-optimization/78496 > + * gcc.dg/tree-ssa/ssa-thread-16.c: New test. > + * gcc.dg/tree-ssa/ssa-thread-17.c: New test. > + > 2017-05-06 Richard Sandiford > > * gcc.target/aarch64/spill_1.c: New test. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c > new file mode 100644 > index 000..78c349ca14d > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c > @@ -0,0 +1,38 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ > + > +/* We should thread the if (exp == 2) conditional on the > + the path from inside the if (x) THEN arm. It is the only > + jump threading opportunity in this code. */ > + > +/* { dg-final { scan-tree-dump-times "Threaded" 1 "vrp1" } } */ > + > + > +extern void abort (void) __attribute__ ((__nothrow__, __leaf__)) > + __attribute__ ((__noreturn__)); > + > +int x; > + > + > +int code; > +void > +do_jump (int exp) > +{ > + switch (code) > +{ > +case 4: > + if ((exp) == 1) > + goto normal; > + if (x) > + { > + if (exp != 0) > + abort (); > + } > + if ((exp) == 2) > + goto normal; > +case 3: > + abort (); > +} > + normal: > + ; > +} > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c > new file mode 100644 > index 000..692658fbb4b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c > @@ -0,0 +1,36 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ > + > +/* We should simplify one ASSERT_EXPR from a relational > + into an equality test. */ > +/* { dg-final { scan-tree-dump-times "Folded > into:\[^\r\n\]*ASSERT_EXPR\*\[^\r\n\]* == 1" 1 "vrp1" } } */ > + > +/* And simplification of the ASSERT_EXPR leads to a jump threading > opportunity. */ > +/* {
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On Fri, May 5, 2017 at 10:53 PM, Jeff Law wrote: > On 05/04/2017 08:37 AM, Jeff Law wrote: >> >> >> You understanding is slightly wrong however, The ASSERT_EXPRs and >> conditionals map 100% through propagation and into simplification. It's >> only during simplification that we lose the direct mapping as we change the >> conditional in order to remove the unnecessary type conversion. Threading >> runs after simplification. >> >> Another approach here would be to simplify the ASSERT_EXPR in the same >> manner that we simplify the conditional. That may even be better for >> various reasons in the short term. Let me poke at that. > > Now I remember why I didn't do that (simplify the ASSERT_EXPR). It > doesn't work :-) > > So given an ASSERT_EXPR like: > > v1_378 = ASSERT_EXPR ; > > Let's assume for the sake of argument v1_179 was set by a type cast from > xx_1. We can simplify that ASSERT_EXPR into > > v1_378 = ASSERT_EXPR ; > > But note we can not change operand 0 of the ASSERT_EXPR. That would change > the *type* of the ASSERT_EXPR and blows up all kinds of things later. So > the ASSERT_EXPR at best can morph into this form: > > v1_378 = ASSERT_EXPR ; > > When the threader wants to look for an ASSERT_EXPR that creates a range for > an object, it does lookups based on a match of operand 0 without digging > into the expression in operand 1. > > That's something we may want to change in the future (it plays into issues > that arise in patch #3) but in the immediate term it's still best to defer > that one special case of tweaking a GIMPLE_COND. Note that I tried last stage3 (it ended up being too late) to get rid of ASSERT_EXPRs doing substitute-and-fold itself (basically copy-propagate them out at this point rather than as a separate thing later). This is because the ASSERT_EXPR uses interfere with the single_use checks in match.pd patterns and thus are actually harmful. The barrier I ran into was the ASSERT_EXPR use by the threader ... so now you're making us rely even more on those :/ Richard. > Jeff
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. i.e. because all values of int_type are also valid values of char_type we cannot meet the requirement that: "The member eof() shall return an implementation-defined constant that cannot appear as a valid UTF-16 code unit." I've reported this as a defect, suggesting that the wording above needs to change. One consequence is that basic_streambuf::sputc(u'\u') always returns the same value, whether it succeeds or not. On success it returns to_int_type(u'\u') and on failure it returns eof(), which is the same value. I think that can be solved with the attached change, which preserves the invariant in [char.traits.require] that eof() returns: "a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for all values c." This can be true if we ensure that to_int_type never returns the eof() value. http://www.unicode.org/faq/private_use.html#nonchar10 suggests doing something like this. It means that when writing u'\u' to a streambuf we write that character successfully, but return u'\uFFFD' instead; and when reading u'\u' from a streambuf we return u'\uFFFD' instead. This is asymmetrical, as we can write that character but not read it back. It might be better to refuse to write u'\u' and write it as the replacement character instead, but I think I prefer to write the right character when possible. It also doesn't require any extra changes. All tests pass with this, does anybody see any problems with this approach? Sounds scary to me. As an application programmer, I'd expect to be able to use chart16_t based streams to read and write arbitrary sequences of Unicode code points (encoded as sequences of UTF-16 code units). (Think of an application temporarily storing internal strings to a disk file.) Also, I'd be surprised to find this asymmetric behavior only for U+ and not for other noncharacters, and only for char16_t and not for char32_t. To me, the definition of char16_t's int_type and eof() sounds like a bug that needs fixing, not working around?
[PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests
Hi, This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS instructions. gcc/testsuite/ * gcc.target/i386/avx512f-vaddsd-2.c: Test fixed. * gcc.target/i386/avx512f-vaddss-2.c: Ditto. * gcc.target/i386/avx512f-vsubsd-2.c: Ditto. * gcc.target/i386/avx512f-vsubss-2.c: Ditto. Is it ok for trunk? Thanks, Sebastian ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch Description: ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch
Re: [PATCH] prevent -Wno-system-headers from suppressing -Wstringop-overflow (PR 79214)
I see these regressions on ia64: FAIL: c-c++-common/Wsizeof-pointer-memaccess2.c -std=gnu++11 (test for excess errors) Excess errors: /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] /usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' reading 8 bytes from a region of size 1 [-Wstringop-overflow=] Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr
On Sun, 7 May 2017, Trevor Saunders wrote: On Sun, May 07, 2017 at 07:32:48PM +0200, Marc Glisse wrote: On Sun, 7 May 2017, tbsaunde+...@tbsaunde.org wrote: This is a start at warning about various resource allocation issues that can be improved. Currently it only warns about functions that call malloc and then always pass the resulting pointer to free(). It should be pretty simple to extend this to new/delete and new[]/delete[], as well as checking that in simple cases the pairing is correct. However it wasn't obvious to me how to tell if a function call is to a allocating operator new. We probably don't want to warn about placement new since complicated things can be happening there. There is the DECL_IS_OPERATOR_NEW() but that doesn't seem to cover class specific operator new overloads, and maybe even custom ones at global scope? Other things that may be reasonable to try and include would be open / close and locks. It might also be good to warn about functions that take or return unique ownership of resources, but I'm not quiet sure how to handle functions that allocate or deallocate shared resources. Interesting. 1) How do you avoid warning for code that already uses unique_ptr? After inlining, it will look just like code calling new, and delete on all paths. Currently by running early enough that no inline has happened. I'm not sure how valuable it would be to try and find things post inlining instead. Ah, right, for the warning it indeed makes sense to be pre-inlining, as it isn't clear you could use unique_ptr if the call to free was not directly in this function. Well, currently you only handle malloc/free, but if you provide an inline implementation of new/delete that forwards to malloc/free, the issue should already appear. Or maybe the pass is very early? But then the compiler will likely seldom notice that the argument to free is actually the return of malloc. I wonder how true this last bit is, sure we could find some more things after more optimization, but its not obvious to me that it would be so much to really be worth it. Looking at gcc itself there's plenty of places where this should already be enough with tracking places where the pointer is coppied from one ssa name to another. I feel like the big areas of improvement are warning about functions that take or return ownership of things. I fear that there are many cases where you need more PRE / DOM / SRA / ... for the SSA_NAME to end up in a place where malloc and free match, but if you already detect some cases, that's good. 2) In some sense, we would want to warn when some path is missing the call to free, but it seems very hard to distinguish the cases where it is accidentally missing from those where it is done on purpose. Maybe detect cases where the only paths missing free are through exceptions? I didn't think about exceptions, but I'd been thinking of trying to warn about variables we prove don't escape in alias analysis, but that would probably be a small set of all allocations. 3) This seems very similar to the conditions needed to transform a call to malloc into a stack allocation. I was experimenting with this some years ago, see for instance https://gcc.gnu.org/ml/gcc-patches/2013-11/msg01049.html (it is probably not the most recent message on the subject, since the main code moved to CCP afterwards http://geometrica.saclay.inria.fr/team/Marc.Glisse/tmp/heapstack, but I can't locate more recent messages at the moment, and it was more a prototype than a ready patch). Do you think the infrastructure could be shared between your warning and such an optimization? Or are the goals likely to diverge too soon for sharing to be practical? yeah, some of the issues are certainly similar. I'm not sure if there would be things worth sharing probably easiest to find out by just trying it. It seems that the differences will be larger than I originally expected, so it isn't clear the gain would be worth it. While I remember, one (easy) case you may want to handle is: p = malloc(n); if (p != 0) ... where the else branch does not need to contain a call to free. (maybe one can add strdup and similar functions later, but better start with few functions for now) -- Marc Glisse
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. I think the real bug is that char_traits::int_type is just plain wrong. It has to be a signed integer, and capable of representing values in the range 0 .. 65535. char_traits has a similar problem. char_traits should be fine on glibc because WEOF is reserved, something that is probably not the case for char32_t. Thanks, Florian
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 08/05/17 09:52 +0200, Stephan Bergmann via libstdc++ wrote: On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. i.e. because all values of int_type are also valid values of char_type we cannot meet the requirement that: "The member eof() shall return an implementation-defined constant that cannot appear as a valid UTF-16 code unit." I've reported this as a defect, suggesting that the wording above needs to change. One consequence is that basic_streambuf::sputc(u'\u') always returns the same value, whether it succeeds or not. On success it returns to_int_type(u'\u') and on failure it returns eof(), which is the same value. I think that can be solved with the attached change, which preserves the invariant in [char.traits.require] that eof() returns: "a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for all values c." This can be true if we ensure that to_int_type never returns the eof() value. http://www.unicode.org/faq/private_use.html#nonchar10 suggests doing something like this. It means that when writing u'\u' to a streambuf we write that character successfully, but return u'\uFFFD' instead; and when reading u'\u' from a streambuf we return u'\uFFFD' instead. This is asymmetrical, as we can write that character but not read it back. It might be better to refuse to write u'\u' and write it as the replacement character instead, but I think I prefer to write the right character when possible. It also doesn't require any extra changes. All tests pass with this, does anybody see any problems with this approach? Sounds scary to me. As an application programmer, I'd expect to be able to use chart16_t based streams to read and write arbitrary sequences of Unicode code points (encoded as sequences of UTF-16 code units). (Think of an application temporarily storing internal strings to a disk file.) Also, I'd be surprised to find this asymmetric behavior only for U+ and not for other noncharacters, and only for char16_t and not for char32_t. To me, the definition of char16_t's int_type and eof() sounds like a bug that needs fixing, not working around? Fixing that would require changing the standard and breaking the ABI of all existing implementations. I've opened a defect report against that standard, but a change that requires an ABI break isn't likely to be popular. Changing the semantics of to_int_type for U+ is far less likely to affect any ABIs (it's a constexpr function so it's possible somebody is using the value of to_int_type(char_type(-1)) as a template argument, but it seems unlikely. It's a much smaller change, "allowed" by http://www.unicode.org/faq/private_use.html#nonchar10 and it only affects a noncharacter that is not intended for interchange anyway. I'm not claiming it's ideal, but it fixes a bug today.
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote: On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. I think the real bug is that char_traits::int_type is just plain wrong. It has to be a signed integer, Why does it have to be signed? and capable of representing values in the range 0 .. 65535. char_traits has a similar problem. char_traits should be fine on glibc because WEOF is reserved, something that is probably not the case for char32_t. I think there are 32-bit values which are not valid UTF-32 code points, including char32_t(-1) which we use for EOF.
[C++ Patch Ping] Re: [C++ Patch/RFC] PR 80145
Hi, gently pinging this... On 23/03/2017 20:07, Paolo Carlini wrote: Hi, this ICE on invalid code isn't a regression, thus a patch probably doesn't qualify for Stage 4, but IMHO I made good progress on it and I'm sending what I have now anyway... The ICE happens during error recovery after a sensible diagnostic for the first declaration in: auto* foo() { return 0; } auto* foo(); After the error, finish_function does: apply_deduced_return_type (fndecl, void_type_node); fntype = TREE_TYPE (fndecl); which then is inconsistent with the auto* return type of the second declaration and leads to an ICE in merge_types (which duplicate_decls thought was safe to call because types_match is true (evidently: decls_match uses fndecl_declared_return_type)). Thus, in terms of error recovery, I think that in cases like the one at issue it makes sense not to replace auto* after the error and leave the return type untouched: certainly the below passes testing and avoids ICEing on the testcase at issue and a variant of it. Thanks, Paolo.
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 05/08/2017 12:24 PM, Jonathan Wakely wrote: On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote: On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. I think the real bug is that char_traits::int_type is just plain wrong. It has to be a signed integer, Why does it have to be signed? Hmm. Maybe it's not strictly required. int_type(-1) as a distinct value is likely sufficient. and capable of representing values in the range 0 .. 65535. char_traits has a similar problem. char_traits should be fine on glibc because WEOF is reserved, something that is probably not the case for char32_t. I think there are 32-bit values which are not valid UTF-32 code points, including char32_t(-1) which we use for EOF. I'm not sure if char32_t is restricted to UTF-32 codepoints (the standard does not say, I think). But even UCS-4 is 31-bit only, so maybe the problem does not arise there. Thanks, Florian
Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr
On Mon, May 08, 2017 at 10:28:13AM +0200, Marc Glisse wrote: > On Sun, 7 May 2017, Trevor Saunders wrote: > > > On Sun, May 07, 2017 at 07:32:48PM +0200, Marc Glisse wrote: > > > On Sun, 7 May 2017, tbsaunde+...@tbsaunde.org wrote: > > > > > > > This is a start at warning about various resource allocation issues > > > > that can be > > > > improved. Currently it only warns about functions that call malloc and > > > > then > > > > always pass the resulting pointer to free(). > > > > > > > > It should be pretty simple to extend this to new/delete and > > > > new[]/delete[], as > > > > well as checking that in simple cases the pairing is correct. However > > > > it > > > > wasn't obvious to me how to tell if a function call is to a allocating > > > > operator > > > > new. We probably don't want to warn about placement new since > > > > complicated > > > > things can be happening there. There is the DECL_IS_OPERATOR_NEW() but > > > > that > > > > doesn't seem to cover class specific operator new overloads, and maybe > > > > even > > > > custom ones at global scope? > > > > > > > > Other things that may be reasonable to try and include would be open / > > > > close > > > > and locks. > > > > > > > > It might also be good to warn about functions that take or return unique > > > > ownership of resources, but I'm not quiet sure how to handle functions > > > > that > > > > allocate or deallocate shared resources. > > > > > > Interesting. > > > > > > 1) How do you avoid warning for code that already uses unique_ptr? After > > > inlining, it will look just like code calling new, and delete on all > > > paths. > > > > Currently by running early enough that no inline has happened. I'm not > > sure how valuable it would be to try and find things post inlining > > instead. > > Ah, right, for the warning it indeed makes sense to be pre-inlining, as it > isn't clear you could use unique_ptr if the call to free was not directly in > this function. well, you probably could, you'd just need to pass it around some. > > > Well, currently you only handle malloc/free, but if you provide an inline > > > implementation of new/delete that forwards to malloc/free, the issue > > > should > > > already appear. Or maybe the pass is very early? But then the compiler > > > will > > > likely seldom notice that the argument to free is actually the return of > > > malloc. > > > > I wonder how true this last bit is, sure we could find some more things > > after more optimization, but its not obvious to me that it would be so > > much to really be worth it. Looking at gcc itself there's plenty of > > places where this should already be enough with tracking places where > > the pointer is coppied from one ssa name to another. I feel like the > > big areas of improvement are warning about functions that take or return > > ownership of things. > > I fear that there are many cases where you need more PRE / DOM / SRA / ... > for the SSA_NAME to end up in a place where malloc and free match, but if > you already detect some cases, that's good. So, I've been using a hacked up version of this to find places to use auto_bitmap. I ended up following places where one ssa name is assigned to another, and moving this pass after the ccp within all_early_optimizations. The pass location change was because at the earlier location there was places where the ssa name from the allocation was assigned to a vAR_DECL, that was ssa'd away at the later point. So it may actually be that optimization helps more than I thought, but I couldn't see an immediate reason we needed to be still using variables instead of ssa names in the one spot I looked at. > > > 2) In some sense, we would want to warn when some path is missing the call > > > to free, but it seems very hard to distinguish the cases where it is > > > accidentally missing from those where it is done on purpose. Maybe detect > > > cases where the only paths missing free are through exceptions? > > > > I didn't think about exceptions, but I'd been thinking of trying to warn > > about variables we prove don't escape in alias analysis, but that would > > probably be a small set of all allocations. > > > > > 3) This seems very similar to the conditions needed to transform a call to > > > malloc into a stack allocation. I was experimenting with this some years > > > ago, see for instance > > > https://gcc.gnu.org/ml/gcc-patches/2013-11/msg01049.html (it is probably > > > not > > > the most recent message on the subject, since the main code moved to CCP > > > afterwards > > > http://geometrica.saclay.inria.fr/team/Marc.Glisse/tmp/heapstack, > > > but I can't locate more recent messages at the moment, and it was more a > > > prototype than a ready patch). Do you think the infrastructure could be > > > shared between your warning and such an optimization? Or are the goals > > > likely to diverge too soon for sharing to be practical? > > > > yeah, some of the issues are certain
Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns
Ping. Thanks, Kyrill On 24/04/17 10:37, Kyrill Tkachov wrote: Pinging this back into context so that I don't forget about it... https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html Thanks, Kyrill On 28/02/17 12:29, Kyrill Tkachov wrote: Hi all, For the testcase in this patch we currently generate: foo: mov w1, 0 ldaxr w2, [x0] cmp w2, 3 bne .L2 stxrw3, w1, [x0] cmp w3, 0 .L2: csetw0, eq ret Note that the STXR could have been storing the WZR register instead of moving zero into w1. This is due to overly strict predicates and constraints in the store exclusive pattern and the atomic compare exchange expanders and splitters. This simple patch fixes that in the patterns concerned and with it we can generate: foo: ldaxr w1, [x0] cmp w1, 3 bne .L2 stxrw2, wzr, [x0] cmp w2, 0 .L2: csetw0, eq ret Bootstrapped and tested on aarch64-none-linux-gnu. Ok for GCC 8? Thanks, Kyrill 2017-02-28 Kyrylo Tkachov * config/aarch64/atomics.md (atomic_compare_and_swap expander): Use aarch64_reg_or_zero predicate for operand 4. (aarch64_compare_and_swap define_insn_and_split): Use aarch64_reg_or_zero predicate for operand 3. Add 'Z' constraint. (aarch64_store_exclusive): Likewise for operand 2. 2017-02-28 Kyrylo Tkachov * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
Re: [PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero
Ping. Thanks, Kyrill On 24/04/17 10:38, Kyrill Tkachov wrote: Pinging this back into context so that I don't forget about it... https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html Thanks, Kyrill On 08/03/17 16:35, Kyrill Tkachov wrote: Hi all, For the testcase in this patch where the value of x is zero we currently generate: foo: mov w1, 4 .L2: ldaxr w2, [x0] cmp w2, 0 bne .L3 stxrw3, w1, [x0] cbnzw3, .L2 .L3: csetw0, eq ret We currently cannot merge the cmp and b.ne inside the loop into a cbnz because we need the condition flags set for the return value of the function (i.e. the cset at the end). But if we re-jig the sequence in that case we can generate a tighter loop: foo: mov w1, 4 .L2: ldaxr w2, [x0] cbnzw2, .L3 stxrw3, w1, [x0] cbnzw3, .L2 .L3: cmp w2, 0 csetw0, eq ret So we add an explicit compare after the loop and inside the loop we use the fact that we're comparing against zero to emit a CBNZ. This means we may re-do the comparison twice (once inside the CBNZ, once at the CMP at the end), but there is now less code inside the loop. I've seen this sequence appear in glibc locking code so maybe it's worth adding the extra bit of complexity to the compare-exchange splitter to catch this case. Bootstrapped and tested on aarch64-none-linux-gnu. In previous iterations of the patch where I had gotten some logic wrong it would cause miscompiles of libgomp leading to timeouts in its testsuite but this version passes everything cleanly. Ok for GCC 8? (I know it's early, but might as well get it out in case someone wants to try it out) Thanks, Kyrill 2017-03-08 Kyrylo Tkachov * config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Emit CBNZ inside loop when doing a strong exchange and comparing against zero. Generate the CC flags after the loop. 2017-03-08 Kyrylo Tkachov * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.
Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()
On 08/05/17 12:52 +0200, Florian Weimer via libstdc++ wrote: On 05/08/2017 12:24 PM, Jonathan Wakely wrote: On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote: On 05/05/2017 07:05 PM, Jonathan Wakely wrote: As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda hinted at by http://wg21.link/lwg1200) there's a problem with char_traits::eof() because it returns int_type(-1) which is the same value as u'\u', a valid UTF-16 code point. I think the real bug is that char_traits::int_type is just plain wrong. It has to be a signed integer, Why does it have to be signed? Hmm. Maybe it's not strictly required. int_type(-1) as a distinct value is likely sufficient. Agreed. and capable of representing values in the range 0 .. 65535. char_traits has a similar problem. char_traits should be fine on glibc because WEOF is reserved, something that is probably not the case for char32_t. I think there are 32-bit values which are not valid UTF-32 code points, including char32_t(-1) which we use for EOF. I'm not sure if char32_t is restricted to UTF-32 codepoints (the standard does not say, I think). But even UCS-4 is 31-bit only, so maybe the problem does not arise there. It's not really clear what the encoding of char32_t is (see http://talesofcpp.fusionfenix.com/post-10/episode-seven-one-char-to-rule-them-all for a good analysis) but whether it's UCS-4 or UTF-32, U+ is not in the universal character set, so we can use 0x for char_traits::eof(). So I think only char_traits has this problem.
Re: [committed] Get rid of macros for diagnostic_report_current_module
On 05/05/2017 05:35 PM, David Malcolm wrote: Committed to trunk as r247664. gcc/ChangeLog: * diagnostic.c (last_module_changed_p): New function. (set_last_module): New function. Given that C++ modules-ts is a thing, I wonder if there's a better name for these functions? that might be less confusing down the road. nathan -- Nathan Sidwell
[C++ Patch] PR 80186 ("ICE on C++ code with invalid constructor...")
Hi, in order to avoid this error recovery issue I think we want to check the return value of grok_ctor_properties, as we do in decl.c, for its only other use. Tested x86_64-linux. Thanks, Paolo. // /cp 2017-05-08 Paolo Carlini PR c++/80186 * pt.c (tsubst_decl): Early return error_mark_node if grok_ctor_properties returns false. /testsuite 2017-05-08 Paolo Carlini PR c++/80186 * g++.dg/template/crash126.C: New. Index: cp/pt.c === --- cp/pt.c (revision 247733) +++ cp/pt.c (working copy) @@ -12407,8 +12407,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com if (DECL_CONSTRUCTOR_P (r) || DECL_DESTRUCTOR_P (r)) { maybe_retrofit_in_chrg (r); - if (DECL_CONSTRUCTOR_P (r)) - grok_ctor_properties (ctx, r); + if (DECL_CONSTRUCTOR_P (r) && !grok_ctor_properties (ctx, r)) + RETURN (error_mark_node); /* If this is an instantiation of a member template, clone it. If it isn't, that'll be handled by clone_constructors_and_destructors. */ Index: testsuite/g++.dg/template/crash126.C === --- testsuite/g++.dg/template/crash126.C(revision 0) +++ testsuite/g++.dg/template/crash126.C(working copy) @@ -0,0 +1,13 @@ +// PR c++/80186 + +template < class T, class > struct A +{ + A (); + A (A &); + A (A < T, T >); // { dg-error "invalid constructor" } +}; + +void f () +{ + A < int, int > (A < int, int >()); // { dg-error "cannot bind" } +}
Re: [PATCH 1/2] C++ template type diff printing
On 05/04/2017 07:44 PM, David Malcolm wrote: This patch kit implements two new options to make it easier to read diagnostics involving mismatched template types: -fdiagnostics-show-template-tree and -fno-elide-type. It adds two new formatting codes: %H and %I which are equivalent to %qT, but are to be used together for type comparisons e.g. "can't convert from %H to %I". While I can see why one might want %H and %I to imply quoting, it seems confusingly inconsistent with the other formatters, which require an explicit 'q'. rather than printing: could not convert 'std::map >()' from 'std::map >' to 'std::map >' with -felide-type (the default), it prints: could not convert 'std::map >()' from 'map<[...],vector>' to 'map<[...],vector> Neat. Is '-felide-type' a good name? Wouldn't something like '-fdiagnostic-elide-template-args' be better? With -fdiagnostics-show-template-tree a tree-like structure of the template is printed, showing the differences; in this case: 'tree' sounds implementor-speaky, Perhaps '-fdiagnostic-expand-template-args' or something? Right, bikeshedding out of the way ... --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -99,7 +99,47 @@ static void cp_diagnostic_starter (diagnostic_context *, diagnostic_info *); static void cp_print_error_function (diagnostic_context *, diagnostic_info *); + +/* Return true iff TYPE_A and TYPE_B are template types that are + meaningful to compare. */ + +static bool +comparable_template_types_p (tree type_a, tree type_b) +{ + if (TREE_CODE (type_a) != RECORD_TYPE) +return false; + if (TREE_CODE (type_b) != RECORD_TYPE) +return false; CLASS_TYPE_P (type_a) etc? + int len_max = MAX (len_a, len_b); + for (int idx = 0; idx < len_max; idx++) +{ + tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : NULL; + tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : NULL; + if (arg_a == arg_b) Should this use same_type_p for types? If direct comparison is correct, a comment would help. + Only non-default template arguments are printed. */ + +static void +print_template_tree_comparison (pretty_printer *pp, tree type_a, tree type_b, + for (int idx = 0; idx < len_max; idx++) +{ + tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : NULL; + tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : NULL; + if (arg_a == arg_b) Same question. If these have to be comparable template type, when can len_a and len_b be different? + This is called in phase 2 of pp_format, when it is accumulating + a series of formatted chunks. We stash the location of the chunk + we're meant to have written to, so that we can write to it in the + m_format_postprocessor hook. */ + +static void +defer_half_of_type_diff (pretty_printer *pp, char spec, tree type, If this is called 'phase 2' why not name the function for that, rather than 'half'? +case 'H': +case 'I': + { + defer_half_of_type_diff (pp, *spec, next_tree, buffer_ptr, verbose); Why not tell defer_half_of_type_diff whether it's doing the %H bit or the %I bit directly, rather than have it also check for H and I? I've not looked more than cursorily at the non C++ bits. nathan -- Nathan Sidwell
Re: [PATCH GCC8][11/33]New interfaces for tree affine
On Thu, May 4, 2017 at 5:21 PM, Bin.Cheng wrote: > On Mon, Apr 24, 2017 at 11:43 AM, Richard Biener > wrote: >> On Tue, Apr 18, 2017 at 12:43 PM, Bin Cheng wrote: >>> Hi, >>> This patch adds three simple interfaces for tree affine which will be used >>> in >>> cost computation later. >>> >>> Is it OK? >> >> >> +static inline tree >> +aff_combination_type (aff_tree *aff) >> >> misses a function comment. Please do not introduce new 'static inline' >> function in headers but instead use plain 'inline'. >> >> +/* Return true if AFF is simple enough. */ >> +static inline bool >> +aff_combination_simple_p (aff_tree *aff) >> +{ >> >> what is "simple"? Based on that find a better name. >> "singleton"? But aff_combination_const_p isn't >> simple_p (for whatever reason). > Patch updated. The one (13th) depending on this one is updated too. Ok with +inline bool +aff_combination_singleton_var_p (aff_tree *aff) +{ + gcc_assert (aff != NULL); this assert removed (aff->n will ICE anyway if NULL) + return (aff->n == 1 + && aff->offset == 0 + && (aff->elts[0].coef == 1 || aff->elts[0].coef == -1)); please adjust the comment to say "one (negated) singleton variable". Thanks, Richard. > Thanks, > bin >> >> Richard. >> >> >> >>> Thanks, >>> bin >>> 2017-04-11 Bin Cheng >>> >>> * tree-affine.h (aff_combination_type): New interface. >>> (aff_combination_const_p, aff_combination_simple_p): New interfaces.
Re: [PATCH GCC8][28/33]Don't count non-interger PHIs for register pressure
On Thu, May 4, 2017 at 5:33 PM, Bin.Cheng wrote: > On Wed, Apr 26, 2017 at 3:32 PM, Bin.Cheng wrote: >> On Wed, Apr 26, 2017 at 3:23 PM, Richard Biener >> wrote: >>> On Wed, Apr 26, 2017 at 3:37 PM, Bin.Cheng wrote: On Wed, Apr 26, 2017 at 2:32 PM, Richard Biener wrote: > On Tue, Apr 18, 2017 at 12:52 PM, Bin Cheng wrote: >> Hi, >> Given only integer variables are meaningful for register pressure >> estimation in IVOPTs, >> this patch skips non-integer type PHIs when counting register pressure. >> Is it OK? > > Huh. I suppose it only makes a difference because you are ignoring > POINTER_TYPE_P > IVs? At least I would be surprised if get_iv returns true for float > or vector PHIs (yeah, see > early out in get_iv)? So why exclude POINTER_TYPE_P IVs? Hmm, but if get_iv returns non-NULL, the phi won't be counted because loop is continued? Actually, all IV and invariants are skipped by checking get_iv, so this is only to skip floating point phis. >>> >>> Err, but AFAICS get_iv will return non-NULL for POINTER_TYPE_P IVs >>> which you then skip by your added >>> >>> + if (!INTEGRAL_TYPE_P (TREE_TYPE (op))) >>> + continue; >>> >>> thus float IVs are always skipped by means if get_iv returning NULL. >>> >>> Oh, the get_iv check continues for non-NULL result ... so it makes sense. >>> But still, why exclude POINTER_TYPE_P non-IV ops? >> POINTER_TYPE_P is simply an overlook, will update patch. > Here is updated version picking up POINTER_TYPE_P. Ok. Richard. > Thanks, > bin >> >> Thanks, >> bin >>> >>> Richard. >>> Thanks, bin > > Richard. > >> Thanks, >> bin >> >> 2017-04-11 Bin Cheng >> >> * tree-ssa-loop-ivopts.c (determine_set_costs): Skip non-interger >> when counting register pressure. >>
Re: [PATCH GCC8][17/33]Treat complex cand step as invriant expression
On Thu, May 4, 2017 at 7:55 PM, Bin.Cheng wrote: > On Wed, May 3, 2017 at 2:43 PM, Richard Biener > wrote: >> On Tue, Apr 18, 2017 at 12:46 PM, Bin Cheng wrote: >>> Hi, >>> We generally need to compute cand step in loop preheader and use it in loop >>> body. >>> Unless it's an SSA_NAME of constant integer, an invariant expression is >>> needed. >> >> I'm confused as to what this patch does. Comments talk about "Handle step >> as" >> but then we print "Depend on inv...". And we share bitmaps, well it seems >> >> + find_inv_vars (data, &step, &cand->inv_vars); >> + >> + iv_inv_expr_ent *inv_expr = get_loop_invariant_expr (data, step); >> + /* Share bitmap between inv_vars and inv_exprs for cand. */ >> + if (inv_expr != NULL) >> + { >> + cand->inv_exprs = cand->inv_vars; >> + cand->inv_vars = NULL; >> + if (cand->inv_exprs) >> + bitmap_clear (cand->inv_exprs); >> + else >> + cand->inv_exprs = BITMAP_ALLOC (NULL); >> + >> + bitmap_set_bit (cand->inv_exprs, inv_expr->id); >> >> just shares the bitmap allocation (and leaks cand->inv_exprs?). >> >> Note that generally it might be cheaper to use bitmap_head instead of >> bitmap in the various structures (and then bitmap_initialize ()), this >> saves one indirection. >> >> Anyway, the relation between inv_vars and inv_exprs is what confuses me. >> Maybe it's the same as for cost_pair? invariants vs. loop invariants? >> whatever that means... >> >> That is, can you expand the comments in cost_pair / iv_cand for inv_vars >> vs. inv_exprs, esp what "invariant" actually means? > When we represent use with cand, there will be computation which is > loop invariant. The invariant computation is an newly created > invariant expression and is based on ssa_vars existed before ivopts. > If the invariant expression is complicated, we handle and call it as > invariant expression. We say the cost_pair depends on the inv.exprs. > If the invariant expression is simple enough, we record all existing > ssa_vars it based on in inv_vars. We say the cost_pair depends on the > inv.vars. The same words stand for struct iv_cand. If cand.step is > simple enough, we simply record the ssa_var it based on in inv_vars, > otherwise, the step is a new invariant expression which doesn't exist > before, we record it in cand.inv_exprs. > > Add comment for inv_vars/inv_exprs, is this OK? I noticed there is a > redundant field cost_pair.inv_expr, I deleted it as obvious in a > standalone patch. Thanks for the explanation. Ok. Thanks, Richard. > Thanks, > bin >> >> Thanks, >> Richard. >> >>> Thanks, >>> bin >>> >>> 2017-04-11 Bin Cheng >>> >>> * tree-ssa-loop-ivopts.c (struct iv_cand): New field inv_exprs. >>> (dump_cand): Support iv_cand.inv_exprs. >>> (add_candidate_1): Record invariant exprs in iv_cand.inv_exprs >>> for candidates. >>> (iv_ca_set_no_cp, iv_ca_set_cp, free_loop_data): Support >>> iv_cand.inv_exprs.
Re: [C++ PATCH] namespace bindings
On Mai 05 2017, Nathan Sidwell wrote: > This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces > {get,set}_namespace_binding with get_namespace_value and set_global_value > respectively. I'm getting this regression on ia64: FAIL: g++.dg/template/virtual3.C -std=c++11 (test for excess errors) Excess errors: /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' /usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: error: use of deleted function 'virtual B::~B()' Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
[Testsuite, committed] Fix vector peeling test failures
This fixes a few failures on ARM and AArch64 due to a recent change in alignment peeling by switching the vector cost model off (https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html). Tested on AArch64, ARM and x64 - committed as obvious. ChangeLog: 2017-05-08 Wilco Dijkstra * testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model. * gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise. -- diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c b/gcc/testsuite/gcc.dg/vect/vect-44.c index 186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-44.c +++ b/gcc/testsuite/gcc.dg/vect/vect-44.c @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_float } */ +/* { dg-options "-fno-vect-cost-model" } */ #include #include "tree-vect.h" diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c b/gcc/testsuite/gcc.dg/vect/vect-50.c index 78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-50.c +++ b/gcc/testsuite/gcc.dg/vect/vect-50.c @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_float } */ +/* { dg-options "-fno-vect-cost-model" } */ #include #include "tree-vect.h"
[PATCH] More strict-overflow VRP cleanup
The following removes more optimization disabling when strict-overflow sensitive simplifications were done. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2017-05-08 Richard Biener * tree-vrp.c (gimple_assign_nonzero_warnv_p): Rename to ... (gimple_assign_nonzero): ... this and remove strict_overflow_p argument. (gimple_stmt_nonzero_warnv_p): Rename to ... (gimple_stmt_nonzero_p): ... this and remove strict_overflow_p argument. (vrp_stmt_computes_nonzero): Remove strict_overflow_p argument. (extract_range_basic): Adjust, do not disable propagation on strict overflow sensitive simplification. (vrp_visit_cond_stmt): Likewise. Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 247733) +++ gcc/tree-vrp.c (working copy) @@ -857,27 +857,28 @@ symbolic_range_based_on_p (value_range * *STRICT_OVERFLOW_P.*/ static bool -gimple_assign_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p) +gimple_assign_nonzero_p (gimple *stmt) { enum tree_code code = gimple_assign_rhs_code (stmt); + bool strict_overflow_p; switch (get_gimple_rhs_class (code)) { case GIMPLE_UNARY_RHS: return tree_unary_nonzero_warnv_p (gimple_assign_rhs_code (stmt), gimple_expr_type (stmt), gimple_assign_rhs1 (stmt), -strict_overflow_p); +&strict_overflow_p); case GIMPLE_BINARY_RHS: return tree_binary_nonzero_warnv_p (gimple_assign_rhs_code (stmt), gimple_expr_type (stmt), gimple_assign_rhs1 (stmt), gimple_assign_rhs2 (stmt), - strict_overflow_p); + &strict_overflow_p); case GIMPLE_TERNARY_RHS: return false; case GIMPLE_SINGLE_RHS: return tree_single_nonzero_warnv_p (gimple_assign_rhs1 (stmt), - strict_overflow_p); + &strict_overflow_p); case GIMPLE_INVALID_RHS: gcc_unreachable (); default: @@ -891,12 +892,12 @@ gimple_assign_nonzero_warnv_p (gimple *s *STRICT_OVERFLOW_P.*/ static bool -gimple_stmt_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p) +gimple_stmt_nonzero_p (gimple *stmt) { switch (gimple_code (stmt)) { case GIMPLE_ASSIGN: - return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p); + return gimple_assign_nonzero_p (stmt); case GIMPLE_CALL: { tree fndecl = gimple_call_fndecl (stmt); @@ -934,13 +935,13 @@ gimple_stmt_nonzero_warnv_p (gimple *stm } } -/* Like tree_expr_nonzero_warnv_p, but this function uses value ranges +/* Like tree_expr_nonzero_p, but this function uses value ranges obtained so far. */ static bool -vrp_stmt_computes_nonzero (gimple *stmt, bool *strict_overflow_p) +vrp_stmt_computes_nonzero (gimple *stmt) { - if (gimple_stmt_nonzero_warnv_p (stmt, strict_overflow_p)) + if (gimple_stmt_nonzero_p (stmt)) return true; /* If we have an expression of the form &X->a, then the expression @@ -3927,8 +3928,7 @@ extract_range_basic (value_range *vr, gi if (INTEGRAL_TYPE_P (type) && gimple_stmt_nonnegative_warnv_p (stmt, &sop)) set_value_range_to_nonnegative (vr, type); - else if (vrp_stmt_computes_nonzero (stmt, &sop) - && !sop) + else if (vrp_stmt_computes_nonzero (stmt)) set_value_range_to_nonnull (vr, type); else set_value_range_to_varying (vr); @@ -7568,7 +7568,6 @@ static void vrp_visit_cond_stmt (gcond *stmt, edge *taken_edge_p) { tree val; - bool sop; *taken_edge_p = NULL; @@ -7634,25 +7633,14 @@ vrp_visit_cond_stmt (gcond *stmt, edge * additional checking. Testing on several code bases (GCC, DLV, MICO, TRAMP3D and SPEC2000) showed that doing this results in 4 more predicates folded in SPEC. */ - sop = false; + bool sop; val = vrp_evaluate_conditional_warnv_with_ops (gimple_cond_code (stmt), gimple_cond_lhs (stmt), gimple_cond_rhs (stmt), false, &sop, NULL); if (val) -{ - if (!sop) - *taken_edge_p = find_taken_edge (gimple_bb (stmt), val); - else - { - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, -"\nIgnoring predicate evaluation because " -"it assumes that signed overflow is undefined"); - val = NULL_TREE; - } -} +*taken_edge_p = find_taken_edge (gimple_bb (stmt)
Re: [C++ PATCH] namespace bindings
On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell wrote: > This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces > {get,set}_namespace_binding with get_namespace_value and set_global_value > respectively. I'd prefer to stick with the _binding naming, as "value" is more ambiguous. Jason
Re: [PATCH 1/2] C++ template type diff printing
On Mon, May 8, 2017 at 7:36 AM, Nathan Sidwell wrote: I agree with most of Nathan's comments, but > 'tree' sounds implementor-speaky, Perhaps > '-fdiagnostic-expand-template-args' or something? I like "tree" here; I read it as referring to the formatting of the abstract tree data structure, not the GCC internal type. Jason
Re: [C++ Patch] PR 80186 ("ICE on C++ code with invalid constructor...")
OK. On Mon, May 8, 2017 at 7:25 AM, Paolo Carlini wrote: > Hi, > > in order to avoid this error recovery issue I think we want to check the > return value of grok_ctor_properties, as we do in decl.c, for its only other > use. Tested x86_64-linux. > > Thanks, Paolo. > > // > >
Re: [C++ Patch/RFC] PR 80145
OK. On Thu, Mar 23, 2017 at 3:07 PM, Paolo Carlini wrote: > Hi, > > this ICE on invalid code isn't a regression, thus a patch probably doesn't > qualify for Stage 4, but IMHO I made good progress on it and I'm sending > what I have now anyway... The ICE happens during error recovery after a > sensible diagnostic for the first declaration in: > > auto* foo() { return 0; } > auto* foo(); > > After the error, finish_function does: > > apply_deduced_return_type (fndecl, void_type_node); > fntype = TREE_TYPE (fndecl); > > which then is inconsistent with the auto* return type of the second > declaration and leads to an ICE in merge_types (which duplicate_decls > thought was safe to call because types_match is true (evidently: decls_match > uses fndecl_declared_return_type)). Thus, in terms of error recovery, I > think that in cases like the one at issue it makes sense not to replace > auto* after the error and leave the return type untouched: certainly the > below passes testing and avoids ICEing on the testcase at issue and a > variant of it. > > Thanks, > Paolo. > > // >
Re: [C++ PATCH] namespace bindings
On 05/08/2017 08:07 AM, Andreas Schwab wrote: On Mai 05 2017, Nathan Sidwell wrote: This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces {get,set}_namespace_binding with get_namespace_value and set_global_value respectively. I'm getting this regression on ia64: Sorry, I'll take a look ... -- Nathan Sidwell
Re: {PATCH] New C++ warning -Wcatch-value
On Sun, May 7, 2017 at 8:05 PM, Martin Sebor wrote: > On 05/07/2017 02:03 PM, Volker Reichelt wrote: >> >> On 2 May, Martin Sebor wrote: >>> >>> On 05/01/2017 02:38 AM, Volker Reichelt wrote: Hi, catching exceptions by value is a bad thing, as it may cause slicing, i.e. a) a superfluous copy b) which is only partial. See also https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#e15-catch-exceptions-from-a-hierarchy-by-reference To warn the user about catch handlers of non-reference type, the following patch adds a new C++/ObjC++ warning option "-Wcatch-value". >>> >>> >>> I think the problems related to catching exceptions by value >>> apply to (a subset of) class types but not so much to fundamental >>> types. I would expect indiscriminately warning on every type to >>> be overly restrictive. >>> >>> The Enforcement section of the C++ guideline suggests to >>> >>>Flag by-value exceptions if their types are part of a hierarchy >>>(could require whole-program analysis to be perfect). >>> >>> The corresponding CERT C++ Coding Standard guideline offers >>> a similar suggestion here: >>> >>>https://www.securecoding.cert.org/confluence/x/TAD5CQ >>> >>> so I would suggest to model the warning on that approach (within >>> limits of a single translation unit, of course). I.e., warn only >>> for catching by value objects of non-trivial types, or perhaps even >>> only polymorphic types? >>> >>> Martin >> >> >> I've never seen anybody throw integers in real-world code, so I didn't >> want to complicate things for this case. But maybe I should only warn >> about class-types. >> >> IMHO it makes sense to warn about non-polymorphic class types >> (although slicing is not a problem there), because you still have to >> deal with redundant copies. >> >> Another thing would be pointers. I've never seen pointers in catch >> handlers (except some 'catch (const char*)' which I would consider >> bad practice). Therefore I'd like to warn about 'catch (const A*)' >> which might be a typo that should read 'catch (const A&)' instead. >> >> Would that be OK? > > > To my knowledge, catch by value of non-polymorphic types (and > certainly fundamental types) is not a cause of common bugs. > It's part of the recommended practice to throw by value, catch > by reference, which is grounded in avoiding the slicing problem. > It's also sometimes recommended for non-trivial class types to > avoid creating a copy of the object (which, for non-trivial types, > may need to allocate resource and could throw). Otherwise, it's > not dissimilar to pass-by value vs pass-by-reference (or even > pass-by-pointer). Both may be good practices for some types or > in some situations but neither is necessary to avoid bugs or > universally applicable to achieve superior performance. > > The pointer case is interesting. In C++ Coding Standards, > Sutter and Alexandrescu recommend to throw (and catch) smart > pointers over plain pointers because it obviates having to deal > with memory management issues. That's sound advice but it seems > more like a design guideline than a coding rule aimed at directly > preventing bugs. I also think that the memory management bugs > that it might find might be more easily detected at the throw > site instead. E.g., warning on the throw expression below: > > { > Exception e; > throw &e; > } > > or perhaps even on > > { > throw *new Exception (); > } > > A more sophisticated (and less restrictive) checker could detect > and warn on "throw " if it found a catch (T) or catch (T&) > in the same file and no catch (T*) (but not warn otherwise). > > Martin > > PS After re-reading some of the coding guidelines on this topic > it occurs to me that (if your patch doesn't handle this case yet) > it might be worth considering to enhance it to also warn on > rethrowing caught polymorphic objects (i.e., warn on > > catch (E &e) { throw e; } > > and suggest to use "throw;" instead, for the same reason: to > help avoid slicing. > > PPS It may be a useful feature to implement some of other ideas > you mentioned (e.g., throw by value rather than pointer) but it > feels like a separate and more ambitious project than detecting > the relatively common and narrow slicing problem. I agree with Martin's comments. Jason
Re: [C++ PATCH] namespace bindings
On 05/08/2017 08:56 AM, Jason Merrill wrote: On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell wrote: This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces {get,set}_namespace_binding with get_namespace_value and set_global_value respectively. I'd prefer to stick with the _binding naming, as "value" is more ambiguous. Ok. It is getting the 'value' binding, ignoring the 'type' binding, fwiw. nathan -- Nathan Sidwell
[PATCH] Small PRE speedup
Was sitting in my tree. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2017-05-08 Richard Biener * tree-ssa-pre.c (bitmap_set_and): Avoid bitmap copy. (bitmap_set_subtract_values): Likewise. Index: gcc/tree-ssa-pre.c === --- gcc/tree-ssa-pre.c (revision 247733) +++ gcc/tree-ssa-pre.c (working copy) @@ -817,19 +818,23 @@ bitmap_set_and (bitmap_set_t dest, bitma if (dest != orig) { - bitmap_head temp; - bitmap_initialize (&temp, &grand_bitmap_obstack); - bitmap_and_into (&dest->values, &orig->values); - bitmap_copy (&temp, &dest->expressions); - EXECUTE_IF_SET_IN_BITMAP (&temp, 0, i, bi) + + unsigned int to_clear = -1U; + FOR_EACH_EXPR_ID_IN_SET (dest, i, bi) { + if (to_clear != -1U) + { + bitmap_clear_bit (&dest->expressions, to_clear); + to_clear = -1U; + } pre_expr expr = expression_for_id (i); unsigned int value_id = get_expr_value_id (expr); if (!bitmap_bit_p (&dest->values, value_id)) - bitmap_clear_bit (&dest->expressions, i); + to_clear = i; } - bitmap_clear (&temp); + if (to_clear != -1U) + bitmap_clear_bit (&dest->expressions, to_clear); } } @@ -862,18 +867,20 @@ bitmap_set_subtract_values (bitmap_set_t { unsigned int i; bitmap_iterator bi; - bitmap_head temp; - - bitmap_initialize (&temp, &grand_bitmap_obstack); - - bitmap_copy (&temp, &a->expressions); - EXECUTE_IF_SET_IN_BITMAP (&temp, 0, i, bi) + pre_expr to_remove = NULL; + FOR_EACH_EXPR_ID_IN_SET (a, i, bi) { + if (to_remove) + { + bitmap_remove_from_set (a, to_remove); + to_remove = NULL; + } pre_expr expr = expression_for_id (i); if (bitmap_set_contains_value (b, get_expr_value_id (expr))) - bitmap_remove_from_set (a, expr); + to_remove = expr; } - bitmap_clear (&temp); + if (to_remove) +bitmap_remove_from_set (a, to_remove); }
Re: [Testsuite, committed] Fix vector peeling test failures
On Mon, May 8, 2017 at 2:41 PM, Wilco Dijkstra wrote: > This fixes a few failures on ARM and AArch64 due to a recent change in > alignment peeling by switching the vector cost model off > (https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html). > > Tested on AArch64, ARM and x64 - committed as obvious. Thanks. Note that I'm not sure what -fno-vect-cost-model actually means when peeling for alignment, so this fix might not "prevail". Richard. > ChangeLog: > 2017-05-08 Wilco Dijkstra > > * testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model. > * gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise. > -- > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c > b/gcc/testsuite/gcc.dg/vect/vect-44.c > index > 186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6 > 100644 > --- a/gcc/testsuite/gcc.dg/vect/vect-44.c > +++ b/gcc/testsuite/gcc.dg/vect/vect-44.c > @@ -1,4 +1,5 @@ > /* { dg-require-effective-target vect_float } */ > +/* { dg-options "-fno-vect-cost-model" } */ > > #include > #include "tree-vect.h" > diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c > b/gcc/testsuite/gcc.dg/vect/vect-50.c > index > 78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b > 100644 > --- a/gcc/testsuite/gcc.dg/vect/vect-50.c > +++ b/gcc/testsuite/gcc.dg/vect/vect-50.c > @@ -1,4 +1,5 @@ > /* { dg-require-effective-target vect_float } */ > +/* { dg-options "-fno-vect-cost-model" } */ > > #include > #include "tree-vect.h"
Re: [PATCH] disable -Walloc-size-larger-than and -Wstringop-overflow for non-C front ends (PR 80545)
On 05/04/2017 10:13 PM, Jeff Law wrote: On 04/28/2017 04:02 PM, Martin Sebor wrote: The two options were included in -Wall and enabled for all front ends but only made to be recognized by the driver for the C family of compilers. That made it impossible to suppress those warnings when compiling code for those other front ends (like Fortran). The attached patch adjusts the warnings so that they are only enabled for the C family of front ends and not for any others, as per Richard's suggestion. (The other solution would have been to make the warnings available to all front ends. Since non-C languages don't have a way of calling the affected functions -- or do they? -- this is probably not necessary.) Martin gcc-80545.diff PR driver/80545 - option -Wstringop-overflow not recognized by Fortran gcc/c-family/ChangeLog: PR driver/80545 * c.opt (-Walloc-size-larger-than, -Wstringop-overflow): Enable and make available for the C family only. OK. jeff It turns out that this is not the right fix. I overlooked that -Wstringop-overflow is meant to be enabled by default and while removing the Init(2) bit and replacing it with LangEnabledBy (C ObjC C++ ObjC++, Wall, 2, 0) suppresses the warning in Fortran it also disables it by default in C/C++ unless -Wall is used. By my reading of the Option properties part of the GCC Internals manual there is no way to initialize a warning to on by default while making it available only in a subset of languages. The only way I can think of is to initialize it in the .opt file to something like -1 and then change it at some point to 2 somewhere in the C/C++ front ends. That seems pretty cumbersome. Am I missing some trick? Martin
Re: [C++ PATCH] namespace bindings
On 05/08/2017 08:07 AM, Andreas Schwab wrote: On Mai 05 2017, Nathan Sidwell wrote: This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces {get,set}_namespace_binding with get_namespace_value and set_global_value respectively. I'm getting this regression on ia64: FAIL: g++.dg/template/virtual3.C -std=c++11 (test for excess errors) I'm not seeing this. On an ia64-linux target (hosted on x86_64-linux) I get the attached output, which is the same as I get on an x86_64-linux native build. For the failure mode you describe I'd expect many more test failures to occur. I wonder if there's some uninitialized variable or something? nathan -- Nathan Sidwell virtual3.C:8:8: error: deleted function 'virtual B::~B()' struct B : A<0>, A<1> // { dg-error "deleted|context" } ^ virtual3.C:5:11: error: overriding non-deleted function 'A< >::~A() [with int = 0]' virtual ~A(); // { dg-message "non-deleted|private" } ^ virtual3.C:8:8: note: 'virtual B::~B()' is implicitly deleted because the default definition would be ill-formed: struct B : A<0>, A<1> // { dg-error "deleted|context" } ^ virtual3.C:8:8: error: 'A< >::~A() [with int = 0]' is private within this context virtual3.C:5:11: note: declared private here virtual ~A(); // { dg-message "non-deleted|private" } ^ virtual3.C:8:8: error: 'A< >::~A() [with int = 1]' is private within this context struct B : A<0>, A<1> // { dg-error "deleted|context" } ^ virtual3.C:5:11: note: declared private here virtual ~A(); // { dg-message "non-deleted|private" } ^ virtual3.C:8:8: error: deleted function 'virtual B::~B()' struct B : A<0>, A<1> // { dg-error "deleted|context" } ^ virtual3.C:5:11: error: overriding non-deleted function 'A< >::~A() [with int = 1]' virtual ~A(); // { dg-message "non-deleted|private" } ^ virtual3.C: In constructor 'B::B()': virtual3.C:10:7: error: 'A< >::~A() [with int = 0]' is private within this context B() {} // { dg-error "context" } ^ virtual3.C:5:11: note: declared private here virtual ~A(); // { dg-message "non-deleted|private" } ^ virtual3.C:10:7: error: 'A< >::~A() [with int = 1]' is private within this context B() {} // { dg-error "context" } ^ virtual3.C:5:11: note: declared private here virtual ~A(); // { dg-message "non-deleted|private" } ^
Re: [AArch64] Tighten move constraints for symbolic operands
On 31/03/17 09:34, Richard Sandiford wrote: > The movsi and movdi constraints allowed the source to be any > absolute symbolic expression ("S"). That's OK for operands that > have already been vetted by the aarch64_mov_operand predicate but > causes problems if the register allocator substitutes an equivalence > (the usual "the constraints can't accept more than the predicates" > restriction). > > Although all other uses of "S" in the backend are redundant and could > in principle be removed, "S" itself is a publicly-documented constraint > and so we'd have to keep its definition. This patch therefore adds a > new "Usa" constraint for legitimate absolute address operands. > > I saw this for a large testcase in which an equivalence was used > for a label_ref jump table. It's not something that can be cut > down easily or that would give a robust regression test. > > I don't think this is a regression, so maybe we don't want it for GCC 7. > > Tested on aarch64-linux-gnu. OK to install? > OK. R. > Thanks, > Richard > > > gcc/ > * config/aarch64/constraints.md (Usa): New constraint. > * config/aarch64/aarch64.md (*movsi_aarch64, *movdi_aarch64): Use it. > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 3717edf..260bd64 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -1074,7 +1074,7 @@ > > (define_insn_and_split "*movsi_aarch64" >[(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r ,r,*w,m, > m,r,r ,*w, r,*w") > - (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,Dv,m, > m,rZ,*w,S,Ush,rZ,*w,*w"))] > + (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,Dv,m, > m,rZ,*w,Usa,Ush,rZ,*w,*w"))] >"(register_operand (operands[0], SImode) > || aarch64_reg_or_zero (operands[1], SImode))" >"@ > @@ -1108,7 +1108,7 @@ > > (define_insn_and_split "*movdi_aarch64" >[(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r ,r,*w,m, > m,r,r, *w, r,*w,w") > - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,Dv,m, > m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))] > + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,Dv,m, > m,rZ,*w,Usa,Ush,rZ,*w,*w,Dd"))] >"(register_operand (operands[0], DImode) > || aarch64_reg_or_zero (operands[1], DImode))" >"@ > diff --git a/gcc/config/aarch64/constraints.md > b/gcc/config/aarch64/constraints.md > index b77e096..5852a42 100644 > --- a/gcc/config/aarch64/constraints.md > +++ b/gcc/config/aarch64/constraints.md > @@ -104,6 +104,14 @@ >(and (match_code "high") > (match_test "aarch64_valid_symref (XEXP (op, 0), GET_MODE (XEXP (op, > 0)))"))) > > +(define_constraint "Usa" > + "@internal > + A constraint that matches an absolute symbolic address that can be > + loaded by a single ADR." > + (and (match_code "const,symbol_ref,label_ref") > + (match_test "aarch64_symbolic_address_p (op)") > + (match_test "aarch64_mov_operand_p (op, GET_MODE (op))"))) > + > (define_constraint "Uss" >"@internal >A constraint that matches an immediate shift constant in SImode." >
[PATCH, GCC/ARM, FYI] Define TM_MULTILIB_CONFIG for ARM multilib
Hi, TM_MULTILIB_CONFIG is not set in config.gcc when building with multilib for arm targets, leading to config/arm/t-multilib not including any of the files (t-aprofile and t-rmprofile) definining the architecture and FPU to build multilib for. This patch fixes that by setting TM_MULTILIB_CONFIG to with_multilib_list's value after it has been checked. It also fix a trailing whitespace issue. ChangeLog entry is as follows: *** gcc/ChangeLog ** 2017-05-08 Thomas Preud'homme * config.gcc (arm*-*-*): Set TM_MULTILIB_CONFIG from with_multilib_list after it has been checked. Committing as obvious. Best regards, Thomas diff --git a/gcc/config.gcc b/gcc/config.gcc index fd7caebf0d155df0b25cc6a8374af57555707fc3..e8aaf2df9d709bf9d95f0a249c79f5813519d51f 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3803,7 +3803,7 @@ case "${target}" in ;; esac done - fi + fi if test "x${tmake_profile_file}" != x ; then # arm/t-aprofile and arm/t-rmprofile are only @@ -3820,6 +3820,7 @@ case "${target}" in fi tmake_file="${tmake_file} ${tmake_profile_file}" +TM_MULTILIB_CONFIG="$with_multilib_list" fi fi ;;
Re: [PATCH 1/2] Automatic context save/restore for regular interrupts.
* Claudiu Zissulescu [2017-05-05 13:02:43 +0200]: > The AUX_IRQ_CTRL register controls the behavior of automated register > save and restore or prologue and epilogue sequences during a non-fast > interrupt entry and exit, and context save and restore instructions. > > A user passes to the compiler the configuration of the AUX_IRQ_CTRL > register via mirq-ctrl-saved option. This option, specifies > gneral-purposes registers that the processor saves/restores on > interrupt entry and exit, and it is only valid for ARC EM and ARC HS > cores. > > gcc/ > 2017-05-05 Claudiu Zissulescu > > * config/arc/arc.c (irq_ctrl_saved): New variable. > (ARC_AUTOBLINK_IRQ_P): Define. > (ARC_AUTOFP_IRQ_P): Likewise. > (ARC_AUTO_IRQ_P): Likewise. > (irq_range): New function. > (arc_must_save_register): Likewise. > (arc_must_save_return_addr): Likewise. > (arc_dwarf_emit_irq_save_regs): Likewise. > (arc_override_options): Handle deferred options. > (MUST_SAVE_REGISTER): Deleted, replaced by arc_must_save_register. > (MUST_SAVE_RETURN_ADDR): Deleted, replaced by > arc_must_save_return_addr. > (arc_compute_frame_size): Handle automated save and restore of > registers. > (arc_expand_prologue): Likewise. > (arc_expand_epilogue): Likewise. > * config/arc/arc.md (stack_irq_dwarf): New unspec instruction. > * config/arc/arc.opt (mirq-ctrl-saved): New option. > * doc/invoke.texi (mirq-ctrl-saved): Document option. > * testsuite/gcc.target/arc/interrupt-5.c: Newfile. > * testsuite/gcc.target/arc/interrupt-6.c: Likewise. > * testsuite/gcc.target/arc/interrupt-7.c: Likewise. > * testsuite/gcc.target/arc/interrupt-8.c: Likewise. > * testsuite/gcc.target/arc/interrupt-9.c: Likewise. > --- > gcc/config/arc/arc.c | 329 > ++--- > gcc/config/arc/arc.md | 8 + > gcc/config/arc/arc.opt | 4 + > gcc/doc/invoke.texi| 11 +- > gcc/testsuite/gcc.target/arc/interrupt-5.c | 19 ++ > gcc/testsuite/gcc.target/arc/interrupt-6.c | 22 ++ > gcc/testsuite/gcc.target/arc/interrupt-7.c | 16 ++ > gcc/testsuite/gcc.target/arc/interrupt-8.c | 27 +++ > gcc/testsuite/gcc.target/arc/interrupt-9.c | 17 ++ > 9 files changed, 421 insertions(+), 32 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-5.c > create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-6.c > create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-7.c > create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-8.c > create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-9.c > > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c > index 4574481..a61faef 100644 > --- a/gcc/config/arc/arc.c > +++ b/gcc/config/arc/arc.c > @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see > #include "builtins.h" > #include "rtl-iter.h" > #include "alias.h" > +#include "opts.h" > > /* Which cpu we're compiling for (ARC600, ARC601, ARC700). */ > static char arc_cpu_name[10] = ""; > @@ -111,6 +112,29 @@ struct GTY (()) arc_ccfsm >int target_label; > }; > > +/* Status of the IRQ_CTRL_AUX register. */ > +typedef struct irq_ctrl_saved_t > +{ > + short irq_save_last_reg; /* Last register number used by > +IRQ_CTRL_SAVED aux_reg. */ > + bool irq_save_blink; /* True if BLINK is automatically > +saved. */ > + bool irq_save_lpcount; /* True if LPCOUNT is automatically > +saved. */ > +} irq_ctrl_saved_t; I'm pretty sure GNU style is to put the comments above each entry in the struct, such as: /* Last register number used by IRQ_CTRL_SAVED aux_reg. */ short irq_save_last_reg; I did have a quick look though GCC and couldn't see many examples of comments to the right. > +static irq_ctrl_saved_t irq_ctrl_saved; > + > +#define ARC_AUTOBLINK_IRQ_P(FNTYPE) \ > + (ARC_INTERRUPT_P (FNTYPE) && irq_ctrl_saved.irq_save_blink) > + > +#define ARC_AUTOFP_IRQ_P(FNTYPE) \ > + (ARC_INTERRUPT_P (FNTYPE) && (irq_ctrl_saved.irq_save_last_reg > 26)) > + > +#define ARC_AUTO_IRQ_P(FNTYPE) \ > + (ARC_INTERRUPT_P (FNTYPE) \ > + && (irq_ctrl_saved.irq_save_blink \ > + || (irq_ctrl_saved.irq_save_last_reg >= 0))) > + > #define arc_ccfsm_current cfun->machine->ccfsm_current > > #define ARC_CCFSM_BRANCH_DELETED_P(STATE) \ > @@ -806,11 +830,110 @@ arc_init (void) > } > } > > +/* Parse -mirq-ctrl-saved= option string. Registers may be specified > + individually, or as ranges such as "r0-r3". Registers accepted are > + r0 through r31 and lp_count. Registers and ranges must be > + comma-separated. */ The comment seem
[PATCH, rs6000] Add x86 instrinsic headers to GCC PPC64LE taget
A common issue in porting applications and packages is that someone may have forgotten that there is more than one hardware platform. A specific example is applications using Intel x86 intrinsic functions without appropriate conditional compile guards. Another example is a developer tasked to port a large volume of code containing important functions "optimized" with Intel x86 intrinsics, but without the skill or time to perform the same optimization for another platform. Often the developer who wrote the original optimization has moved on and those left to maintain the application / package lack understanding of the original x86 intrinsic code or design. For PowerPC this can be acute especially for HPC vector SIMD codes. The PowerISA (as implemented for POWER and OpenPOWER servers) has extensive vector hardware facilities and GCC proves a large set of vector intrinsics. Thus I would like to restrict this support to PowerPC targets that support VMX/VSX and PowerISA-2.07 (power8) and later. But the difference in (intrinsic) spelling alone is enough stop many application developers in their tracks. So I propose to submit a series of patches to implement the PowerPC64LE equivalent of a useful subset of the x86 intrinsics. The final size and usefulness of this effort is to be determined. The proposal is to incrementally port intrinsic header files from the ./config/i386 tree to the ./config/rs6000 tree. This naturally provides the same header structure and intrinsic names which will simplify code porting. It seems natural to work from the bottom (oldest) up. For example starting with mmintrin.h and working our way up the following headers: smmintrin.h(SSE4.1) includes tmmintrin,h tmmintrin.h(SSSE3) includes pmmintrin.h pmmintrin.h(SSE3)includes emmintrin,h emmintrin.h(SSE2)includes xmmintrin.h xmmintrin.h(SSE) includes mmintrin.h and mm_malloc.h mmintrin.h (MMX) There is a smattering of non-vector intrinsics in common use. Like the Bit Manipulation Instructions (BMI & BMI2). bmiintrin.h bmi2intrin.h x86intrin.h (collector includes BMI headers and many others) The older intrinsic (BMI/MMX/SSE) instructions have been integrated into GCC and many of the intrinsic implementations are simple C code or GCC built-ins. The remaining intrinsic functions are implemented as platform specific builtins (__builtin_ia32_*) and need to be mapped to equivalent PowerPC builtin or vector intrinsic from altivec.h is required. Of course as part of this process we will port as many of the corresponding DejaGnu tests from gcc/testsuite/gcc.target/i386/ to gcc/testsuite/gcc.target/powerpc/ as appropriate. So far the dg-do run tests only require minor source changes, mostly to the platform specific dg-* directives. A few dg-do compile tests are needed to insure we are getting the expected folding/Common subexpression elimination (CSE) to generate the optimum sequence for PowerPC. To get the ball rolling I include the BMI intrinsics ported to PowerPC for review as they are reasonable size (31 intrinsic implementations). [gcc] 2017-05-04 Steven Munroe * config.gcc (powerpc*-*-*): Add bmi2intrin.h, bmiintrin.h, and x86intrin.h * config/rs6000/bmiintrin.h: New file. * config/rs6000/bmi2intrin.h: New file. * config/rs6000/x86intrin.h: New file. [gcc/testsuite] 2017-05-04 Steven Munroe * gcc.target/powerpc/bmi-andn-1.c: New file * gcc.target/powerpc/bmi-andn-2.c: New file. * gcc.target/powerpc/bmi-bextr-1.c: New file. * gcc.target/powerpc/bmi-bextr-2.c: New file. * gcc.target/powerpc/bmi-bextr-4.c: New file. * gcc.target/powerpc/bmi-bextr-5.c: New file. * gcc.target/powerpc/bmi-blsi-1.c: New file. * gcc.target/powerpc/bmi-blsi-2.c: New file. * gcc.target/powerpc/bmi-blsmsk-1.c: new file. * gcc.target/powerpc/bmi-blsmsk-2.c: New file. * gcc.target/powerpc/bmi-blsr-1.c: New file. * gcc.target/powerpc/bmi-blsr-2.c: New File. * gcc.target/powerpc/bmi-check.h: New File. * gcc.target/powerpc/bmi-tzcnt-1.c: new file. * gcc.target/powerpc/bmi-tzcnt-2.c: New file. * gcc.target/powerpc/bmi2-bzhi32-1.c: New file. * gcc.target/powerpc/bmi2-bzhi64-1.c: New file. * gcc.target/powerpc/bmi2-bzhi64-1a.c: New file. * gcc.target/powerpc/bmi2-check.h: New file. * gcc.target/powerpc/bmi2-mulx32-1.c: New file. * gcc.target/powerpc/bmi2-mulx32-2.c: New file. * gcc.target/powerpc/bmi2-mulx64-1.c: New file. * gcc.target/powerpc/bmi2-mulx64-2.c: New file. * gcc.target/powerpc/bmi2-pdep32-1.c: New file. * gcc.target/powerpc/bmi2-pdep64-1.c: New file. * gcc.target/powerpc/bmi2-pext32-1.c: New File. * gcc.target/powerpc/bmi2-pext64-1.c: New file. * gcc.target/powerpc/bmi2-pext64-1a.c: New File. Index: gcc/testsuite/gcc.target/powerpc/bmi-blsi-
Re: [PATCH 2/2] Fast interrupts support.
* Claudiu Zissulescu [2017-05-05 13:02:44 +0200]: > When a processor enters a fast interrupts handler, and duplicate > register banks are configured, the processor saves the user context by > saving the registers in the main register bank to these additional > registers in the duplicate register bank. In this fast interrupt > context, when you specify the rgf_banked_regs option,the compiler does > not save the registers duplicated in the additional register bank are > not saved. > > gcc/ > 2016-10-04 Claudiu Zissulescu > Andrew Burgess > > * config/arc/arc.c (ARC_AUTOBLINK_IRQ_P): Consider fast interrupts > case also. > (ARC_AUTOFP_IRQ_P): Likewise. > (ARC_AUTO_IRQ_P): Likewise. > (rgf_banked_register_count): New variable. > (parse_mrgf_banked_regs_option): New function. > (arc_override_options): Handle rgf_banked_regs option. > (arc_handle_interrupt_attribute): Add firq option. > (arc_compute_function_type): Return fast irq type when required. > (arc_must_save_register): Handle fast interrupts. > (arc_expand_prologue): Do not emit dwarf info for fast interrupts. > (arc_return_address_regs): Update. > * config/arc/arc.h (arc_return_address_regs): Update. > (arc_function_type): Add fast interrupt type. > (ARC_INTERRUPT_P): Update. > (RC_FAST_INTERRUPT_P): Define. > * config/arc/arc.md (simple_return): Update for fast interrupts. > (p_return_i): Likewise. > * config/arc/arc.opt (mrgf-banked-regs): New option. > * doc/invoke.texi (mrgf-banked-regs): Document. > * testsuite/gcc.target/arc/firq-1.c: New file. > * testsuite/gcc.target/arc/firq-2.c: Likewise. > * testsuite/gcc.target/arc/firq-3.c: Likewise. > * testsuite/gcc.target/arc/firq-4.c: Likewise. > * testsuite/gcc.target/arc/firq-5.c: Likewise. > * testsuite/gcc.target/arc/firq-6.c: Likewise. Looks fine to me, Thanks, Andrew > --- > gcc/config/arc/arc.c | 106 > +++--- > gcc/config/arc/arc.h | 13 +++-- > gcc/config/arc/arc.md | 9 ++- > gcc/config/arc/arc.opt| 4 ++ > gcc/doc/invoke.texi | 10 > gcc/testsuite/gcc.target/arc/firq-1.c | 27 + > gcc/testsuite/gcc.target/arc/firq-2.c | 31 ++ > gcc/testsuite/gcc.target/arc/firq-3.c | 40 + > gcc/testsuite/gcc.target/arc/firq-4.c | 31 ++ > gcc/testsuite/gcc.target/arc/firq-5.c | 15 + > gcc/testsuite/gcc.target/arc/firq-6.c | 21 +++ > 11 files changed, 277 insertions(+), 30 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/arc/firq-1.c > create mode 100644 gcc/testsuite/gcc.target/arc/firq-2.c > create mode 100644 gcc/testsuite/gcc.target/arc/firq-3.c > create mode 100644 gcc/testsuite/gcc.target/arc/firq-4.c > create mode 100644 gcc/testsuite/gcc.target/arc/firq-5.c > create mode 100644 gcc/testsuite/gcc.target/arc/firq-6.c > > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c > index a61faef..a0cd597 100644 > --- a/gcc/config/arc/arc.c > +++ b/gcc/config/arc/arc.c > @@ -125,16 +125,25 @@ typedef struct irq_ctrl_saved_t > static irq_ctrl_saved_t irq_ctrl_saved; > > #define ARC_AUTOBLINK_IRQ_P(FNTYPE) \ > - (ARC_INTERRUPT_P (FNTYPE) && irq_ctrl_saved.irq_save_blink) > - > -#define ARC_AUTOFP_IRQ_P(FNTYPE) \ > - (ARC_INTERRUPT_P (FNTYPE) && (irq_ctrl_saved.irq_save_last_reg > 26)) > - > -#define ARC_AUTO_IRQ_P(FNTYPE) \ > - (ARC_INTERRUPT_P (FNTYPE) \ > - && (irq_ctrl_saved.irq_save_blink \ > + ((ARC_INTERRUPT_P (FNTYPE) \ > +&& irq_ctrl_saved.irq_save_blink)\ > + || (ARC_FAST_INTERRUPT_P (FNTYPE) \ > + && rgf_banked_register_count > 8)) > + > +#define ARC_AUTOFP_IRQ_P(FNTYPE) \ > + ((ARC_INTERRUPT_P (FNTYPE) \ > +&& (irq_ctrl_saved.irq_save_last_reg > 26)) \ > + || (ARC_FAST_INTERRUPT_P (FNTYPE) \ > + && rgf_banked_register_count > 8)) > + > +#define ARC_AUTO_IRQ_P(FNTYPE) \ > + (ARC_INTERRUPT_P (FNTYPE) && !ARC_FAST_INTERRUPT_P (FNTYPE)\ > + && (irq_ctrl_saved.irq_save_blink \ > || (irq_ctrl_saved.irq_save_last_reg >= 0))) > > +/* Number of registers in second bank for FIRQ support. */ > +static int rgf_banked_register_count; > + > #define arc_ccfsm_current cfun->machine->ccfsm_current > > #define ARC_CCFSM_BRANCH_DELETED_P(STATE) \ > @@ -924,6 +933,27 @@ irq_range (const char *cstr) >irq_ctrl_saved.irq_save_lpcount = (lpcount == 60); > } > > +/* Parse -mrgf-banked-regs=NUM option strin
Re: [PATCH 3/3] [ARC] Add support for advanced mpy/mac instructions.
* Claudiu Zissulescu [2017-04-25 15:03:44 +0200]: > Add support for MAC and DMPY instructions. ARCv2 Only. > > gcc/ > 2016-12-08 Claudiu Zissulescu > > * config/arc/arc.c (arc_conditional_register_usage): Handle ACCL, > ACCH registers. > * config/arc/arc.md (mulsidi3): Use advanced mpy instructions when > available. > (umulsidi3): Likewise. > (mulsidi3_700): Disable this pattern when we have advanced mpy > instructions. > (umulsidi3_700): Likewise. > (maddsidi4): New pattern. > (macd, mac, mac_r, umaddsidi4, macdu, macu, macu_r): Likewise. > (mpyd_arcv2hs, mpyd_imm_arcv2hs, mpydu_arcv2hs): Likewise. > (mpydu_imm_arcv2hs): Likewise. > * config/arc/predicates.md (accl_operand): New predicate. Looks good, but is it not possible to write some tests? Thanks, Andrew > --- > gcc/config/arc/arc.c | 5 +- > gcc/config/arc/arc.md| 266 > ++- > gcc/config/arc/predicates.md | 5 + > 3 files changed, 272 insertions(+), 4 deletions(-) > > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c > index 170b8dd..238244b 100644 > --- a/gcc/config/arc/arc.c > +++ b/gcc/config/arc/arc.c > @@ -1789,8 +1789,9 @@ arc_conditional_register_usage (void) >arc_regno_reg_class[PROGRAM_COUNTER_REGNO] = GENERAL_REGS; > >/*ARCV2 Accumulator. */ > - if (TARGET_V2 > - && (TARGET_FP_DP_FUSED || TARGET_FP_SP_FUSED)) > + if ((TARGET_V2 > + && (TARGET_FP_DP_FUSED || TARGET_FP_SP_FUSED)) > + || TARGET_PLUS_DMPY) >{ > arc_regno_reg_class[ACCL_REGNO] = WRITABLE_CORE_REGS; > arc_regno_reg_class[ACCH_REGNO] = WRITABLE_CORE_REGS; > diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md > index ffab390..db5867c 100644 > --- a/gcc/config/arc/arc.md > +++ b/gcc/config/arc/arc.md > @@ -2157,6 +2157,18 @@ >"TARGET_ANY_MPY" > " > { > + if (TARGET_PLUS_MACD) > +{ > + if (CONST_INT_P (operands[2])) > + { > + emit_insn (gen_mpyd_imm_arcv2hs (operands[0], operands[1], > operands[2])); > + } > + else > + { > + emit_insn (gen_mpyd_arcv2hs (operands[0], operands[1], operands[2])); > + } > + DONE; > +} >if (TARGET_MPY) > { >operands[2] = force_reg (SImode, operands[2]); > @@ -2237,7 +2249,7 @@ >[(set (match_operand:DI 0 "register_operand" "=&r") > (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%c")) >(sign_extend:DI (match_operand:SI 2 "extend_operand" "cL"] > - "TARGET_MPY" > + "TARGET_MPY && !TARGET_PLUS_MACD" >"#" >"&& reload_completed" >[(const_int 0)] > @@ -2390,6 +2402,18 @@ >(zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""] >"" > { > + if (TARGET_PLUS_MACD) > +{ > + if (CONST_INT_P (operands[2])) > + { > + emit_insn (gen_mpydu_imm_arcv2hs (operands[0], operands[1], > operands[2])); > + } > + else > + { > + emit_insn (gen_mpydu_arcv2hs (operands[0], operands[1], operands[2])); > + } > + DONE; > +} >if (TARGET_MPY) > { >operands[2] = force_reg (SImode, operands[2]); > @@ -2480,7 +2504,7 @@ >[(set (match_operand:DI 0 "dest_reg_operand" "=&r") > (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%c")) >(zero_extend:DI (match_operand:SI 2 "extend_operand" "cL"] > - "TARGET_MPY" > + "TARGET_MPY && !TARGET_PLUS_MACD" >"#" >"reload_completed" >[(const_int 0)] > @@ -6215,6 +6239,244 @@ >"" >[(set_attr "length" "0")]) > > +;; MAC and DMPY instructions > +(define_insn_and_split "maddsidi4" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (plus:DI > + (mult:DI > + (sign_extend:DI (match_operand:SI 1 "register_operand" "%r")) > + (sign_extend:DI (match_operand:SI 2 "extend_operand" "ri"))) > + (match_operand:DI 3 "register_operand" "r")))] > + "TARGET_PLUS_DMPY" > + "#" > + "TARGET_PLUS_DMPY && reload_completed" > + [(const_int 0)] > + "{ > + rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST); > + emit_move_insn (acc_reg, operands[3]); > + if (TARGET_PLUS_MACD) > + emit_insn (gen_macd (operands[0], operands[1], operands[2])); > + else > + { > + emit_insn (gen_mac (operands[1], operands[2])); > + emit_move_insn (operands[0], acc_reg); > + } > + DONE; > + }" > + [(set_attr "type" "multi") > + (set_attr "length" "36")]) > + > +(define_insn "macd" > + [(set (match_operand:DI 0 "even_register_operand" "=Rcr,r,r") > + (plus:DI > + (mult:DI > + (sign_extend:DI (match_operand:SI 1 "register_operand" "%0,c,c")) > + (sign_extend:DI (match_operand:SI 2 "extend_operand" " c,cI,Cal"))) > + (reg:DI ARCV2_ACC))) > + (set (reg:DI ARCV2_ACC) > + (plus:DI > + (mult:DI (sign_extend:DI (match_dup 1)) > + (sign_extend:DI (match_dup 2))) > + (
[PATCH] Clarify mt_allocator documentation w.r.t deallocation
The interwebs seem confused by these docs, and think it means that mt_allocator just leaks memory instead of deallocating it. The interwebs are pretty dumb sometimes. * doc/xml/manual/mt_allocator.xml: Clarify deallocation behaviour. * doc/html/*: Regenerate. Committed to trunk. commit 4d46264542fef99c1eaf8d10e996c741f109ba44 Author: Jonathan Wakely Date: Mon May 8 16:04:53 2017 +0100 Clarify mt_allocator documentation w.r.t deallocation * doc/xml/manual/mt_allocator.xml: Clarify deallocation behaviour. * doc/html/*: Regenerate. diff --git a/libstdc++-v3/doc/xml/manual/mt_allocator.xml b/libstdc++-v3/doc/xml/manual/mt_allocator.xml index 12fe2ee..3254bf8 100644 --- a/libstdc++-v3/doc/xml/manual/mt_allocator.xml +++ b/libstdc++-v3/doc/xml/manual/mt_allocator.xml @@ -281,7 +281,8 @@ The _S_initialize() function: Notes about deallocation. This allocator does not explicitly -release memory. Because of this, memory debugging programs like +release memory back to the OS, but keeps its own freelists instead. +Because of this, memory debugging programs like valgrind or purify may notice leaks: sorry about this inconvenience. Operating systems will reclaim allocated memory at program termination anyway. If sidestepping this kind of noise is
Re: [C++ PATCH] namespace bindings
On Mai 08 2017, Nathan Sidwell wrote: > On 05/08/2017 08:07 AM, Andreas Schwab wrote: >> On Mai 05 2017, Nathan Sidwell wrote: >> >>> This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces >>> {get,set}_namespace_binding with get_namespace_value and set_global_value >>> respectively. >> >> I'm getting this regression on ia64: >> >> FAIL: g++.dg/template/virtual3.C -std=c++11 (test for excess errors) > > I'm not seeing this. On an ia64-linux target (hosted on x86_64-linux) I > get the attached output, which is the same as I get on an x86_64-linux > native build. g++.dg/cpp0x/defaulted34.C has the same problem. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
MinGW compilation warnings in libiberty's include/environ.h
When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I see the following warning: gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS -I. -I./../include -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic -D_GNU_SOURCE ./setenv.c -o setenv.o In file included from ./setenv.c:64:0: ./../include/environ.h:30:1: warning: function declaration isn't a prototype [-Wstrict-prototypes] extern char **environ; ^ This was already reported 4 years ago, here: https://gcc.gnu.org/ml/gcc-patches/2013-03/msg00471.html and was solved back then. But it looks like the offending code was copied to include/environ.h without the fix, and the warning is thus back. The problem is with this code in environ.h: #ifndef HAVE_ENVIRON_DECL # ifdef __APPLE__ # include # define environ (*_NSGetEnviron ()) # else extern char **environ; # endif # define HAVE_ENVIRON_DECL #endif which causes the MinGW compiler to see the declaration of environ, whereas MinGW's stdlib.h has this: #ifdef __MSVCRT__ # define _environ (*__p__environ()) extern _CRTIMP __cdecl __MINGW_NOTHROW char ***__p__environ(void); # define _wenviron (*__p__wenviron()) extern _CRTIMP __cdecl __MINGW_NOTHROW wchar_t ***__p__wenviron(void); #else /* ! __MSVCRT__ */ # ifndef __DECLSPEC_SUPPORTED # define _environ (*_imp___environ_dll) extern char ***_imp___environ_dll; # else /* __DECLSPEC_SUPPORTED */ # define _environ _environ_dll __MINGW_IMPORT char ** _environ_dll; # endif /* __DECLSPEC_SUPPORTED */ #endif /* ! __MSVCRT__ */ #define environ _environ Can this be fixed again, please? The solution, as back then, is this: #ifndef HAVE_ENVIRON_DECL # ifdef __APPLE__ # include # define environ (*_NSGetEnviron ()) # else # ifndef environ extern char **environ; # endif # endif # define HAVE_ENVIRON_DECL #endif Thanks.
MinGW compilation warnings in libiberty's waitpid.c
When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I see the following warning: gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS -I. -I./../include -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic -D_GNU_SOURCE ./waitpid.c -o waitpid.o ./waitpid.c: In function 'waitpid': ./waitpid.c:31:18: warning: implicit declaration of function 'wait' [-Wimplicit-function-declaration] int wpid = wait(stat_loc); ^ The file waitpid.c should not be built on MinGW, as it is not needed on Windows, and will not work if the function is called (because there's no 'wait' function on MS-Windows).
MinGW compilation warnings in libiberty's xstrndup.c
When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I see the following warning: gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS -I. -I./../include -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic -D_GNU_SOURCE ./xstrndup.c -o xstrndup.o ./xstrndup.c: In function 'xstrndup': ./xstrndup.c:51:16: warning: implicit declaration of function 'strnlen' [-Wimplicit-function-declaration] size_t len = strnlen (s, n); ^ This happens because libiberty.h uses incorrect guards for the prototype of strnlen: #if defined (HAVE_DECL_STRNLEN) && !HAVE_DECL_STRNLEN extern size_t strnlen (const char *, size_t); #endif It should use HAVE_STRNLEN instead, because that's the only strnlen-related macro defined in config.g when strnlen is probed by the configure script. Thanks.
[CHKP] Attempt to fix PR79765 (multiversioning and instrumentation)
Hi, I'm trying to fix the problem with function multiversioning and MPX instrumentation (PR79765) and I face several issues. I would appreciate your advice: The first problem that arises is that multiversioning tries to make versions out of thunks, which do not have bodies. This is fixed with this patch: diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c index c7fc3a0..5de11b3 100644 --- a/gcc/ipa-chkp.c +++ b/gcc/ipa-chkp.c @@ -718,6 +718,9 @@ chkp_produce_thunks (bool early) 0, CGRAPH_FREQ_BASE); node->create_reference (node->instrumented_version, IPA_REF_CHKP, NULL); + DECL_ATTRIBUTES (node->decl) + = remove_attribute ("target_clones", + DECL_ATTRIBUTES (node->decl)); /* Thunk shouldn't be a cdtor. */ DECL_STATIC_CONSTRUCTOR (node->decl) = 0; DECL_STATIC_DESTRUCTOR (node->decl) = 0; However, that is not all that is needed to be fixed. Consider case: __attribute__((target_clones("arch=core-avx2", "default"))) int test_fun() { return test_fun(); } Now, in multiple_target.c: create_dispatcher_calls we are replacing the call to test_fun with a call to the ifunc resolvler: (rr) p debug_function (node->decl, TDF_VOPS) __attribute__((target ("default"), target_clones ("arch=core-avx2", "default"))) test_fun () { [100.00%]: _2 = test_fun (); # VUSE <.MEM_1(D)> return _2; } (rr) p e $103 = (cgraph_edge *) 0x7f658d121340 (rr) p e->callee $104 = (rr) p e->caller $105 = (rr) p inode $106 = We do.. e->redirect_callee (inode); e->redirect_call_stmt_to_callee (); And after we have: (rr) p debug_function (node->decl, TDF_VOPS) __attribute__((target ("default"), target_clones ("arch=core-avx2", "default"))) test_fun () { [100.00%]: # .MEM = VDEF <.MEM> _2 = test_fun.ifunc (); # VUSE <.MEM_1(D)> return _2; } For MPX-instrumented code the outcome of the same procedure is different: (rr) p e->caller $5 = (rr) p e->callee $6 = (rr) p inode $7 = (rr) p debug_function (node->decl, TDF_VOPS) __attribute__((target ("default"), chkp instrumented, target_clones ("arch=core-avx2", "default"))) test_fun.chkp () { [100.00%]: _2 = test_fun.chkp (); # VUSE <.MEM_1(D)> return _2; } -- We don't have # .MEM = VDEF <.MEM> part and hence, we hit internal compiler error in verify_ssa, which fails. The reason for that is that in redirect_call_stmt_to_callee for instrumented code the path is different and update_stmt_fn is not called: if (e->callee->clone.combined_args_to_skip || skip_bounds) {..} else { .. update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt); } Calling this function after if-then-else fixes the issue: diff --git a/gcc/cgraph.c b/gcc/cgraph.c index e505b10..3dde20d 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1527,8 +1527,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void) { new_stmt = e->call_stmt; gimple_call_set_fndecl (new_stmt, e->callee->decl); - update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt); } + update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt); /* If changing the call to __cxa_pure_virtual or similar noreturn function, adjust gimple_call_fntype too. */ This call makes sense to me, but is it a correct thing to do? And finally, the third point. In the same if-then-else, in the "if" part there is a call to "gsi_replace", which uses "cfun", but targetclone is ipa pass and hence has cfun defined to NULL. So, is it a bug at all to use "gsi_replace" in ipa pass? thanks in advance, Alexander
Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr
Note that the new option will need documenting in invoke.texi for any patch version actually proposed for inclusion. -- Joseph S. Myers jos...@codesourcery.com
Re: [C++ PATCH] namespace bindings
On 05/08/2017 08:56 AM, Jason Merrill wrote: On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell wrote: This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces {get,set}_namespace_binding with get_namespace_value and set_global_value respectively. I'd prefer to stick with the _binding naming, as "value" is more ambiguous. I've reverted that bit of the change. nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell Revert _binding -> _value change. * name-lookup.h (get_namespace_value, set_global_value): Rename to ... (get_namespace_binding, set_global_binding): ... these. * name-lookup.c (get_namespace_value, set_global_value): Rename to ... (get_namespace_binding, set_global_binding): ... these. (arg_assoc_namespace, pushdecl_maybe_friend_1, check_for_out_of_scope_variable, push_overloaded_decl_1, lookup_name_innermost_nonclass_level, push_namespace): Adjust. * cp-tree.h (IDENTIFIER_GLOBAL_VALUE, SET_IDENTIFIER_GLOBAL_VALUE): Adjust. * decl.c (poplevel): Adjust. * pt.c (make_constrained_auto): Likewise. Index: cp-tree.h === --- cp-tree.h (revision 247739) +++ cp-tree.h (working copy) @@ -554,9 +554,9 @@ struct GTY(()) ptrmem_cst { typedef struct ptrmem_cst * ptrmem_cst_t; #define IDENTIFIER_GLOBAL_VALUE(NODE) \ - get_namespace_value (NULL_TREE, (NODE)) + get_namespace_binding (NULL_TREE, (NODE)) #define SET_IDENTIFIER_GLOBAL_VALUE(NODE, VAL) \ - set_global_value ((NODE), (VAL)) + set_global_binding ((NODE), (VAL)) #define CLEANUP_P(NODE) TREE_LANG_FLAG_0 (TRY_BLOCK_CHECK (NODE)) Index: decl.c === --- decl.c (revision 247739) +++ decl.c (working copy) @@ -693,7 +693,7 @@ poplevel (int keep, int reverse, int fun /*class_p=*/true); tree ns_binding = NULL_TREE; if (!ob) - ns_binding = get_namespace_value (current_namespace, name); + ns_binding = get_namespace_binding (current_namespace, name); if (ob && ob->scope == current_binding_level->level_chain) /* We have something like: Index: name-lookup.c === --- name-lookup.c (revision 247739) +++ name-lookup.c (working copy) @@ -215,7 +215,7 @@ arg_assoc_namespace (struct arg_lookup * if (arg_assoc_namespace (k, TREE_PURPOSE (value))) return true; - value = get_namespace_value (scope, k->name); + value = get_namespace_binding (scope, k->name); if (!value) return false; @@ -1250,7 +1250,7 @@ pushdecl_maybe_friend_1 (tree x, bool is /* In case this decl was explicitly namespace-qualified, look it up in its namespace context. */ if (DECL_NAMESPACE_SCOPE_P (x) && namespace_bindings_p ()) - t = get_namespace_value (DECL_CONTEXT (x), name); + t = get_namespace_binding (DECL_CONTEXT (x), name); else t = lookup_name_innermost_nonclass_level (name); @@ -1267,7 +1267,7 @@ pushdecl_maybe_friend_1 (tree x, bool is t = innermost_non_namespace_value (name); /* Or in the innermost namespace. */ if (! t) - t = get_namespace_value (DECL_CONTEXT (x), name); + t = get_namespace_binding (DECL_CONTEXT (x), name); /* Does it have linkage? Note that if this isn't a DECL, it's an OVERLOAD, which is OK. */ if (t && DECL_P (t) && ! (TREE_STATIC (t) || DECL_EXTERNAL (t))) @@ -1521,7 +1521,7 @@ pushdecl_maybe_friend_1 (tree x, bool is { tree decl; - decl = get_namespace_value (current_namespace, name); + decl = get_namespace_binding (current_namespace, name); if (decl && TREE_CODE (decl) == OVERLOAD) decl = OVL_FUNCTION (decl); @@ -1568,7 +1568,7 @@ pushdecl_maybe_friend_1 (tree x, bool is else { /* Here to install a non-global value. */ - tree oldglobal = get_namespace_value (current_namespace, name); + tree oldglobal = get_namespace_binding (current_namespace, name); tree oldlocal = NULL_TREE; cp_binding_level *oldscope = NULL; cxx_binding *oldbinding = outer_binding (name, NULL, true); @@ -1608,7 +1608,7 @@ pushdecl_maybe_friend_1 (tree x, bool is if (oldlocal == NULL_TREE) oldlocal - = get_namespace_value (current_namespace, DECL_NAME (d)); + = get_namespace_binding (current_namespace, DECL_NAME (d)); } /* If this is an extern function declaration, see if we @@ -1972,7 +1972,7 @@ check_for_out_of_scope_variable (tree de shadowed = DECL_HAS_SHADOWED_FOR_VAR_P (shadowed) ? DECL_SHADOWED_FOR_VAR (shadowed) : NULL_TREE; if (!shadowed) -shadowed = get_namespace_value (current_namespace, DECL_NAME (decl)); +shadowed = get_namespace_binding (current_namespace, DECL_NAME (decl)); if (shadowed) { if (!DECL_ERROR_REPORTED (decl)) @@ -2934,7 +2934,7 @@ push_overloaded_decl_1 (tree decl, int f int doing_global = (namespace_bindings_p () || !(flags & PUSH_LOCAL)); if (doing_global) -old = get_namespace_value (DECL_
Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading
On 05/08/2017 01:25 AM, Richard Biener via gcc-patches wrote: ps. An astute observer might note that improving the effectiveness of VRP jump threading seems counterproductive since I've stated I want to remove VRP jump threading. These improvements don't significantly change how I was planning to do that or the infrastructure we're going to rely upon to make that change. All that changes is where we get the information from -- ASSERT_EXPRs vs querying a new API that generates the same information as needed. That API is already there. It's called register_edge_assert_for (it might need a wrapper to be most useful and also needs to be exported / moved from VRP to somewhere else). Both VRP and EVRP use it to "compute" ASSERT_EXPRs As I mentioned in an earlier message, this has a lot of similarity with what Andrew has been doing. Furthermore, parts of this mimick code I've written in DOM and code we need for the backward threader to make it strong enough to eliminate the VRP threading code. I'd really like to be working on unifying all that work within the next few days (as originally hoping for today, but slightly side-tracked last week). So I'm not sure if changing VRP with your patches is a good thing when you could have used the new API in the first place ... I don't see that the changes to date around 78496 change things significantly in regards to the immediate plans to remove ASSERT_EXPR. The work around 78496 raises the bar in terms of what information we need to be able to extract, but that IMHO, is a fine thing to do ;-) However, the existence of register_edge_assert_for does change how I'm looking at the next issue for 78496 as well as how to tackle a host of related issues. It may end up being the case that we stop 78496 work after patch #2, work on ASSERT_EXPR removal, then re-eval 78496. Jeff
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On 05/08/2017 01:32 AM, Richard Biener wrote: Note that I tried last stage3 (it ended up being too late) to get rid of ASSERT_EXPRs doing substitute-and-fold itself (basically copy-propagate them out at this point rather than as a separate thing later). This is because the ASSERT_EXPR uses interfere with the single_use checks in match.pd patterns and thus are actually harmful. The barrier I ran into was the ASSERT_EXPR use by the threader ... so now you're making us rely even more on those :/ Perhaps, but it's a short term thing -- Andrew and I want to get rid of ASSERT_EXPRs too. I really think we all share that common goal. I can certainly see how they muck up the single_use checks. They get in the way of other things as well by hiding equivalency information. jeff
Re: [RFC] S/390: Alignment peeling prolog generation
> So the new part is the last point? There's a lot of refactoring in 3/3 that > makes it hard to see what is actually changed ... you need to resist > in doing this, it makes review very hard. The new part is actually spread across the three last "-"s. Attached is a new version of [3/3] split up into two patches with hopefully less blending of refactoring and new functionality. [3/4] Computes full costs when peeling for unknown alignment, uses either read or write and compares the better one with the peeling costs for known alignment. If the peeling for unknown alignment "aligns" more than twice the number of datarefs, it is preferred over the peeling for known alignment. [4/4] Computes the costs for no peeling and compares them with the costs of the best peeling so far. If it is not more expensive, no peeling will be performed. > I think it's always best to align a ref with known alignment as that simplifies > conditions and allows followup optimizations (unrolling of the > prologue / epilogue). > I think for this it's better to also compute full costs rather than relying on > sth as simple as "number of same aligned refs". > > Does the code ever end up misaligning a previously known aligned ref? The following case used to get aligned via the known alignment of dd but would not anymore since peeling for unknown alignment aligns two accesses. I guess the determining factor is still up for scrutiny and should probably > 2. Still, on e.g. s390x no peeling is performed due to costs. void foo(int *restrict a, int *restrict b, int *restrict c, int *restrict d, unsigned int n) { int *restrict dd = __builtin_assume_aligned (d, 8); for (unsigned int i = 0; i < n; i++) { b[i] = b[i] + a[i]; c[i] = c[i] + b[i]; dd[i] = a[i]; } } Regards Robin
[PATCH 3/4] Vect peeling cost model
gcc/ChangeLog: 2017-05-08 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling): Return peel info. (vect_enhance_data_refs_alignment): Compute full costs when peeling for unknown alignment, compare to costs for peeling for known alignment and choose the cheaper one. diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 7b68582..786f826 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -1342,7 +1342,7 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot, choosing an option with the lowest cost (if cost model is enabled) or the option that aligns as many accesses as possible. */ -static struct data_reference * +static struct _vect_peel_extended_info vect_peeling_hash_choose_best_peeling (hash_table *peeling_htab, loop_vec_info loop_vinfo, unsigned int *npeel, @@ -1369,7 +1369,7 @@ vect_peeling_hash_choose_best_peeling (hash_table *peeling_hta *npeel = res.peel_info.npeel; *body_cost_vec = res.body_cost_vec; - return res.peel_info.dr; + return res; } /* Return true if the new peeling NPEEL is supported. */ @@ -1520,6 +1520,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) enum dr_alignment_support supportable_dr_alignment; struct data_reference *dr0 = NULL, *first_store = NULL; struct data_reference *dr; + struct data_reference *dr0_known_align = NULL; unsigned int i, j; bool do_peeling = false; bool do_versioning = false; @@ -1527,7 +1528,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) gimple *stmt; stmt_vec_info stmt_info; unsigned int npeel = 0; - bool all_misalignments_unknown = true; + bool one_misalignment_known = false; + bool one_misalignment_unknown = false; unsigned int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo); unsigned possible_npeel_number = 1; tree vectype; @@ -1652,11 +1654,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) npeel_tmp += nelements; } - all_misalignments_unknown = false; - /* Data-ref that was chosen for the case that all the - misalignments are unknown is not relevant anymore, since we - have a data-ref with known alignment. */ - dr0 = NULL; + one_misalignment_known = true; } else { @@ -1664,35 +1662,32 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) peeling for data-ref that has the maximum number of data-refs with the same alignment, unless the target prefers to align stores over load. */ - if (all_misalignments_unknown) -{ - unsigned same_align_drs - = STMT_VINFO_SAME_ALIGN_REFS (stmt_info).length (); - if (!dr0 - || same_align_drs_max < same_align_drs) -{ - same_align_drs_max = same_align_drs; - dr0 = dr; -} - /* For data-refs with the same number of related - accesses prefer the one where the misalign - computation will be invariant in the outermost loop. */ - else if (same_align_drs_max == same_align_drs) - { - struct loop *ivloop0, *ivloop; - ivloop0 = outermost_invariant_loop_for_expr - (loop, DR_BASE_ADDRESS (dr0)); - ivloop = outermost_invariant_loop_for_expr - (loop, DR_BASE_ADDRESS (dr)); - if ((ivloop && !ivloop0) - || (ivloop && ivloop0 - && flow_loop_nested_p (ivloop, ivloop0))) - dr0 = dr; - } + unsigned same_align_drs + = STMT_VINFO_SAME_ALIGN_REFS (stmt_info).length (); + if (!dr0 + || same_align_drs_max < same_align_drs) + { + same_align_drs_max = same_align_drs; + dr0 = dr; + } + /* For data-refs with the same number of related + accesses prefer the one where the misalign + computation will be invariant in the outermost loop. */ + else if (same_align_drs_max == same_align_drs) + { + struct loop *ivloop0, *ivloop; + ivloop0 = outermost_invariant_loop_for_expr + (loop, DR_BASE_ADDRESS (dr0)); + ivloop = outermost_invariant_loop_for_expr + (loop, DR_BASE_ADDRESS (dr)); + if ((ivloop && !ivloop0) + || (ivloop && ivloop0 + && flow_loop_nested_p (ivloop, ivloop0))) + dr0 = dr; + } - if (!first_store && DR_IS_WRITE (dr)) -first_store = dr; -} + if (!first_store && DR_IS_WRITE (dr)) + first_store = dr; /* If there are both known and unknown misaligned accesses in the loop, we choose peeling amount according to the known @@ -1703,6 +1698,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) if (!first_store && DR_IS_WRITE (dr))
[PATCH 4/4] Vect peeling cost model
gcc/ChangeLog: 2017-05-08 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Remove unused variable. (vect_enhance_data_refs_alignment): Compare best peelings costs to doing no peeling and choose no peeling if equal. diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 786f826..67d2f57 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -1293,7 +1293,7 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot, { vect_peel_info elem = *slot; int dummy; - unsigned int inside_cost = 0, outside_cost = 0, i; + unsigned int inside_cost = 0, outside_cost = 0; gimple *stmt = DR_STMT (elem->dr); stmt_vec_info stmt_info = vinfo_for_stmt (stmt); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); @@ -1520,7 +1520,6 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) enum dr_alignment_support supportable_dr_alignment; struct data_reference *dr0 = NULL, *first_store = NULL; struct data_reference *dr; - struct data_reference *dr0_known_align = NULL; unsigned int i, j; bool do_peeling = false; bool do_versioning = false; @@ -1720,6 +1719,9 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) || loop->inner) do_peeling = false; + struct _vect_peel_extended_info peel_for_known_alignment; + struct _vect_peel_extended_info peel_for_unknown_alignment; + struct _vect_peel_extended_info best_peel; unsigned int unknown_align_inside_cost = UINT_MAX; unsigned int unknown_align_outside_cost = UINT_MAX; unsigned int unknown_align_count = 0; @@ -1731,74 +1733,72 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) /* Check if the target requires to prefer stores over loads, i.e., if misaligned stores are more expensive than misaligned loads (taking drs with same alignment into account). */ - if (first_store && DR_IS_READ (dr0)) -{ - unsigned int load_inside_cost = 0; - unsigned int load_outside_cost = 0; - unsigned int store_inside_cost = 0; - unsigned int store_outside_cost = 0; - stmt_vector_for_cost dummy; - dummy.create (2); - vect_get_peeling_costs_all_drs (dr0, - &load_inside_cost, - &load_outside_cost, - &dummy, vf / 2, vf); - dummy.release (); - + unsigned int load_inside_cost = 0; + unsigned int load_outside_cost = 0; + unsigned int store_inside_cost = 0; + unsigned int store_outside_cost = 0; + + stmt_vector_for_cost dummy; + dummy.create (2); + vect_get_peeling_costs_all_drs (dr0, + &load_inside_cost, + &load_outside_cost, + &dummy, vf / 2, vf); + dummy.release (); + + if (first_store) + { dummy.create (2); vect_get_peeling_costs_all_drs (first_store, &store_inside_cost, &store_outside_cost, &dummy, vf / 2, vf); dummy.release (); + } + else + { + store_inside_cost = UINT_MAX; + store_outside_cost = UINT_MAX; + } - if (load_inside_cost > store_inside_cost - || (load_inside_cost == store_inside_cost - && load_outside_cost > store_outside_cost)) - { - dr0 = first_store; - unknown_align_inside_cost = store_inside_cost; - unknown_align_outside_cost = store_outside_cost; - } - else - { - unknown_align_inside_cost = load_inside_cost; - unknown_align_outside_cost = load_outside_cost; - } - - stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec; - prologue_cost_vec.create (2); - epilogue_cost_vec.create (2); + if (load_inside_cost > store_inside_cost + || (load_inside_cost == store_inside_cost + && load_outside_cost > store_outside_cost)) + { + dr0 = first_store; + unknown_align_inside_cost = store_inside_cost; + unknown_align_outside_cost = store_outside_cost; + } + else + { + unknown_align_inside_cost = load_inside_cost; + unknown_align_outside_cost = load_outside_cost; + } - int dummy2; - unknown_align_outside_cost += vect_get_known_peeling_cost - (loop_vinfo, vf / 2, &dummy2, - &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), - &prologue_cost_vec, &epilogue_cost_vec); + stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec; + prologue_cost_vec.create (2); + epilogue_cost_vec.create (2); - prologue_cost_vec.release (); - epilogue_cost_vec.release (); + int dummy2; + unknown_align_outside_cost += vect_get_known_peeling_cost + (loop_vinfo, vf / 2, &dummy2, + &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), + &prologue_cost_vec, &epilogue_cost_vec); - unknown_align_count = 1 + STMT_VINFO_SAME_ALIGN_REFS - (vinfo_for_stmt (DR_STMT (dr0))).length (); -} + prologue_cost_vec.release (); + epilogue_cost_vec.release (); - /* In case there are only loads with different unknown misalignments, use - peeling only if it may help to align
Re: [PATCH] [PR79542][Ada] Fix ICE in dwarf2out.c with nested func. inlining
On Wed, May 3, 2017 at 3:22 PM, Jason Merrill wrote: > On Tue, Mar 14, 2017 at 8:24 AM, Pierre-Marie de Rodat > wrote: >> Hello, >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79542 reports an ICE in >> dwarf2out.c for an Ada testcase built with optimization. >> >> This crash happens during the late generation pass because >> add_gnat_descriptive_type cannot find the type DIE corresponding to some >> descriptive type after having tried to generate it. This is because the >> DIE was generated during the early generation pass, but then pruned by >> the type pruning machinery. So why was it pruned? >> >> We are in a situation where we have cloned types (because of inlining, >> IIUC) whose TYPE_NAME have non-null DECL_ABSTRACT_ORIGIN attributes. As >> a consequence: >> >> * In modified_type_die, the "handle C typedef types" part calls >> gen_type_die on the cloned type. >> >> * gen_type_die matches a typedef variant, and then calls gen_decl_die >> on its TYPE_NAME, which will end up calling gen_typedef_die. >> >> * gen_typedef_die checks decl_ultimate_origin for this TYPE_DECL, and >> finds one, so it only adds a DW_AT_abstract_origin attribute to the >> DW_TAG_typedef DIE, but the cloned type itself does not get its own >> DIE. > > That seems like a bug; if gen_typedef_die is going to generate a DIE > for a cloned typedef, it needs to associate the type with the DIE. > >> * Back in modified_type_die, the call to lookup_type_die on the type >> passed to gen_type_die returns NULL. > >> In the end, whole type trees, i.e. the ones referenced by >> DECL_ABSTRACT_ORIGIN attributes, are never referenced from type pruning >> "roots" and are thus pruned. The descriptive type at stake here is one >> of them, hence the assertion failure. >> >> This patch attemps to fix that with what seems to be the most sensible >> thing to do in my opinion: updating the "handle C typedef types" part in >> modified_type_die to check decl_ultimate_origin before calling >> gen_type_die: if that function returns something not null, then we know >> that gen_type_die/gen_typedef_die will not generate a DIE for the input >> type, so we try to process the ultimate origin instead. > > This soundsn good; the DWARF standard says that we don't need to have > a die at all for the cloned typedef. > >> @@ -12496,6 +12496,18 @@ modified_type_die (tree type, int cv_quals, bool >> reverse, >> >>if (qualified_type == dtype) >> { >> + tree origin >> + = TYPE_NAME (qualified_type) == NULL >> +? NULL >> +: decl_ultimate_origin (TYPE_NAME (qualified_type)); > > This is unnecessarily complicated; at this point we already know that > TYPE_NAME (qualified_type) is non-null and in the variable "name". > >> + /* Typedef variants that have an abstract origin don't get their >> own >> +type DIE (see gen_typedef_die), so fall back on the ultimate > > gen_typedef_die does create a DIE for them, it just doesn't do it > properly. But we could change gen_typedef_die to abort in that case, > making this comment correct. Something like this, which also avoids routinely creating DIEs for local typedefs that will only be pruned away later; this patch doesn't change the size of .debug_info in cc1plus. commit 56cce11181d1296e43cb4d603fc8efc6ac2570fa Author: Jason Merrill Date: Thu May 4 15:00:51 2017 -0400 inline-static diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 6caf598..bf6a65b 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -24355,7 +24355,7 @@ gen_typedef_die (tree decl, dw_die_ref context_die) type_die = new_die (DW_TAG_typedef, context_die, decl); origin = decl_ultimate_origin (decl); if (origin != NULL) -add_abstract_origin_attribute (type_die, origin); +gcc_unreachable (), add_abstract_origin_attribute (type_die, origin); else { tree type = TREE_TYPE (decl); @@ -24858,6 +24858,16 @@ process_scope_var (tree stmt, tree decl, tree origin, dw_die_ref context_die) else die = NULL; + if ((origin || DECL_ABSTRACT_ORIGIN (decl)) + && (TREE_CODE (decl_or_origin) == TYPE_DECL + || (VAR_P (decl_or_origin) && TREE_STATIC (decl_or_origin +{ + origin = decl_ultimate_origin (decl_or_origin); + if (decl && VAR_P (decl)) + equate_decl_number_to_die (decl, lookup_decl_die (origin)); + return; +} + if (die != NULL && die->die_parent == NULL) add_child_die (context_die, die); else if (TREE_CODE (decl_or_origin) == IMPORTED_DECL)
Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch
On 05/03/2017 08:32 AM, Richard Biener wrote: > As for Bernhards concern I share this -- please intead make the > interface take either a gimple_seq or a gimple_stmt_iterator > instead of a basic-block. That makes it more obvious you > can't use things like gsi_after_labels. Also I think it's more > natural to work backwards as the last stmt in the sequence > _has_ to be __builtin_unreachable () and thus checking that first > is the cheapest thing to do given that in most cases it will > not be __builtin_unreachable () (but a noreturn call or an > inifinite loop). > > Thus, name it gimple_seq_unreachable_p. So you mean something like the following? /* Returns true if the sequence of statements STMTS only contains a call to __builtin_unreachable (). */ bool gimple_seq_unreachable_p (gimple_seq stmts) { gimple_stmt_iterator gsi = gsi_last (stmts); if (!gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_UNREACHABLE)) return false; for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi)) { gimple *stmt = gsi_stmt (gsi); if (gimple_code (stmt) != GIMPLE_LABEL && !is_gimple_debug (stmt) && !gimple_clobber_p (stmt)) return false; } return true; } > On Wed, 26 Apr 2017, Peter Bergner wrote: >> One difference from the last patch is that I am no longer setting >> default_label to NULL when we emit a decision tree. I noticed that >> the decision tree code seemed to generate slightly better code for >> some of my unit tests if I left it alone. This simplified the >> patch somewhat by removing the changes to emit_case_nodes(). [snip] > > Can you do the gimple_unreachable_bb_p check earlier in > expand_case so it covers the emit_case_decision_tree path as well > (and verify that works, of course)? So basically right at > > /* Find the default case target label. */ > default_label = jump_target_rtx > (CASE_LABEL (gimple_switch_default_label (stmt))); > edge default_edge = EDGE_SUCC (bb, 0); > int default_prob = default_edge->probability; > > handle this case. That is what the previous patch did, but as I mention above, we generate slightly better code for some test cases (other tests seemed to generate the same code) if we don't attempt to handle the decision tree case. I'll note that the current unpatched compiler already knows how to remove unreachable case statement blocks when we expand to a decision tree. I can add that code back if you think that it will have a positive benefit for some test case I haven't tried yet. Peter
Re: [PATCH 1/2] C++ template type diff printing
On Mon, 2017-05-08 at 07:36 -0400, Nathan Sidwell wrote: > On 05/04/2017 07:44 PM, David Malcolm wrote: > > This patch kit implements two new options to make it easier > > to read diagnostics involving mismatched template types: > >-fdiagnostics-show-template-tree and > >-fno-elide-type. > > > > It adds two new formatting codes: %H and %I which are > > equivalent to %qT, but are to be used together for type > > comparisons e.g. > >"can't convert from %H to %I". > > While I can see why one might want %H and %I to imply quoting, it > seems > confusingly inconsistent with the other formatters, which require an > explicit 'q'. This an implementation detail leaking through, but I *think* it's fixable. The implementation has to jump through some hoops to interact with pp_format, because the two codes interact with each other: in order to elide commonality and colorize differences we can't start printing the %H until we've seen the %I (vice versa should work also). The 'q' for quoting the next format code happens in the 2nd phase of pp_format: if 'q' is present, then a quote (and potentially colorization) is emitted to the chunk for the item before and after the item's main content is emitted to the chunk. To make the interaction of %H and %I work, we have to delay the writing to the chunk to a new phase between phases 2 and 3 (and stash a pointer to the chunk we're going to write to). As written, if one uses '%qH' and '%qI', then phase 2 writes the open and close quotes to the chunks, and then the delayed handler blows that away and overwrites the chunk content with (forcibly) quoted content for the types. So I think it can work if we add a "needs quoting" flag to the postprocessing phase, if we need to handle the case where %H and %I ever appear without 'q' (and have the delayed handling stash that flag, and do the quoting there). I'll look at implementing that. > > rather than printing: > > > >could not convert 'std::map >()' > > from 'std::map >' to 'std::map > std::vector >' > > > > with -felide-type (the default), it prints: > > > >could not convert 'std::map >()' > > from 'map<[...],vector>' to 'map<[...],vector> > > Neat. > > Is '-felide-type' a good name? Wouldn't something like > '-fdiagnostic-elide-template-args' be better? Here I'm merely copying clang's option name. http://clang.llvm.org/docs/UsersManual.html#cmdoption-fno-elide-type > > With -fdiagnostics-show-template-tree a tree-like structure of the > > template is printed, showing the differences; in this case: > > 'tree' sounds implementor-speaky, Perhaps > '-fdiagnostic-expand-template-args' or something? As Jason noted, this is a user-visible thing that's tree-like, rather than our "tree" type. Also, I'm (shamelessly) copying clang's name for this. > Right, bikeshedding out of the way ... > > > --- a/gcc/cp/error.c > > +++ b/gcc/cp/error.c > > @@ -99,7 +99,47 @@ static void cp_diagnostic_starter > > (diagnostic_context *, diagnostic_info *); > > static void cp_print_error_function (diagnostic_context *, > > diagnostic_info *); > > > > > + > > +/* Return true iff TYPE_A and TYPE_B are template types that are > > + meaningful to compare. */ > > + > > +static bool > > +comparable_template_types_p (tree type_a, tree type_b) > > +{ > > + if (TREE_CODE (type_a) != RECORD_TYPE) > > +return false; > > + if (TREE_CODE (type_b) != RECORD_TYPE) > > +return false; > > CLASS_TYPE_P (type_a) etc? I'll use this in the next version. > > + int len_max = MAX (len_a, len_b); > > + for (int idx = 0; idx < len_max; idx++) > > +{ > > + tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : > > NULL; > > + tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : > > NULL; > > + if (arg_a == arg_b) > > Should this use same_type_p for types? If direct comparison is > correct, > a comment would help. What about non-type template arguments? That said, I just tried one, and it's handled poorly by my patch. I'll try to address for the next version. > > + Only non-default template arguments are printed. */ > > + > > +static void > > +print_template_tree_comparison (pretty_printer *pp, tree type_a, > > tree type_b, > > > + for (int idx = 0; idx < len_max; idx++) > > +{ > > + tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : > > NULL; > > + tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : > > NULL; > > + if (arg_a == arg_b) > > Same question. If these have to be comparable template type, when > can > len_a and len_b be different? I guess I was thinking about variadic templates. I'll have another look at how these are handled. > > + This is called in phase 2 of pp_format, when it is accumulating > > + a series of formatted chunks. We stash the location of the > > chunk > > + we're meant to have written to, so that we can write to it in > > the > > + m_format_postprocessor hook. */ > > + > > +static void > > +def
Re: [patch, fortran] Reduce stack use in blocked matmul
On 05/05/2017 01:31 PM, Thomas Koenig wrote: > Hello world, > > the attached patch reduces the stack usage by the blocked > version of matmul for cases where we don't need the full buffer. > This should improve stack usage. > > Regression-tested. I also added a stress test (around 3 secs of > CPU time on my system), it will only run once due to the "dg-do run" > hack). > > OK for trunk? > OK, thanks. Jerry
Re: [C++ PATCH] namespace bindings
On 05/08/2017 11:17 AM, Andreas Schwab wrote: g++.dg/cpp0x/defaulted34.C has the same problem. I'd managed to flub my xcompiler build. The code path for fn-descriptor-using targets is different, and I'd missed that the shadowed outer 'fn' was expected to have been initialized in that case. Committed the attached fix. nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell * class.c (build_vtbl_initializer): Don't shadow outer variable with static var. Index: cp/class.c === --- cp/class.c (revision 247746) +++ cp/class.c (working copy) @@ -9769,18 +9769,19 @@ build_vtbl_initializer (tree binfo, /* Likewise for deleted virtuals. */ else if (DECL_DELETED_FN (fn_original)) { - static tree fn; + static tree dvirt_fn; - if (!fn) + if (!dvirt_fn) { tree name = get_identifier ("__cxa_deleted_virtual"); - fn = IDENTIFIER_GLOBAL_VALUE (name); - if (!fn) - fn = push_library_fn + dvirt_fn = IDENTIFIER_GLOBAL_VALUE (name); + if (!dvirt_fn) + dvirt_fn = push_library_fn (name, build_function_type_list (void_type_node, NULL_TREE), NULL_TREE, ECF_NORETURN); } + fn = dvirt_fn; if (!TARGET_VTABLE_USES_DESCRIPTORS) init = fold_convert (vfunc_ptr_type_node, build_fold_addr_expr (fn)); @@ -9789,7 +9790,8 @@ build_vtbl_initializer (tree binfo, { if (!integer_zerop (delta) || vcall_index) { - fn = make_thunk (fn, /*this_adjusting=*/1, delta, vcall_index); + fn = make_thunk (fn, /*this_adjusting=*/1, + delta, vcall_index); if (!DECL_NAME (fn)) finish_thunk (fn); }
Re: Test cases to check OpenACC offloaded function's attributes and classification
Hi! Ping. On Thu, 4 Aug 2016 16:06:10 +0200, I wrote: > Ping. > > On Wed, 27 Jul 2016 10:59:02 +0200, I wrote: > > OK for trunk? (In the mean time, I also added some more testing.) commit b7d61270dfc581a6ea130f7a4fa7506a0a5762d8 Author: Thomas Schwinge Date: Mon May 8 18:22:50 2017 +0200 Test cases to check OpenACC offloaded function's attributes and classification gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: New file. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. --- .../goacc/classify-kernels-unparallelized.c| 39 .../c-c++-common/goacc/classify-kernels.c | 35 ++ .../c-c++-common/goacc/classify-parallel.c | 28 +++ .../c-c++-common/goacc/classify-routine.c | 30 .../goacc/classify-kernels-unparallelized.f95 | 41 ++ .../gfortran.dg/goacc/classify-kernels.f95 | 37 +++ .../gfortran.dg/goacc/classify-parallel.f95| 30 .../gfortran.dg/goacc/classify-routine.f95 | 29 +++ 8 files changed, 269 insertions(+) diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c new file mode 100644 index 000..a76351c --- /dev/null +++ gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c @@ -0,0 +1,39 @@ +/* Check offloaded function's attributes and classification for unparallelized + OpenACC kernels. */ + +/* { dg-additional-options "-O2" } + { dg-additional-options "-fdump-tree-ompexp" } + { dg-additional-options "-fdump-tree-parloops1-all" } + { dg-additional-options "-fdump-tree-oaccdevlow" } */ + +#define N 1024 + +extern unsigned int *__restrict a; +extern unsigned int *__restrict b; +extern unsigned int *__restrict c; + +/* An "extern"al mapping of loop iterations/array indices makes the loop + unparallelizable. */ +extern unsigned int f (unsigned int); + +void KERNELS () +{ +#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N]) + for (unsigned int i = 0; i < N; i++) +c[i] = a[f (i)] + b[f (i)]; +} + +/* Check the offloaded function's attributes. + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } } */ + +/* Check that exactly one OpenACC kernels construct is analyzed, and that it + can't be parallelized. + { dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } } + { dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" } } */ + +/* Check the offloaded function's classification and compute dimensions (will + always be 1 x 1 x 1 for non-offloading compilation). + { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 "oaccdevlow" } } + { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */ diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels.c gcc/testsuite/c-c++-common/goacc/classify-kernels.c new file mode 100644 index 000..199a73e --- /dev/null +++ gcc/testsuite/c-c++-common/goacc/classify-kernels.c @@ -0,0 +1,35 @@ +/* Check offloaded function's attributes and classification for OpenACC + kernels. */ + +/* { dg-additional-options "-O2" } + { dg-additional-options "-fdump-tree-ompexp" } + { dg-additional-options "-fdump-tree-parloops1-all" } + { dg-additional-options "-fdump-tree-oaccdevlow" } */ + +#define N 1024 + +extern unsigned int *__restrict a; +extern unsigned int *__restrict b; +extern unsigned int *__restrict c; + +void KERNELS () +{ +#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N]) + for (unsigned int i = 0; i < N; i++) +c[i] = a[i] + b[i]; +} + +/* Check the offloaded function's attributes. + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target entrypoint\\)\\)" 1 "ompexp" } } */ + +/* Check that exactly one OpenACC kernels construct is analyzed, and that it + can be parallelized. + { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } } + { dg-final { scan-tree-dum
[PATCH v2,rs6000] Add built-in function support for compare bytes instruction
This patch adds support for the compare bytes instruction, which has been available in the rs6000 architecture since Power6. Thank you to Segher Boessenkool for feedback on the original submission of this patch. The following refinements have been incorporated: 1. Changed the implementation and documentation to present a single overloaded function that handles either 32-bit or 64-bit arguments. 2. Corrected the spelling of compare in the comment describing the RS6000_BTM_CMPB macro. In response to reviewer question of whether this line is too long: it is not. It only appears that way due to alignment of tabs in the diff output. The patch has been bootstrapped and tested on powerpc64le-unknown-linux and powerpc-unknown-linux (big-endian, with both -m32 and -m64 target options) with no regressions. Is this ok for the trunk? gcc/testsuite/ChangeLog: 2017-05-08 Kelvin Nilsen * gcc.target/powerpc/cmpb-1.c: New test. * gcc.target/powerpc/cmpb-2.c: New test. * gcc.target/powerpc/cmpb-3.c: New test. * gcc.target/powerpc/cmpb32-1.c: New test. * gcc.target/powerpc/cmpb32-2.c: New test. gcc/ChangeLog: 2017-05-08 Kelvin Nilsen * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add array entries to represent two legal parameterizations of the overloaded __builtin_cmpb function, as represented by the P6_OV_BUILTIN_CMPB constant. (altivec_resolve_overloaded_builtin): Add special case handling for the __builtin_cmpb function, as represented by the P6_OV_BUILTIN_CMPB constant. * config/rs6000/rs6000-builtin.def (BU_P6_2): New macro. (BU_P6_64BIT_2): New macro. (BU_P6_OVERLOAD_2): New macro (CMPB_32): Add 32-bit compare-bytes support for 32-bit only targets. (CMPB): Add 64-bit compare-bytes support for 32-bit and 64-bit targets. (CMPB): Add overload support to represent both 32-bit and 64-bit compare-bytes function. * config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add support for TARGET_CMPB. * config/rs6000/rs6000.h: Add support for RS6000_BTM_CMPB. * doc/extend.texi (PowerPC AltiVec Built-in Functions): Add documentation of the __builtin_cmpb overloaded built-in function. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 247069) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -3788,6 +3788,7 @@ HOST_WIDE_INT rs6000_builtin_mask_calculate (void) { return (((TARGET_ALTIVEC)? RS6000_BTM_ALTIVEC : 0) + | ((TARGET_CMPB) ? RS6000_BTM_CMPB : 0) | ((TARGET_VSX) ? RS6000_BTM_VSX : 0) | ((TARGET_SPE) ? RS6000_BTM_SPE : 0) | ((TARGET_PAIRED_FLOAT) ? RS6000_BTM_PAIRED: 0) Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 247069) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -2717,6 +2717,7 @@ extern int frame_pointer_needed; aren't in target_flags. */ #define RS6000_BTM_ALWAYS 0 /* Always enabled. */ #define RS6000_BTM_ALTIVEC MASK_ALTIVEC/* VMX/altivec vectors. */ +#define RS6000_BTM_CMPBMASK_CMPB /* ISA 2.05: compare bytes. */ #define RS6000_BTM_VSX MASK_VSX/* VSX (vector/scalar). */ #define RS6000_BTM_P8_VECTOR MASK_P8_VECTOR /* ISA 2.07 vector. */ #define RS6000_BTM_P9_VECTOR MASK_P9_VECTOR /* ISA 3.0 vector. */ Index: gcc/config/rs6000/rs6000-c.c === --- gcc/config/rs6000/rs6000-c.c(revision 247069) +++ gcc/config/rs6000/rs6000-c.c(working copy) @@ -5348,6 +5348,11 @@ const struct altivec_builtin_types altivec_overloa RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { P6_OV_BUILTIN_CMPB, P6_BUILTIN_CMPB_32, +RS6000_BTI_UINTSI, RS6000_BTI_UINTSI, RS6000_BTI_UINTSI, 0 }, + { P6_OV_BUILTIN_CMPB, P6_BUILTIN_CMPB, +RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, 0 }, + { P8V_BUILTIN_VEC_VUPKHSW, P8V_BUILTIN_VUPKHSW, RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 }, { P8V_BUILTIN_VEC_VUPKHSW, P8V_BUILTIN_VUPKHSW, @@ -6409,25 +6414,76 @@ altivec_resolve_overloaded_builtin (location_t loc for (desc = altivec_overloaded_builtins; desc->code && desc->code != fcode; desc++) continue; - -/* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in - the opX fields. */ -for (; desc->code == fcode; desc++) + +/* Need to special case __builtin_cmp because the overloaded forms + of this function take (unsigned int, unsigned int) or (unsigned + long long int, unsigned long long int). Since C co
[PATCH] Tweak static assertions in std::optional
This simplifies the conditions of some static assertions, using one assertion per condition instead of asserting the conjunction. This means we don't need any string literal, because the conditions are quite readable now. * include/std/optional: Use a separate static_assert per condition. * testsuite/20_util/optional/cons/value_neg.cc: Update dg-error line numbers. Tested powerpc64le-linux, committed to trunk. commit 90daca5d8383567deaff9817c4cca92309573311 Author: Jonathan Wakely Date: Mon May 8 16:23:44 2017 +0100 Tweak static assertions in std::optional * include/std/optional: Use a separate static_assert per condition. * testsuite/20_util/optional/cons/value_neg.cc: Update dg-error line numbers. diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional index 1724120..5aa926f 100644 --- a/libstdc++-v3/include/std/optional +++ b/libstdc++-v3/include/std/optional @@ -462,10 +462,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Unique tag type. optional<_Tp>> { - static_assert(__and_<__not_, nullopt_t>>, - __not_, in_place_t>>, - __not_>>(), -"Invalid instantiation of optional"); + static_assert(!is_same_v, nullopt_t>); + static_assert(!is_same_v, in_place_t>); + static_assert(!is_reference_v<_Tp>); private: using _Base = _Optional_base<_Tp>; @@ -756,9 +755,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION constexpr _Tp value_or(_Up&& __u) const& { - static_assert(__and_, - is_convertible<_Up&&, _Tp>>(), - "Cannot return value"); + static_assert(is_copy_constructible_v<_Tp>); + static_assert(is_convertible_v<_Up&&, _Tp>); return this->_M_is_engaged() ? this->_M_get() @@ -769,9 +767,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Tp value_or(_Up&& __u) && { - static_assert(__and_, - is_convertible<_Up&&, _Tp>>(), - "Cannot return value" ); + static_assert(is_move_constructible_v<_Tp>); + static_assert(is_convertible_v<_Up&&, _Tp>); return this->_M_is_engaged() ? std::move(this->_M_get()) diff --git a/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc b/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc index 87907f9..5abe26e 100644 --- a/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc +++ b/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc @@ -37,8 +37,8 @@ int main() std::optional> oup2 = new int; // { dg-error "conversion" } struct U { explicit U(std::in_place_t); }; std::optional ou(std::in_place); // { dg-error "no matching" } -// { dg-error "no type" "" { target { *-*-* } } 488 } -// { dg-error "no type" "" { target { *-*-* } } 498 } -// { dg-error "no type" "" { target { *-*-* } } 555 } +// { dg-error "no type" "" { target { *-*-* } } 487 } +// { dg-error "no type" "" { target { *-*-* } } 497 } +// { dg-error "no type" "" { target { *-*-* } } 554 } } }
Re: [PATCH] Don't assume __secure_getenv is available
On 05/07/2017 11:37 PM, Janne Blomqvist via fortran wrote: > PING > > On Thu, Apr 27, 2017 at 9:55 PM, Janne Blomqvist > wrote: >> On Thu, Apr 27, 2017 at 9:50 PM, Janne Blomqvist >> wrote: >> [snip] >> >> And on top of that patch this simple typo fix: >> >> diff --git a/libgfortran/runtime/environ.c b/libgfortran/runtime/environ.c >> index 969dcdf..f488e87 100644 >> --- a/libgfortran/runtime/environ.c >> +++ b/libgfortran/runtime/environ.c >> @@ -46,7 +46,7 @@ static char* weak_secure_getenv (const char*) >> char * >> secure_getenv (const char *name) >> { >> -#if SUPPORTS_WEAKREF && defined(HAVE__SECURE_GETENV) >> +#if SUPPORTS_WEAKREF && defined(HAVE___SECURE_GETENV) >>if (weak_secure_getenv) >> return weak_secure_getenv (name); >> #endif >> >> >> >> -- >> Janne Blomqvist > > > Looks Ok to me. I thought you might be waiting for Jakub to reply. Thanks, Jerry
[PATCH] correct length stpncpy handling in -Wformat-overflow (PR 80669)
The -Wformat-overflow warning newly enhanced in GCC 8.0 to detect reading past the end of the source sequence misinterprets the size argument to stpncpy as a request to read that many bytes from the source sequence, rather than the number of bytes to write. Like strncpy, the function never reads past the end of the source string. This constraint is already handled correctly for strncpy by the check_sizes function so the fix is to simply remove the wrong length handling from expand_builtin_stpncpy and let check_sizes do the right thing. The attached patch passes bootstrap and regression test on x86_64. Martin PR middle-end/80669 - Bad -Wstringop-overflow warnings for stpncpy gcc/ChangeLog: PR middle-end/80669 * builtins.c (expand_builtin_stpncpy): Simplify. gcc/testsuite/ChangeLog: PR middle-end/80669 * gcc.dg/builtin-stpncpy.c: New test. diff --git a/gcc/builtins.c b/gcc/builtins.c index 936fcee..4f6c9c4 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3788,31 +3788,18 @@ expand_builtin_stpncpy (tree exp, rtx) || !warn_stringop_overflow) return NULL_RTX; + /* The source and destination of the call. */ tree dest = CALL_EXPR_ARG (exp, 0); tree src = CALL_EXPR_ARG (exp, 1); - /* The number of bytes to write (not the maximum). */ + /* The exact number of bytes to write (not the maximum). */ tree len = CALL_EXPR_ARG (exp, 2); - /* The length of the source sequence. */ - tree slen = c_strlen (src, 1); - - /* Try to determine the range of lengths that the source expression - refers to. */ - tree lenrange[2]; - if (slen) -lenrange[0] = lenrange[1] = slen; - else -{ - get_range_strlen (src, lenrange); - slen = lenrange[0]; -} + /* The size of the destination object. */ tree destsize = compute_objsize (dest, warn_stringop_overflow - 1); - /* The number of bytes to write is LEN but check_sizes will also - check SLEN if LEN's value isn't known. */ check_sizes (OPT_Wstringop_overflow_, - exp, len, /*maxlen=*/NULL_TREE, slen, destsize); + exp, len, /*maxlen=*/NULL_TREE, src, destsize); return NULL_RTX; } diff --git a/gcc/testsuite/gcc.dg/builtin-stpncpy.c b/gcc/testsuite/gcc.dg/builtin-stpncpy.c new file mode 100644 index 000..e4290d5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-stpncpy.c @@ -0,0 +1,74 @@ +/* PR tree-optimization/80669 - Bad -Wstringop-overflow warnings for stpncpy + { dg-do compile } + { dg-options "-O2 -Wall" } */ + +#define SIZE_MAX __SIZE_MAX__ + +typedef __SIZE_TYPE__ size_t; + +void sink (char*); + +#define stpncpy (d, s, n) sink (__builtin_stpncpy (d, s, n)) + +size_t value (void); + +size_t range (size_t min, size_t max) +{ + size_t val = value (); + return val < min || max < val ? min : val; +} + +/* Verify that no warning is issued for stpncpy with constant size. */ +void test_cst (char *d) +{ + __builtin_stpncpy (d, "123", 0); + __builtin_stpncpy (d, "123", 1); + __builtin_stpncpy (d, "123", 2); + __builtin_stpncpy (d, "123", 3); + __builtin_stpncpy (d, "123", 4); + __builtin_stpncpy (d, "123", 5); + __builtin_stpncpy (d, "123", 999); + + size_t n = SIZE_MAX / 2; + + __builtin_stpncpy (d, "123", n); + + __builtin_stpncpy (d, "123", n + 1);/* { dg-warning "specified size \[0-9\]+ exceeds maximum object size \[0-9\]+" } */ +} + + +/* Verify that no warning is issued for stpncpy with size in some range. */ +void test_rng (char *d) +{ +#define R(min, max) range (min, max) + + __builtin_stpncpy (d, "123", R (0, 1)); + __builtin_stpncpy (d, "123", R (0, 2)); + __builtin_stpncpy (d, "123", R (0, 3)); + __builtin_stpncpy (d, "123", R (0, 4)); + __builtin_stpncpy (d, "123", R (0, 5)); + + __builtin_stpncpy (d, "123", R (1, 2)); + __builtin_stpncpy (d, "123", R (1, 3)); + __builtin_stpncpy (d, "123", R (1, 4)); + __builtin_stpncpy (d, "123", R (1, 5)); + + __builtin_stpncpy (d, "123", R (2, 3)); + __builtin_stpncpy (d, "123", R (2, 4)); + __builtin_stpncpy (d, "123", R (2, 5)); + + __builtin_stpncpy (d, "123", R (3, 4)); + __builtin_stpncpy (d, "123", R (3, 5)); + + __builtin_stpncpy (d, "123", R (4, 5)); + + __builtin_stpncpy (d, "123", R (5, 6)); + + __builtin_stpncpy (d, "123", R (12345, 23456)); + + size_t n = SIZE_MAX / 2; + + __builtin_stpncpy (d, "123", R (n - 1, n + 1)); + + __builtin_stpncpy (d, "123", R (n + 1, n + 2)); /* { dg-warning "specified size between \[0-9\]+ and \[0-9\]+ exceeds maximum object size \[0-9\]+" } */ +}
Re: [PATCH 1/2] C++ template type diff printing
On 05/08/2017 12:51 PM, David Malcolm wrote: On Mon, 2017-05-08 at 07:36 -0400, Nathan Sidwell wrote: On 05/04/2017 07:44 PM, David Malcolm wrote: Is '-felide-type' a good name? Wouldn't something like '-fdiagnostic-elide-template-args' be better? Here I'm merely copying clang's option name. http://clang.llvm.org/docs/UsersManual.html#cmdoption-fno-elide-type Whelp, that's ... unfortunate. nathan -- Nathan Sidwell
Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch
On May 8, 2017 6:41:01 PM GMT+02:00, Peter Bergner wrote: >On 05/03/2017 08:32 AM, Richard Biener wrote: > > As for Bernhards concern I share this -- please intead make the > > interface take either a gimple_seq or a gimple_stmt_iterator > > instead of a basic-block. That makes it more obvious you > > can't use things like gsi_after_labels. Also I think it's more > > natural to work backwards as the last stmt in the sequence > > _has_ to be __builtin_unreachable () and thus checking that first > > is the cheapest thing to do given that in most cases it will > > not be __builtin_unreachable () (but a noreturn call or an > > inifinite loop). > > > > Thus, name it gimple_seq_unreachable_p. > >So you mean something like the following? Yes. > >/* Returns true if the sequence of statements STMTS only contains >a call to __builtin_unreachable (). */ > >bool >gimple_seq_unreachable_p (gimple_seq stmts) >{ > gimple_stmt_iterator gsi = gsi_last (stmts); > > if (!gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_UNREACHABLE)) > return false; > > for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi)) > { > gimple *stmt = gsi_stmt (gsi); > if (gimple_code (stmt) != GIMPLE_LABEL > && !is_gimple_debug (stmt) > && !gimple_clobber_p (stmt)) > return false; > } > return true; >} > > > > > On Wed, 26 Apr 2017, Peter Bergner wrote: > >> One difference from the last patch is that I am no longer setting > >> default_label to NULL when we emit a decision tree. I noticed that > >> the decision tree code seemed to generate slightly better code for > >> some of my unit tests if I left it alone. This simplified the > >> patch somewhat by removing the changes to emit_case_nodes(). >[snip] > > > > Can you do the gimple_unreachable_bb_p check earlier in > > expand_case so it covers the emit_case_decision_tree path as well > > (and verify that works, of course)? So basically right at > > > > /* Find the default case target label. */ > > default_label = jump_target_rtx > > (CASE_LABEL (gimple_switch_default_label (stmt))); > > edge default_edge = EDGE_SUCC (bb, 0); > > int default_prob = default_edge->probability; > > > > handle this case. > >That is what the previous patch did, but as I mention above, >we generate slightly better code for some test cases (other >tests seemed to generate the same code) if we don't attempt >to handle the decision tree case. I'll note that the current >unpatched compiler already knows how to remove unreachable >case statement blocks when we expand to a decision tree. > >I can add that code back if you think that it will have a >positive benefit for some test case I haven't tried yet. > >Peter
[C++ PATCH] pushdecl
This small patch replaces pushdecl_with_scope with a more-specific pushdecl_outermost_localscope. It's used in 2 places to inject an artifical decl into the outermost block scope of a function (__FUNCTION__ var and lambda capture proxies). This moves some binding-level handling into name-lookup, where it belongs. Applied to trunk. nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell * name-lookup.h (pushdecl_with_scope): Replace with ... (pushdecl_outermost_localscope): ... this. * name-lookup.c (pushdecl_with_scope): Replace with ... (pushdecl_outermost_localscope): ... this. (pushdecl_namespace_level): Adjust. * decl.c (cp_make_fname_decl): Use pushdecl_outermost_localscope. * lambda.c (insert_capture_proxy): Likewise. Index: decl.c === --- decl.c (revision 247751) +++ decl.c (working copy) @@ -4335,9 +4335,6 @@ cp_make_fname_decl (location_t loc, tree if (name) free (CONST_CAST (char *, name)); - /* As we're using pushdecl_with_scope, we must set the context. */ - DECL_CONTEXT (decl) = current_function_decl; - TREE_STATIC (decl) = 1; TREE_READONLY (decl) = 1; DECL_ARTIFICIAL (decl) = 1; @@ -4346,12 +4343,8 @@ cp_make_fname_decl (location_t loc, tree if (current_function_decl) { - cp_binding_level *b = current_binding_level; - if (b->kind == sk_function_parms) - return error_mark_node; - while (b->level_chain->kind != sk_function_parms) - b = b->level_chain; - pushdecl_with_scope (decl, b, /*is_friend=*/false); + DECL_CONTEXT (decl) = current_function_decl; + decl = pushdecl_outermost_localscope (decl); cp_finish_decl (decl, init, /*init_const_expr_p=*/false, NULL_TREE, LOOKUP_ONLYCONVERTING); } Index: lambda.c === --- lambda.c (revision 247751) +++ lambda.c (working copy) @@ -295,24 +295,13 @@ is_normal_capture_proxy (tree decl) void insert_capture_proxy (tree var) { - cp_binding_level *b; - tree stmt_list; - /* Put the capture proxy in the extra body block so that it won't clash with a later local variable. */ - b = current_binding_level; - for (;;) -{ - cp_binding_level *n = b->level_chain; - if (n->kind == sk_function_parms) - break; - b = n; -} - pushdecl_with_scope (var, b, false); + pushdecl_outermost_localscope (var); /* And put a DECL_EXPR in the STATEMENT_LIST for the same block. */ var = build_stmt (DECL_SOURCE_LOCATION (var), DECL_EXPR, var); - stmt_list = (*stmt_list_stack)[1]; + tree stmt_list = (*stmt_list_stack)[1]; gcc_assert (stmt_list); append_to_statement_list_force (var, &stmt_list); } Index: name-lookup.c === --- name-lookup.c (revision 247751) +++ name-lookup.c (working copy) @@ -2870,14 +2870,23 @@ pushdecl_with_scope_1 (tree x, cp_bindin return x; } -/* Wrapper for pushdecl_with_scope_1. */ +/* Inject X into the local scope just before the function parms. */ tree -pushdecl_with_scope (tree x, cp_binding_level *level, bool is_friend) +pushdecl_outermost_localscope (tree x) { - tree ret; bool subtime = timevar_cond_start (TV_NAME_LOOKUP); - ret = pushdecl_with_scope_1 (x, level, is_friend); + cp_binding_level *b = NULL, *n = current_binding_level; + + if (n->kind == sk_function_parms) +return error_mark_node; + do +{ + b = n; + n = b->level_chain; +} + while (n->kind != sk_function_parms); + tree ret = pushdecl_with_scope_1 (x, b, false); timevar_cond_stop (TV_NAME_LOOKUP, subtime); return ret; } @@ -4350,7 +4359,8 @@ pushdecl_namespace_level (tree x, bool i tree t; bool subtime = timevar_cond_start (TV_NAME_LOOKUP); - t = pushdecl_with_scope (x, NAMESPACE_LEVEL (current_namespace), is_friend); + t = pushdecl_with_scope_1 +(x, NAMESPACE_LEVEL (current_namespace), is_friend); /* Now, the type_shadowed stack may screw us. Munge it so it does what we want. */ Index: name-lookup.h === --- name-lookup.h (revision 247751) +++ name-lookup.h (working copy) @@ -300,6 +300,7 @@ extern tree push_inner_scope (tree); extern void pop_inner_scope (tree, tree); extern void push_binding_level (cp_binding_level *); +extern tree pushdecl_outermost_localscope (tree); extern bool push_namespace (tree); extern void pop_namespace (void); extern void push_nested_namespace (tree); @@ -307,7 +308,6 @@ extern void pop_nested_namespace (tree); extern bool handle_namespace_attrs (tree, tree); extern void pushlevel_class (void); extern void poplevel_class (void); -extern tree pushdecl_with_scope (tree, cp_binding_level *, bool); extern tree lookup_name_prefer_type (tree, int); extern tree lookup_name_real (tree, int, int, bool, int, int); extern tree lookup_type_scope (tree, tag_scope);
[C++ PATCH] function decl pushing
This patch implements a couple of changes. 1) set DECL_ANTICIPATED before pushing an anticipated builtin. My overload management changes will keep the overload list ordered, so needs to know this up front. 2) pushdecl has some funky code to detect when we're trying to push a function decl into its own scope. And push into its containing scope in that case. But we only need that subterfuge because start_preparsed_function sets current_function_decl, before pushing the decl itself. Reordering the order of events there will make pushdecl's weirdness unnecessary. nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell * decl.c (builtin_function_1): Set DCL_ANTICIPATED before pushing. (start_preparsed_function): Do decl pushing before setting current_funciton_decl and announcing it. Index: decl.c === --- decl.c (revision 247752) +++ decl.c (working copy) @@ -4375,11 +4375,6 @@ builtin_function_1 (tree decl, tree cont DECL_CONTEXT (decl) = context; - if (is_global) -pushdecl_top_level (decl); - else -pushdecl (decl); - /* A function in the user's namespace should have an explicit declaration before it is used. Mark the built-in function as anticipated but not actually declared. */ @@ -4397,6 +4392,11 @@ builtin_function_1 (tree decl, tree cont DECL_ANTICIPATED (decl) = 1; } + if (is_global) +pushdecl_top_level (decl); + else +pushdecl (decl); + return decl; } @@ -14821,17 +14821,10 @@ start_preparsed_function (tree decl1, tr decl1 = newdecl1; } - /* We are now in the scope of the function being defined. */ - current_function_decl = decl1; - - /* Save the parm names or decls from this function's declarator - where store_parm_decls will find them. */ - current_function_parms = DECL_ARGUMENTS (decl1); - /* Make sure the parameter and return types are reasonable. When you declare a function, these types can be incomplete, but they must be complete when you define the function. */ - check_function_type (decl1, current_function_parms); + check_function_type (decl1, DECL_ARGUMENTS (decl1)); /* Build the return declaration for the function. */ restype = TREE_TYPE (fntype); @@ -14848,9 +14841,6 @@ start_preparsed_function (tree decl1, tr cp_apply_type_quals_to_decl (cp_type_quals (restype), resdecl); } - /* Let the user know we're compiling this function. */ - announce_function (decl1); - /* Record the decl so that the function name is defined. If we already have a decl for this name, and it is a FUNCTION_DECL, use the old decl. */ @@ -14922,9 +14912,16 @@ start_preparsed_function (tree decl1, tr maybe_apply_pragma_weak (decl1); } - /* Reset this in case the call to pushdecl changed it. */ + /* We are now in the scope of the function being defined. */ current_function_decl = decl1; + /* Save the parm names or decls from this function's declarator + where store_parm_decls will find them. */ + current_function_parms = DECL_ARGUMENTS (decl1); + + /* Let the user know we're compiling this function. */ + announce_function (decl1); + gcc_assert (DECL_INITIAL (decl1)); /* This function may already have been parsed, in which case just
Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch
On 05/08/2017 12:44 PM, Richard Biener wrote: On Wed, 26 Apr 2017, Peter Bergner wrote: One difference from the last patch is that I am no longer setting default_label to NULL when we emit a decision tree. I noticed that the decision tree code seemed to generate slightly better code for some of my unit tests if I left it alone. This simplified the patch somewhat by removing the changes to emit_case_nodes(). [snip] Can you do the gimple_unreachable_bb_p check earlier in expand_case so it covers the emit_case_decision_tree path as well (and verify that works, of course)? So basically right at /* Find the default case target label. */ default_label = jump_target_rtx (CASE_LABEL (gimple_switch_default_label (stmt))); edge default_edge = EDGE_SUCC (bb, 0); int default_prob = default_edge->probability; handle this case. That is what the previous patch did, but as I mention above, we generate slightly better code for some test cases (other tests seemed to generate the same code) if we don't attempt to handle the decision tree case. I'll note that the current unpatched compiler already knows how to remove unreachable case statement blocks when we expand to a decision tree. I can add that code back if you think that it will have a positive benefit for some test case I haven't tried yet. Any comment on the above? Peter
C++ PATCH for c++/80178 (ABI of non-copyable class)
Bug 80178 points out that we are doing bitwise copy when passing a class with only deleted copy/move constructors; we should use invisible reference instead. The second patch updates the trunk to use -fabi-version=12 by default for GCC 8. Tested x86_64-pc-linux-gnu, applying to trunk. commit f34d5751de4b94419ead04c5ab9be0dde4d21953 Author: Jason Merrill Date: Mon May 8 14:24:03 2017 -0400 PR c++/80178 - parameter passing for uncopyable classes * tree.c (type_has_nontrivial_copy_init): True for classes with only deleted copy/move ctors. (remember_deleted_copy, maybe_warn_parm_abi): New. * decl.c (require_complete_types_for_parms, check_function_type): Call maybe_warn_parm_abi. * call.c (convert_for_arg_passing, build_cxx_call): Likewise. diff --git a/gcc/common.opt b/gcc/common.opt index a5c3aea..1330555 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -910,6 +910,10 @@ Driver Undocumented ; and introduces new inheriting constructor handling. ; Default in G++ 7. ; +; 12: Corrects the calling convention for classes with only deleted copy/move +; constructors. +; Default in G++ 8. +; ; Additional positive integers will be assigned as new versions of ; the ABI become the default version of the ABI. fabi-version= diff --git a/gcc/cp/call.c b/gcc/cp/call.c index f1e431c..d9accd1 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -7352,17 +7352,21 @@ convert_for_arg_passing (tree type, tree val, tsubst_flags_t complain) && COMPLETE_TYPE_P (type) && tree_int_cst_lt (TYPE_SIZE (type), TYPE_SIZE (integer_type_node))) val = cp_perform_integral_promotions (val, complain); - if ((complain & tf_warning) - && warn_suggest_attribute_format) + if (complain & tf_warning) { - tree rhstype = TREE_TYPE (val); - const enum tree_code coder = TREE_CODE (rhstype); - const enum tree_code codel = TREE_CODE (type); - if ((codel == POINTER_TYPE || codel == REFERENCE_TYPE) - && coder == codel - && check_missing_format_attribute (type, rhstype)) - warning (OPT_Wsuggest_attribute_format, -"argument of function call might be a candidate for a format attribute"); + if (warn_suggest_attribute_format) + { + tree rhstype = TREE_TYPE (val); + const enum tree_code coder = TREE_CODE (rhstype); + const enum tree_code codel = TREE_CODE (type); + if ((codel == POINTER_TYPE || codel == REFERENCE_TYPE) + && coder == codel + && check_missing_format_attribute (type, rhstype)) + warning (OPT_Wsuggest_attribute_format, +"argument of function call might be a candidate " +"for a format attribute"); + } + maybe_warn_parm_abi (type, EXPR_LOC_OR_LOC (val, input_location)); } return val; } @@ -8234,7 +8238,10 @@ build_cxx_call (tree fn, int nargs, tree *argarray, return error_mark_node; if (MAYBE_CLASS_TYPE_P (TREE_TYPE (fn))) - fn = build_cplus_new (TREE_TYPE (fn), fn, complain); + { + fn = build_cplus_new (TREE_TYPE (fn), fn, complain); + maybe_warn_parm_abi (TREE_TYPE (fn), loc); + } } return convert_from_reference (fn); } diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index d2d48e7..b64fa6d 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6614,6 +6614,7 @@ extern bool type_has_unique_obj_representations (const_tree); extern bool scalarish_type_p (const_tree); extern bool type_has_nontrivial_default_init (const_tree); extern bool type_has_nontrivial_copy_init (const_tree); +extern void maybe_warn_parm_abi(tree, location_t); extern bool class_tmpl_impl_spec_p (const_tree); extern int zero_init_p (const_tree); extern bool check_abi_tag_redeclaration(const_tree, const_tree, const_tree); diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 2b20bf9..f21058d 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -12394,6 +12394,9 @@ require_complete_types_for_parms (tree parms) { relayout_decl (parms); DECL_ARG_TYPE (parms) = type_passed_as (TREE_TYPE (parms)); + + maybe_warn_parm_abi (TREE_TYPE (parms), + DECL_SOURCE_LOCATION (parms)); } else /* grokparms or complete_type_or_else will have already issued @@ -14646,7 +14649,11 @@ check_function_type (tree decl, tree current_function_parms) TREE_TYPE (decl) = fntype; } else -abstract_virtuals_error (decl, TREE_TYPE (fntype)); +{ + abstract_virtuals_error (decl, TREE_TYPE (fntype)); + maybe_warn_parm_abi (TREE_TYPE (fntype), + DECL_SOURCE_LOCATION (decl)); +} } /* True iff FN is an implicitly-defined default constructor. */ diff --git a/gcc/cp/tr
[PATCH] decl lang hooks
cp/name-lookup has a twisty maze of forwarding functions. One reason is that a couple of the names are used directly by c-common.c. However, we now have lang hooks for these things. This patch changes the C++ FE to override the pushdecl and getdecl lang hooks. In addition to simply overriding them there, I had to fixup a couple of places in c-family/c-common.c and objc/objc-gnu-runtime-abi-01.c to use the pushdecl hook. This patch doesn't reduce the twisty maze, but will enable me to do so shortly. ok? nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell gcc/c/ * c-tree.h (pushdecl): Declare. gcc/cp/ * cp-lang.c (get_global_decls, cxx_pushdecl): New. (LANG_HOOKS_GETDECLS, LANG_HOOKS_PUSHDECL): Override. * name-lookup.h (pushdecl_top_level): Declare. gcc/c-family/ * c-common.c (c_register_builtin_type): Use pushdecl lang_hook. * c-common.h (pushdecl_top_level, pushdecl): Don't declare here. gcc/objc/ * objc-gnu-runtime-abi-01.c (objc_add_static_instance): Use pushdecl lang_hook. Index: c/c-tree.h === --- c/c-tree.h (revision 247751) +++ c/c-tree.h (working copy) @@ -506,6 +506,7 @@ extern tree c_break_label; extern tree c_cont_label; extern bool global_bindings_p (void); +extern tree pushdecl (tree); extern void push_scope (void); extern tree pop_scope (void); extern void c_bindings_start_stmt_expr (struct c_spot_bindings *); Index: c-family/c-common.c === --- c-family/c-common.c (revision 247751) +++ c-family/c-common.c (working copy) @@ -2589,7 +2589,7 @@ c_register_builtin_type (tree type, cons DECL_ARTIFICIAL (decl) = 1; if (!TYPE_NAME (type)) TYPE_NAME (type) = decl; - pushdecl (decl); + lang_hooks.decls.pushdecl (decl); registered_builtin_types = tree_cons (0, type, registered_builtin_types); } Index: c-family/c-common.h === --- c-family/c-common.h (revision 247751) +++ c-family/c-common.h (working copy) @@ -588,8 +588,7 @@ extern tree push_stmt_list (void); extern tree pop_stmt_list (tree); extern tree add_stmt (tree); extern void push_cleanup (tree, tree, bool); -extern tree pushdecl_top_level (tree); -extern tree pushdecl (tree); + extern tree build_modify_expr (location_t, tree, tree, enum tree_code, location_t, tree, tree); extern tree build_array_notation_expr (location_t, tree, tree, enum tree_code, Index: cp/cp-lang.c === --- cp/cp-lang.c (revision 247751) +++ cp/cp-lang.c (working copy) @@ -35,6 +35,8 @@ static tree cp_eh_personality (void); static tree get_template_innermost_arguments_folded (const_tree); static tree get_template_argument_pack_elems_folded (const_tree); static tree cxx_enum_underlying_base_type (const_tree); +static tree get_global_decls (); +static tree cxx_pushdecl (tree); /* Lang hooks common to C++ and ObjC++ are declared in cp/cp-objcp-common.h; consequently, there should be very few hooks below. */ @@ -78,6 +80,10 @@ static tree cxx_enum_underlying_base_typ #define LANG_HOOKS_EH_RUNTIME_TYPE build_eh_type_type #undef LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE #define LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE cxx_enum_underlying_base_type +#undef LANG_HOOKS_GETDECLS +#define LANG_HOOKS_GETDECLS get_global_decls +#undef LANG_HOOKS_PUSHDECL +#define LANG_HOOKS_PUSHDECL cxx_pushdecl /* Each front end provides its own lang hook initializer. */ struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; @@ -229,5 +235,21 @@ tree cxx_enum_underlying_base_type (cons return underlying_type; } +/* Return the list of decls in the global namespace. */ + +static +tree get_global_decls () +{ + return NAMESPACE_LEVEL (global_namespace)->names; +} + +/* Push DECL into the current scope. */ + +static +tree cxx_pushdecl (tree decl) +{ + return pushdecl (decl); +} + #include "gt-cp-cp-lang.h" #include "gtype-cp.h" Index: cp/name-lookup.h === --- cp/name-lookup.h (revision 247752) +++ cp/name-lookup.h (working copy) @@ -342,4 +342,6 @@ extern tree innermost_non_namespace_valu extern cxx_binding *outer_binding (tree, cxx_binding *, bool); extern void cp_emit_debug_info_for_using (tree, tree); +extern tree pushdecl_top_level (tree); + #endif /* GCC_CP_NAME_LOOKUP_H */ Index: objc/objc-gnu-runtime-abi-01.c === --- objc/objc-gnu-runtime-abi-01.c (revision 247751) +++ objc/objc-gnu-runtime-abi-01.c (working copy) @@ -888,7 +888,7 @@ objc_add_static_instance (tree construct /* We may be writing something else just now. Postpone till end of input. */ DECL_DEFER_OUTPUT (decl) = 1; - pushdecl_top_level (decl); + lang_hooks.decls.pushdecl (decl); rest_of_decl_compilation (decl, 1, 0); /* Add the DECL to the
Re: Use "oacc kernels" attribute for OpenACC kernels
Hi! On Thu, 4 Aug 2016 16:07:00 +0200, I wrote: > Ping. > > On Wed, 27 Jul 2016 12:06:59 +0200, I wrote: > > On Mon, 25 Jan 2016 16:09:14 +0100, Jakub Jelinek wrote: > > > On Mon, Jan 25, 2016 at 10:06:50AM -0500, Nathan Sidwell wrote: > > > > On 01/04/16 10:39, Nathan Sidwell wrote: > > > > >There's currently no robust predicate to determine whether an oacc > > > > >offload > > > > >function is for a kernels region (as opposed to a parallel region). > > > > >[...] > > > > > > > > > >This patch marks TREE_PUBLIC on the offload attribute values, to note > > > > >kernels > > > > >regions, and adds a predicate to check that. [...] > > > > > > > > > >Using these predicates improves the dump output of the openacc device > > > > >lowering > > > > >pass too. > > > > I just submitted a patch adding "Test cases to check OpenACC offloaded > > function's attributes and classification", (Pinged in <877f1r1duw.fsf@hertz.schwinge.homeip.net">http://mid.mail-archive.com/877f1r1duw.fsf@hertz.schwinge.homeip.net>.) > > to actually check the dump output of "oaccdevlow" -- it works. ;-) > > > > > > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00092.html > > > > ping? > > > > > > Ok, thanks. > > > > It's conceptually and code-wise simpler to just use a "oacc kernels" > > attribute for that. (And, that will make another patch I'm working on > > less convoluted.) > > > > I'm open to suggestions if there is a better place to set the "oacc > > kernels" attribute -- I put it into expand_omp_target, where another > > special thing for GF_OMP_TARGET_KIND_OACC_KERNELS is already being done, > > and before "rewriting" GF_OMP_TARGET_KIND_OACC_KERNELS (and > > GF_OMP_TARGET_KIND_OACC_PARALLEL) into BUILT_IN_GOACC_PARALLEL. My > > reasoning for not setting the attribute earlier (like, in the front > > ends), is that at that point in/before expand_omp_target, we still have > > the distrinction between OACC_PARALLEL/OACC_KERNELS (tree codes), and > > later GF_OMP_TARGET_KIND_OACC_PARALLEL/GF_OMP_TARGET_KIND_OACC_KERNELS > > (GIMPLE_OMP_TARGET subcodes). Another question/possibly cleanup of > > course might be to actually do set the "oacc kernels" attribute in the > > front end and merge OACC_KERNELS into OACC_PARALLEL, and > > GF_OMP_TARGET_KIND_OACC_KERNELS into GF_OMP_TARGET_KIND_OACC_PARALLEL? > > > > But anyway, as a first step: OK for trunk? commit fac5c3214f58812881635d3fb1e1751446d4b660 Author: Thomas Schwinge Date: Mon May 8 21:24:46 2017 +0200 Use "oacc kernels" attribute for OpenACC kernels gcc/ * omp-expand.c (expand_omp_target) : Set "oacc kernels" attribute. * omp-general.c (oacc_set_fn_attrib): Remove is_kernel formal parameter. Adjust all users. (oacc_fn_attrib_kernels_p): Remove function. (execute_oacc_device_lower): Look for "oacc kernels" attribute instead of calling oacc_fn_attrib_kernels_p. * tree-ssa-loop.c (gate_oacc_kernels): Likewise. * tree-parloops.c (create_parallel_loop): If oacc_kernels_p, assert "oacc kernels" attribute is set. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. --- gcc/omp-expand.c | 16 +- gcc/omp-general.c | 18 ++-- gcc/omp-general.h | 4 +--- gcc/omp-offload.c | 25 +++--- .../goacc/classify-kernels-unparallelized.c| 8 +++ .../c-c++-common/goacc/classify-kernels.c | 8 +++ .../c-c++-common/goacc/classify-parallel.c | 2 +- .../c-c++-common/goacc/classify-routine.c | 2 +- .../goacc/classify-kernels-unparallelized.f95 | 8 +++ .../gfortran.dg/goacc/classify-kernels.f95 | 8 +++ .../gfortran.dg/goacc/classify-parallel.f95| 2 +- .../gfortran.dg/goacc/classify-routine.f95 | 2 +- gcc/tree-parloops.c| 5 - gcc/tree-ssa-loop.c| 5 + 14 files changed, 52 insertions(+), 61 deletions(-) diff --git gcc/omp-expand.c gcc/omp-expand.c index 5c48b78..405c60e 100644 --- gcc/omp-expand.c +++ gcc/omp-expand.c @@ -7083,7 +7083,16 @@ expand_omp_target (struct omp_region *region) exit_bb = region->exit; if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS) -mark_loops_in_oac
Re: [patch, fortran] Reduce stack use in blocked matmul
Am 08.05.2017 um 18:58 schrieb Jerry DeLisle: he attached patch reduces the stack usage by the blocked version of matmul for cases where we don't need the full buffer. This should improve stack usage. OK for trunk? OK, thanks. Is this something we should consider for backporting to gcc-7? I think the large block size caused https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79876 . That one was fixed by adjusting the ridiculously low stack size for multi-threaded applications on OSX, but the underlying problem could still bite somewhere or somebody else. Opinions? Regards Thomas
Re: [PATCH 1/7] enhance -Wformat to detect quoting problems (PR 80280 et al.)
On 05/05/2017 11:49 AM, Joseph Myers wrote: On Thu, 4 May 2017, Martin Sebor wrote: I like the flags2 idea. I split up the initialization array to also detect quoted %K, and unquoted %R and %r. With that I ran into test failures that took me a bit to debug. It turns out that there's code (a nasty hack, really) that makes assumptions about some of the conversion specifiers. I dealt with the failures by simplifying the initialization code and removing the hack. This patch is OK. Better testing exposed a couple of issues with the patch: 1) The init_dynamic_diag_info function depends on the order in which the identifiers it looks (e.g. tree_node) for are declared in the translation unit being compiled and in some instances wound up initializing the local_tree_type_node with void_type_node even though tree would subsequently be found to be declared. 2) The warning on %R/%r when used outside a quoted sequence was overfly restrictive. There is code in cp/error.c that relies on that when printing the inlining context of a function. I fixed both of these problems in the attached revision. I've also been running a script to build the targets in config-list.mk. So far it's built just over 40 of the 199 targets and found one quoting problem in a build of sparcv9-solaris2.11. The build fails due to an unrelated error (I opened bug 80673 for it), so I can't say I've found anything that should block the patch from checking in. I'll keep running the script but unless you or someone else would prefer I hold off until they all finish I'll go ahead and commit the attached patch, without waiting for the build of the remaining targets to complete, probably sometime in the second half of the week. Martin PR translation/80280 - Missing closing quote (%>) c/semantics.c and c/c-typeck.c gcc/c-family/ChangeLog: PR translation/80280 * c-format.h (struct format_flag_spec): Add new member. (T89_T): New macro. * c-format.c (local_tree_type_node): New global. (printf_flag_specs, asm_fprintf_flag_spec): Initialize new data. (gcc_diag_flag_specs, scanf_flag_specs, strftime_flag_specs): Ditto. (strfmon_flag_specs): Likewise. (gcc_diag_char_table, gcc_cdiag_char_table): Split up specifiers with distinct quoting properties. (gcc_tdiag_char_table, gcc_cxxdiag_char_table): Same. (flag_chars_t::validate): Add argument and handle bad quoting. (check_format_info_main): Handle quoting problems. (init_dynamic_diag_info): Simplify. gcc/testsuite/ChangeLog: PR translation/80280 * gcc.dg/format/gcc_diag-10.c: New test. diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c index 400eb66..2dba062 100644 --- a/gcc/c-family/c-format.c +++ b/gcc/c-family/c-format.c @@ -53,6 +53,9 @@ struct function_format_info unsigned HOST_WIDE_INT first_arg_num; /* number of first arg (zero for varargs) */ }; +/* Initialized in init_dynamic_diag_info. */ +static tree local_tree_type_node; + static bool decode_format_attr (tree, function_format_info *, int); static int decode_format_type (const char *); @@ -492,17 +495,17 @@ static const format_length_info gcc_gfc_length_specs[] = static const format_flag_spec printf_flag_specs[] = { - { ' ', 0, 0, N_("' ' flag"),N_("the ' ' printf flag"), STD_C89 }, - { '+', 0, 0, N_("'+' flag"),N_("the '+' printf flag"), STD_C89 }, - { '#', 0, 0, N_("'#' flag"),N_("the '#' printf flag"), STD_C89 }, - { '0', 0, 0, N_("'0' flag"),N_("the '0' printf flag"), STD_C89 }, - { '-', 0, 0, N_("'-' flag"),N_("the '-' printf flag"), STD_C89 }, - { '\'', 0, 0, N_("''' flag"),N_("the ''' printf flag"), STD_EXT }, - { 'I', 0, 0, N_("'I' flag"),N_("the 'I' printf flag"), STD_EXT }, - { 'w', 0, 0, N_("field width"), N_("field width in printf format"), STD_C89 }, - { 'p', 0, 0, N_("precision"), N_("precision in printf format"), STD_C89 }, - { 'L', 0, 0, N_("length modifier"), N_("length modifier in printf format"), STD_C89 }, - { 0, 0, 0, NULL, NULL, STD_C89 } + { ' ', 0, 0, 0, N_("' ' flag"),N_("the ' ' printf flag"), STD_C89 }, + { '+', 0, 0, 0, N_("'+' flag"),N_("the '+' printf flag"), STD_C89 }, + { '#', 0, 0, 0, N_("'#' flag"),N_("the '#' printf flag"), STD_C89 }, + { '0', 0, 0, 0, N_("'0' flag"),N_("the '0' printf flag"), STD_C89 }, + { '-', 0, 0, 0, N_("'-' flag"),N_("the '-' printf flag"), STD_C89 }, + { '\'', 0, 0, 0, N_("''' flag"),N_("the ''' printf flag"), STD_EXT }, + { 'I', 0, 0, 0, N_("'I' flag"),N_("the 'I' printf flag"), STD_EXT }, + { 'w', 0, 0, 0, N_("field width"), N_("field width in printf format"), STD_C89 }, + { 'p', 0, 0, 0, N_("precision"), N_("precision in printf format"), STD_C89 },
Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues
On 05/06/2017 03:22 PM, Daniel Santos wrote: gcc-5.4.0 CFLAGS="-march=native -O2 -g": 74 gcc-7.1.0 CFLAGS="-march=native -O2 -g": 74 gcc-7.1.0 CFLAGS="-march=nocona -mtune=generic -O2 -g": 79 gcc-7.1.0 CFLAGS="-march=native -O2 -g -mcall-ms2sysv-xlogues" (patched): 31 I'm building out a clean test environment on another machine to try to rule out clutter issues (and video driver issues) on my workstation. Daniel I've re-run Wine's tests with a new clean VM environment and some changes to include more tests and similar results: Compiler Failures gcc-4.9.4: 39 gcc-7.1.0: 78 gcc-7.1.0-patched (with -mcall-ms2sysv-xlogues): 40 The first error not present in the gcc-4.9.4 tests that I examined looked like a run-of-the-mill race condition in Wine that just happened to not crash when built with 4.9.4. So I'm going to guess that the disappearance of these failures with -mcall-ms2sysv-xlogues is just incidental. I think we're in good condition with this patch set. Daniel
Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading
On 05/08/2017 09:54 AM, Jeff Law wrote: So I'm not sure if changing VRP with your patches is a good thing when you could have used the new API in the first place ... I don't see that the changes to date around 78496 change things significantly in regards to the immediate plans to remove ASSERT_EXPR. The work around 78496 raises the bar in terms of what information we need to be able to extract, but that IMHO, is a fine thing to do ;-) As expected, it's pretty easy to change to the newer way of doing things. ie, extracting from the GIMPLE_COND rather than the ASSERT_EXPR. It really isn't a big deal. However, the existence of register_edge_assert_for does change how I'm looking at the next issue for 78496 as well as how to tackle a host of related issues. It may end up being the case that we stop 78496 work after patch #2, work on ASSERT_EXPR removal, then re-eval 78496. And as expected the unwindable VRs do help significantly in the 3rd hunk of the work for 78496. Probably the biggest issue here is how to compose the bits to avoid code duplication between EVRP and the VRP jump threading. Jeff
Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch
On 05/08/2017 01:20 PM, Peter Bergner wrote: > That is what the previous patch did, but as I mention above, > we generate slightly better code for some test cases (other > tests seemed to generate the same code) if we don't attempt > to handle the decision tree case. I'll note that the current > unpatched compiler already knows how to remove unreachable > case statement blocks when we expand to a decision tree. I should be more careful with my description here. The patch does affect both unreachable case statements for both decision trees as well as jump tables, and that leads to improved code for both decision trees as well as jump tables. What I meant to say above, is that the current handling of unreachable default case statements in the unpatched compiler seems to lead to slightly better code for some test cases than attempting to handle default_label == NULL in the decision tree code. It was for that reason, I placed the code in expand_case() only in the jump table path. The fact it made the patch smaller was a bonus, since I didn't need to protect emit_case_nodes() from a NULL default_label. As I said, if you think it will help some test case I haven't tried yet, I can add that support back. Peter
[committed] correct %D quoting in gcc/config/sol2-c.c
In preparation for committing the meat of the patch for bug 80280 I've committed the following in r247758 as an obvious fix to prevent the new -Wformat warning from breaking Solaris bootstrap. Bootstrap is currently broken due to 80673 (that looks like a separate issue from the warning). Martin Index: gcc/config/sol2-c.c === --- gcc/config/sol2-c.c (revision 247757) +++ gcc/config/sol2-c.c (working copy) @@ -113,7 +113,7 @@ solaris_pragma_align (cpp_reader *pfile ATTRIBUTE_ tree decl = identifier_global_value (t); if (decl && DECL_P (decl)) warning (0, "%<#pragma align%> must appear before the declaration of " -"%D, ignoring", decl); +"%qD, ignoring", decl); else solaris_pending_aligns = tree_cons (t, build_tree_list (NULL, x), solaris_pending_aligns);
Re: [patch, fortran] Reduce stack use in blocked matmul
On 05/08/2017 12:29 PM, Thomas Koenig wrote: > Am 08.05.2017 um 18:58 schrieb Jerry DeLisle: > > he attached patch reduces the stack usage by the blocked >>> version of matmul for cases where we don't need the full buffer. >>> This should improve stack usage. >>> >>> OK for trunk? >>> >> >> OK, thanks. > > Is this something we should consider for backporting to gcc-7? > I think the large block size caused > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79876 . > > That one was fixed by adjusting the ridiculously low > stack size for multi-threaded applications on OSX, but > the underlying problem could still bite somewhere or > somebody else. > > Opinions? > I think it should be back ported since we really changed the matmul and this was a fallout and using a to much stack might end up being a regression. OK to back port. Jerry
[Patch, Fortran, OOP] PR 79311: ICE in generate_finalization_wrapper, at fortran/class.c:1992
Hi all, the attached patch fixes an ICE-on-valid problem with finalization by making sure that the finalization procedures are properly resolved. In the test case, the finalizer of the component type was not being resolved if the superordinate type had a finalizer itself. The patch also fixes a small error that had no actual impact on the test case ('has_final2' could never become true). Regtesting went well, except for these three failure which also seems to occur on a clean trunk currently: FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original FAIL: gfortran.dg/coarray_lock_7.f90 -O scan-tree-dump-times original FAIL: gfortran.dg/mvbits_7.f90 -O0 (test for warnings, line 28) Ok for trunk? Cheers, Janus 2017-05-08 Janus Weil PR fortran/79311 * resolve.c (gfc_resolve_finalizers): Ensure that derived-type components have a their finalizers resolved, also if the superordinate type itself has a finalizer. 2017-05-08 Janus Weil PR fortran/79311 * gfortran.dg/finalize_32.f90: New test. Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 247757) +++ gcc/fortran/resolve.c (working copy) @@ -12385,26 +12385,23 @@ gfc_resolve_finalizers (gfc_symbol* derived, bool if (parent) gfc_resolve_finalizers (parent, finalizable); - /* Return early when not finalizable. Additionally, ensure that derived-type - components have a their finalizables resolved. */ - if (!derived->f2k_derived || !derived->f2k_derived->finalizers) + /* Ensure that derived-type components have a their finalizers resolved. */ + bool has_final = derived->f2k_derived && derived->f2k_derived->finalizers; + for (c = derived->components; c; c = c->next) +if (c->ts.type == BT_DERIVED + && !c->attr.pointer && !c->attr.proc_pointer && !c->attr.allocatable) + { + bool has_final2 = false; + if (!gfc_resolve_finalizers (c->ts.u.derived, &has_final2)) + return false; /* Error. */ + has_final = has_final || has_final2; + } + /* Return early if not finalizable. */ + if (!has_final) { - bool has_final = false; - for (c = derived->components; c; c = c->next) - if (c->ts.type == BT_DERIVED - && !c->attr.pointer && !c->attr.proc_pointer && !c->attr.allocatable) - { - bool has_final2 = false; - if (!gfc_resolve_finalizers (c->ts.u.derived, &has_final)) - return false; /* Error. */ - has_final = has_final || has_final2; - } - if (!has_final) - { - if (finalizable) - *finalizable = false; - return true; - } + if (finalizable) + *finalizable = false; + return true; } /* Walk over the list of finalizer-procedures, check them, and if any one ! { dg-do compile } ! ! PR 79311: [OOP] ICE in generate_finalization_wrapper, at fortran/class.c:1992 ! ! Contributed by DIL module tensor_recursive implicit none type :: tens_signature_t contains final :: tens_signature_dtor end type type :: tens_header_t type(tens_signature_t) :: signature contains final :: tens_header_dtor end type contains subroutine tens_signature_dtor(this) type(tens_signature_t) :: this end subroutine subroutine tens_header_dtor(this) type(tens_header_t) :: this end subroutine end
Re: [PATCH] decl lang hooks
On Mon, 8 May 2017, Nathan Sidwell wrote: > cp/name-lookup has a twisty maze of forwarding functions. One reason is that > a couple of the names are used directly by c-common.c. However, we now have > lang hooks for these things. > > This patch changes the C++ FE to override the pushdecl and getdecl lang hooks. > In addition to simply overriding them there, I had to fixup a couple of places > in c-family/c-common.c and objc/objc-gnu-runtime-abi-01.c to use the pushdecl > hook. The c/ and c-family/ changes are OK. -- Joseph S. Myers jos...@codesourcery.com
[PATCH] -fdump option docs
I've committed this to sort the -fdump options. I left the generic -fdump-noaddr, -fdump-unnumbered and -fdump-unnumbered-links at the start. nathan -- Nathan Sidwell 2017-05-08 Nathan Sidwell * doc/invoke.texi: Alphabetize -fdump options. Index: doc/invoke.texi === --- doc/invoke.texi (revision 247758) +++ doc/invoke.texi (working copy) @@ -540,13 +540,13 @@ Objective-C and Objective-C++ Dialects}. -fdisable-tree-@var{pass_name} @gol -fdisable-tree-@var{pass-name}=@var{range-list} @gol -fdump-noaddr -fdump-unnumbered -fdump-unnumbered-links @gol --fdump-translation-unit@r{[}-@var{n}@r{]} @gol -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol +-fdump-final-insns@r{[}=@var{file}@r{]} -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol -fdump-passes @gol -fdump-rtl-@var{pass} -fdump-rtl-@var{pass}=@var{filename} @gol -fdump-statistics @gol --fdump-final-insns@r{[}=@var{file}@r{]} +-fdump-translation-unit@r{[}-@var{n}@r{]} @gol -fdump-tree-all @gol -fdump-tree-@var{switch} @gol -fdump-tree-@var{switch}-@var{options} @gol @@ -12938,16 +12938,6 @@ When doing debugging dumps (see @option{ instruction numbers for the links to the previous and next instructions in a sequence. -@item -fdump-translation-unit @r{(C++ only)} -@itemx -fdump-translation-unit-@var{options} @r{(C++ only)} -@opindex fdump-translation-unit -Dump a representation of the tree structure for the entire translation -unit to a file. The file name is made by appending @file{.tu} to the -source file name, and the file is created in the same directory as the -output file. If the @samp{-@var{options}} form is used, @var{options} -controls the details of the dump as described for the -@option{-fdump-tree} options. - @item -fdump-class-hierarchy @r{(C++ only)} @itemx -fdump-class-hierarchy-@var{options} @r{(C++ only)} @opindex fdump-class-hierarchy @@ -12995,6 +12985,16 @@ whole compilation unit while @samp{-deta the passes generate them. The default with no option is to sum counters for each function compiled. +@item -fdump-translation-unit @r{(C++ only)} +@itemx -fdump-translation-unit-@var{options} @r{(C++ only)} +@opindex fdump-translation-unit +Dump a representation of the tree structure for the entire translation +unit to a file. The file name is made by appending @file{.tu} to the +source file name, and the file is created in the same directory as the +output file. If the @samp{-@var{options}} form is used, @var{options} +controls the details of the dump as described for the +@option{-fdump-tree} options. + @item -fdump-tree-all @itemx -fdump-tree-@var{switch} @itemx -fdump-tree-@var{switch}-@var{options}
[PATCH] xtensa: add support for SSP
gcc/ 2017-03-27 Max Filippov * config/xtensa/xtensa-protos.h (xtensa_initial_elimination_offset): New declaration. * config/xtensa/xtensa.c (xtensa_initial_elimination_offset): New function. Move its body from the INITIAL_ELIMINATION_OFFSET macro definition, add case for FRAME_POINTER_REGNUM when FRAME_GROWS_DOWNWARD. * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro definition. (INITIAL_ELIMINATION_OFFSET): Replace body with call to xtensa_initial_elimination_offset. --- gcc/config/xtensa/xtensa-protos.h | 1 + gcc/config/xtensa/xtensa.c| 24 gcc/config/xtensa/xtensa.h| 19 --- 3 files changed, 29 insertions(+), 15 deletions(-) diff --git a/gcc/config/xtensa/xtensa-protos.h b/gcc/config/xtensa/xtensa-protos.h index 873557f..ef9417c 100644 --- a/gcc/config/xtensa/xtensa-protos.h +++ b/gcc/config/xtensa/xtensa-protos.h @@ -73,5 +73,6 @@ extern void xtensa_expand_prologue (void); extern void xtensa_expand_epilogue (void); extern void order_regs_for_local_alloc (void); extern enum reg_class xtensa_regno_to_class (int regno); +extern HOST_WIDE_INT xtensa_initial_elimination_offset (int from, int to); #endif /* !__XTENSA_PROTOS_H__ */ diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c index 25ed7db..2578716 100644 --- a/gcc/config/xtensa/xtensa.c +++ b/gcc/config/xtensa/xtensa.c @@ -2676,6 +2676,30 @@ xtensa_frame_pointer_required (void) return false; } +HOST_WIDE_INT +xtensa_initial_elimination_offset (int from, int to) +{ + long frame_size = compute_frame_size (get_frame_size ()); + HOST_WIDE_INT offset; + + switch (from) +{ +case FRAME_POINTER_REGNUM: + if (FRAME_GROWS_DOWNWARD) + offset = frame_size - (WINDOW_SIZE * UNITS_PER_WORD) + - cfun->machine->callee_save_size; + else + offset = 0; + break; +case ARG_POINTER_REGNUM: + offset = frame_size; + break; +default: + gcc_unreachable (); +} + + return offset; +} /* minimum frame = reg save area (4 words) plus static chain (1 word) and the total number of words must be a multiple of 128 bits. */ diff --git a/gcc/config/xtensa/xtensa.h b/gcc/config/xtensa/xtensa.h index 58eb1b2..68ee96a 100644 --- a/gcc/config/xtensa/xtensa.h +++ b/gcc/config/xtensa/xtensa.h @@ -460,9 +460,11 @@ enum reg_class #define STACK_GROWS_DOWNWARD 1 +#define FRAME_GROWS_DOWNWARD flag_stack_protect + /* Offset within stack frame to start allocating local variables at. */ #define STARTING_FRAME_OFFSET \ - crtl->outgoing_args_size + (FRAME_GROWS_DOWNWARD ? 0 : crtl->outgoing_args_size) /* The ARG_POINTER and FRAME_POINTER are not real Xtensa registers, so they are eliminated to either the stack pointer or hard frame pointer. */ @@ -474,20 +476,7 @@ enum reg_class /* Specify the initial difference between the specified pair of registers. */ #define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET) \ - do { \ -long frame_size = compute_frame_size (get_frame_size ()); \ -switch (FROM) \ - { \ - case FRAME_POINTER_REGNUM: \ -(OFFSET) = 0; \ - break; \ - case ARG_POINTER_REGNUM: \ -(OFFSET) = frame_size; \ - break; \ - default: \ - gcc_unreachable (); \ - } \ - } while (0) + (OFFSET) = xtensa_initial_elimination_offset ((FROM), (TO)) /* If defined, the maximum amount of space required for outgoing arguments will be computed and placed into the variable -- 2.1.4
Re: [PATCH] xtensa: add support for SSP
On Mon, May 8, 2017 at 3:55 PM, Max Filippov wrote: > gcc/ > 2017-03-27 Max Filippov > > * config/xtensa/xtensa-protos.h > (xtensa_initial_elimination_offset): New declaration. > * config/xtensa/xtensa.c (xtensa_initial_elimination_offset): > New function. Move its body from the INITIAL_ELIMINATION_OFFSET > macro definition, add case for FRAME_POINTER_REGNUM when > FRAME_GROWS_DOWNWARD. > * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro > definition. > (INITIAL_ELIMINATION_OFFSET): Replace body with call to > xtensa_initial_elimination_offset. Approved.
Re: [PATCH] xtensa: add support for SSP
On Mon, May 8, 2017 at 4:24 PM, augustine.sterl...@gmail.com wrote: > On Mon, May 8, 2017 at 3:55 PM, Max Filippov wrote: >> gcc/ >> 2017-03-27 Max Filippov >> >> * config/xtensa/xtensa-protos.h >> (xtensa_initial_elimination_offset): New declaration. >> * config/xtensa/xtensa.c (xtensa_initial_elimination_offset): >> New function. Move its body from the INITIAL_ELIMINATION_OFFSET >> macro definition, add case for FRAME_POINTER_REGNUM when >> FRAME_GROWS_DOWNWARD. >> * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro >> definition. >> (INITIAL_ELIMINATION_OFFSET): Replace body with call to >> xtensa_initial_elimination_offset. > > Approved. Applied to trunk. Thank you! -- Max
Re: [PATCH] handle enumerated types in -Wformat-overflow (PR 80397)
On 04/28/2017 12:35 PM, Jeff Law wrote: On 04/26/2017 11:05 AM, Martin Sebor wrote: On 04/24/2017 03:35 PM, Jeff Law wrote: On 04/11/2017 12:57 PM, Martin Sebor wrote: In a review of my fix for bug 80364 Jakub pointed out that to determine whether an argument to an integer directive is of an integer type the gimple-ssa-sprintf pass tests the type code for equality to INTEGER_TYPE when it should instead be using INTEGRAL_TYPE_P(). This has the effect of the pass being unable to use the available range of arguments of enumerated types, resulting in both false positives and false negatives, and in some cases, in emitting suboptimal code. The attached patch replaces those tests with INTEGRAL_TYPE_P(). Since this is not a regression I submit it for GCC 8. You might consider using POINTER_TYPE_P as well. You mean rather than (TREE_CODE (type) == POINTER_TYPE)? Those I believe are vestiges of the %p handling that was removed sometime last year, and (unless you are recommending I remove them as part of this patch) should probably be removed during the next cleanup.Yes, that can be a follow-up cleanup. For the future, if you find yourself writing something like TREE_CODE (TREE_TYPE (x)) == POINTER_TYPE, you're usually going to be better off using POINTER_TYPE_P (TREE_TYPE (x)). That allows the code to work with C++ references as well as C pointers. I'll keep it in mind, thanks. Should I take this as approval of the patch as is or are there some changes you'd like me to make? Martin
[PATCH][x86] Add missing intrinsics for vrcp14sd/ss instructions.
Hi, This patch adds missing intrinsics for VRCP14SD and VRCP14SS instructions: _mm_mask_rcp14_sd _mm_maskz_rcp14_sd _mm_mask_rcp14_ss _mm_maskz_rcp14_ss These instructions and intrinsics are described in SDM Vol. 2C 5-487. gcc/ * config/i386/avx512fintrin.h (_mm_mask_rcp14_sd, _mm_maskz_rcp14_sd, _mm_mask_rcp14_ss, _mm_maskz_rcp14_ss): New intrinsics. * config/i386/i386-builtin.def (__builtin_ia32_rcp14sd_mask, __builtin_ia32_rcp14ss_mask): New builtins. * config/i386/sse.md (srcp14_mask): New pattern. gcc/testsuite/ * gcc.target/i386/avx512f-vrcp14sd-1.c: Test new intrinsics. * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto. * gcc.target/i386/avx512f-vrcp14ss-1.c: Ditto. * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto. Thanks, Julia 0001-vrcp14sd.patch Description: 0001-vrcp14sd.patch