date:20170508

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-08 Thread Richard Biener via gcc-patches

On Sat, May 6, 2017 at 5:03 PM, Jeff Law  wrote:
> This is the 2nd of 3-5 patches to address pr78496.
>
> Jump threading will examine ASSERT_EXPRs as it walks the IL and record
> information from those ASSERT_EXPRs into the available expression and
> const/copies tables where they're later used to simplify conditionals.
>
> We're missing a couple things though.
>
> First an ASSERT_EXPR with an EQ test creates an equality we can enter into
> the const/copy tables.  We were failing to do that.  This is most
> interesting when the RHS of the condition in the ASSERT_EXPR is a constant,
> obviously.  This has a secondary benefit of doing less work to get better
> optimization.
>
> Second, some ASSERT_EXPRs may start off as a relational test such as x <= 0,
> but after range extraction and propagation they could be simplified into an
> equality comparison.  We already do this with conditionals and generalizing
> that code to handle ASSERT_EXPRs is pretty easy.  This gives us more chances
> to extract simple equivalences from the condition in ASSERT_EXPRs.
>
> This patch fixes those 2 problems.  I don't think it directly helps pr78496
> right now as we're unable to pick up the newly exposed jump threads until
> VRP2.   But that's a short term limitation that I've already addressed
> locally :-)
>
> Bootstrapped, regression tested and installed on the trunk.
>
> jeff
>
>
> ps. An astute observer might note that improving the effectiveness of VRP
> jump threading seems counterproductive since I've stated I want to remove
> VRP jump threading.  These improvements don't significantly change how I was
> planning to do that or the infrastructure we're going to rely upon to make
> that change.  All that changes is where we get the information from --
> ASSERT_EXPRs vs querying a new API that generates the same information as
> needed.

That API is already there.  It's called register_edge_assert_for (it might need
a wrapper to be most useful and also needs to be exported / moved from VRP
to somewhere else).  Both VRP and EVRP use it to "compute" ASSERT_EXPRs.

So I'm not sure if changing VRP with your patches is a good thing when you
could have used the new API in the first place ...

Richard.

>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index b0c253b09ae..0f78f2a2ed1 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2017-05-06  Jeff Law  
> +
> +   PR tree-optimization/78496
> +   * tree-vrp.c (simplify_assert_expr_using_ranges): New function.
> +   (simplify_stmt_using_ranges): Call it.
> +   (vrp_dom_walker::before_dom_children): Extract equivalences
> +   from an ASSERT_EXPR with an equality comparison against a
> +   constant.
> +
>  2017-05-06  Richard Sandiford  
>
> * lra-constraints.c (lra_copy_reg_equiv): New function.
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index fca5b87e798..42782a6d17c 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,9 @@
> +2017-05-06  Jeff Law  
> +
> +   PR tree-optimization/78496
> +   * gcc.dg/tree-ssa/ssa-thread-16.c: New test.
> +   * gcc.dg/tree-ssa/ssa-thread-17.c: New test.
> +
>  2017-05-06  Richard Sandiford  
>
> * gcc.target/aarch64/spill_1.c: New test.
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c
> new file mode 100644
> index 000..78c349ca14d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-16.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
> +
> +/* We should thread the if (exp == 2) conditional on the
> +   the path from inside the if (x) THEN arm.  It is the only
> +   jump threading opportunity in this code.  */
> +
> +/* { dg-final { scan-tree-dump-times "Threaded" 1 "vrp1" } } */
> +
> +
> +extern void abort (void) __attribute__ ((__nothrow__, __leaf__))
> +  __attribute__ ((__noreturn__));
> +
> +int x;
> +
> +
> +int code;
> +void
> +do_jump (int exp)
> +{
> +  switch (code)
> +{
> +case 4:
> +  if ((exp) == 1)
> +   goto normal;
> +  if (x)
> +   {
> + if (exp != 0)
> +   abort ();
> +   }
> +  if ((exp) == 2)
> +   goto normal;
> +case 3:
> +   abort ();
> +}
> +  normal:
> +  ;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c
> new file mode 100644
> index 000..692658fbb4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-17.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
> +
> +/* We should simplify one ASSERT_EXPR from a relational
> +   into an equality test.  */
> +/* { dg-final { scan-tree-dump-times "Folded
> into:\[^\r\n\]*ASSERT_EXPR\*\[^\r\n\]* == 1" 1 "vrp1" } } */
> +
> +/* And simplification of the ASSERT_EXPR leads to a jump threading
> opportunity.  */
> +/* {

Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-08 Thread Richard Biener via gcc-patches

On Fri, May 5, 2017 at 10:53 PM, Jeff Law  wrote:
> On 05/04/2017 08:37 AM, Jeff Law wrote:
>>
>>
>> You understanding is slightly wrong however,  The ASSERT_EXPRs and
>> conditionals map 100% through propagation and into simplification.  It's
>> only during simplification that we lose the direct mapping as we change the
>> conditional in order to remove the unnecessary type conversion. Threading
>> runs after simplification.
>>
>> Another approach here would be to simplify the ASSERT_EXPR in the same
>> manner that we simplify the conditional.  That may even be better for
>> various reasons in the short term.  Let me poke at that.
>
> Now I remember why I didn't do that (simplify the ASSERT_EXPR).  It
> doesn't work :-)
>
> So given an ASSERT_EXPR like:
>
>   v1_378 = ASSERT_EXPR ;
>
> Let's assume for the sake of argument v1_179 was set by a type cast from
> xx_1.  We can simplify that ASSERT_EXPR into
>
>   v1_378 = ASSERT_EXPR ;
>
> But note we can not change operand 0 of the ASSERT_EXPR.  That would change
> the *type* of the ASSERT_EXPR and blows up all kinds of things later.  So
> the ASSERT_EXPR at best can morph into this form:
>
>   v1_378 = ASSERT_EXPR ;
>
> When the threader wants to look for an ASSERT_EXPR that creates a range for
> an object, it does lookups based on a match of operand 0 without digging
> into the expression in operand 1.
>
> That's something we may want to change in the future (it plays into issues
> that arise in patch #3) but in the immediate term it's still best to defer
> that one special case of tweaking a GIMPLE_COND.

Note that I tried last stage3 (it ended up being too late) to get rid
of ASSERT_EXPRs
doing substitute-and-fold itself (basically copy-propagate them out at
this point rather
than as a separate thing later).  This is because the ASSERT_EXPR uses interfere
with the single_use checks in match.pd patterns and thus are actually harmful.

The barrier I ran into was the ASSERT_EXPR use by the threader ... so
now you're making
us rely even more on those :/

Richard.

> Jeff

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Stephan Bergmann via gcc-patches


On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.

i.e. because all values of int_type are also valid values of char_type
we cannot meet the requirement that:

"The member eof() shall return an implementation-defined constant
that cannot appear as a valid UTF-16 code unit."

I've reported this as a defect, suggesting that the wording above
needs to change.

One consequence is that basic_streambuf::sputc(u'\u')
always returns the same value, whether it succeeds or not. On success
it returns to_int_type(u'\u') and on failure it returns eof(),
which is the same value. I think that can be solved with the attached
change, which preserves the invariant in [char.traits.require] that
eof() returns:

"a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for
all values c."

This can be true if we ensure that to_int_type never returns the eof()
value. http://www.unicode.org/faq/private_use.html#nonchar10 suggests
doing something like this.

It means that when writing u'\u' to a streambuf we write that
character successfully, but return u'\uFFFD' instead; and when reading
u'\u' from a streambuf we return u'\uFFFD' instead. This is
asymmetrical, as we can write that character but not read it back.  It
might be better to refuse to write u'\u' and write it as the
replacement character instead, but I think I prefer to write the right
character when possible. It also doesn't require any extra changes.

All tests pass with this, does anybody see any problems with this
approach?


Sounds scary to me.  As an application programmer, I'd expect to be able 
to use chart16_t based streams to read and write arbitrary sequences of 
Unicode code points (encoded as sequences of UTF-16 code units).  (Think 
of an application temporarily storing internal strings to a disk file.)


Also, I'd be surprised to find this asymmetric behavior only for U+ 
and not for other noncharacters, and only for char16_t and not for char32_t.


To me, the definition of char16_t's int_type and eof() sounds like a bug 
that needs fixing, not working around?

[PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests

2017-05-08 Thread Peryt, Sebastian

Hi,

This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS 
instructions.

gcc/testsuite/
* gcc.target/i386/avx512f-vaddsd-2.c: Test fixed.
* gcc.target/i386/avx512f-vaddss-2.c: Ditto.
* gcc.target/i386/avx512f-vsubsd-2.c: Ditto.
* gcc.target/i386/avx512f-vsubss-2.c: Ditto.

Is it ok for trunk?

Thanks,
Sebastian


ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch
Description: ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch

Re: [PATCH] prevent -Wno-system-headers from suppressing -Wstringop-overflow (PR 79214)

2017-05-08 Thread Andreas Schwab

I see these regressions on ia64:

FAIL: c-c++-common/Wsizeof-pointer-memaccess2.c  -std=gnu++11 (test for excess 
errors)
Excess errors:
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:11:53:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]
/usr/local/gcc/gcc-20170508/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess2.c:12:55:
 warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading 8 bytes from a region of size 1 [-Wstringop-overflow=]

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr

2017-05-08 Thread Marc Glisse


On Sun, 7 May 2017, Trevor Saunders wrote:


On Sun, May 07, 2017 at 07:32:48PM +0200, Marc Glisse wrote:

On Sun, 7 May 2017, tbsaunde+...@tbsaunde.org wrote:


This is a start at warning about various resource allocation issues that can be
improved.  Currently it only warns about functions that call malloc and then
always pass the resulting pointer to free().

It should be pretty simple to extend this to new/delete and new[]/delete[], as
well as checking that in simple cases the pairing is correct.  However it
wasn't obvious to me how to tell if a function call is to a allocating operator
new.  We probably don't want to warn about placement new since complicated
things can be happening there.  There is the DECL_IS_OPERATOR_NEW() but that
doesn't seem to cover class specific operator new overloads, and maybe even
custom ones at global scope?

Other things that may be reasonable to try and include would be open / close
and locks.

It might also be good to warn about functions that take or return unique
ownership of resources, but I'm not quiet sure how to handle functions that
allocate or deallocate shared resources.


Interesting.

1) How do you avoid warning for code that already uses unique_ptr? After
inlining, it will look just like code calling new, and delete on all paths.


Currently by running early enough that no inline has happened.  I'm not
sure how valuable it would be to try and find things post inlining
instead.


Ah, right, for the warning it indeed makes sense to be pre-inlining, as it 
isn't clear you could use unique_ptr if the call to free was not directly 
in this function.



Well, currently you only handle malloc/free, but if you provide an inline
implementation of new/delete that forwards to malloc/free, the issue should
already appear. Or maybe the pass is very early? But then the compiler will
likely seldom notice that the argument to free is actually the return of
malloc.


I wonder how true this last bit is, sure we could find some more things
after more optimization, but its not obvious to me that it would be so
much to really be worth it.  Looking at gcc itself there's plenty of
places where this should already be enough with tracking places where
the pointer is coppied from one ssa name to another.  I feel like the
big areas of improvement are warning about functions that take or return
ownership of things.


I fear that there are many cases where you need more PRE / DOM / SRA / ... 
for the SSA_NAME to end up in a place where malloc and free match, but if 
you already detect some cases, that's good.



2) In some sense, we would want to warn when some path is missing the call
to free, but it seems very hard to distinguish the cases where it is
accidentally missing from those where it is done on purpose. Maybe detect
cases where the only paths missing free are through exceptions?


I didn't think about exceptions, but I'd been thinking of trying to warn
about variables we prove don't escape in alias analysis, but that would
probably be a small set of all allocations.


3) This seems very similar to the conditions needed to transform a call to
malloc into a stack allocation. I was experimenting with this some years
ago, see for instance
https://gcc.gnu.org/ml/gcc-patches/2013-11/msg01049.html (it is probably not
the most recent message on the subject, since the main code moved to CCP
afterwards http://geometrica.saclay.inria.fr/team/Marc.Glisse/tmp/heapstack,
but I can't locate more recent messages at the moment, and it was more a
prototype than a ready patch). Do you think the infrastructure could be
shared between your warning and such an optimization? Or are the goals
likely to diverge too soon for sharing to be practical?


yeah, some of the issues are certainly similar.  I'm not sure if there
would be things worth sharing probably easiest to find out by just
trying it.


It seems that the differences will be larger than I originally expected, 
so it isn't clear the gain would be worth it.



While I remember, one (easy) case you may want to handle is:

p = malloc(n);
if (p != 0) ...

where the else branch does not need to contain a call to free.

(maybe one can add strdup and similar functions later, but better start 
with few functions for now)


--
Marc Glisse

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Florian Weimer via gcc-patches


On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.


I think the real bug is that char_traits::int_type is just 
plain wrong.  It has to be a signed integer, and capable of representing 
values in the range 0 .. 65535.  char_traits has a similar 
problem.  char_traits should be fine on glibc because WEOF is 
reserved, something that is probably not the case for char32_t.


Thanks,
Florian

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Jonathan Wakely via gcc-patches


On 08/05/17 09:52 +0200, Stephan Bergmann via libstdc++ wrote:

On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.

i.e. because all values of int_type are also valid values of char_type
we cannot meet the requirement that:

"The member eof() shall return an implementation-defined constant
that cannot appear as a valid UTF-16 code unit."

I've reported this as a defect, suggesting that the wording above
needs to change.

One consequence is that basic_streambuf::sputc(u'\u')
always returns the same value, whether it succeeds or not. On success
it returns to_int_type(u'\u') and on failure it returns eof(),
which is the same value. I think that can be solved with the attached
change, which preserves the invariant in [char.traits.require] that
eof() returns:

"a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for
all values c."

This can be true if we ensure that to_int_type never returns the eof()
value. http://www.unicode.org/faq/private_use.html#nonchar10 suggests
doing something like this.

It means that when writing u'\u' to a streambuf we write that
character successfully, but return u'\uFFFD' instead; and when reading
u'\u' from a streambuf we return u'\uFFFD' instead. This is
asymmetrical, as we can write that character but not read it back.  It
might be better to refuse to write u'\u' and write it as the
replacement character instead, but I think I prefer to write the right
character when possible. It also doesn't require any extra changes.

All tests pass with this, does anybody see any problems with this
approach?


Sounds scary to me.  As an application programmer, I'd expect to be 
able to use chart16_t based streams to read and write arbitrary 
sequences of Unicode code points (encoded as sequences of UTF-16 code 
units).  (Think of an application temporarily storing internal strings 
to a disk file.)


Also, I'd be surprised to find this asymmetric behavior only for 
U+ and not for other noncharacters, and only for char16_t and not 
for char32_t.


To me, the definition of char16_t's int_type and eof() sounds like a 
bug that needs fixing, not working around?


Fixing that would require changing the standard and breaking the ABI
of all existing implementations. I've opened a defect report against
that standard, but a change that requires an ABI break isn't likely to
be popular.

Changing the semantics of to_int_type for U+ is far less likely to
affect any ABIs (it's a constexpr function so it's possible somebody
is using the value of to_int_type(char_type(-1)) as a template
argument, but it seems unlikely. It's a much smaller change, "allowed"
by http://www.unicode.org/faq/private_use.html#nonchar10 and it only
affects a noncharacter that is not intended for interchange anyway.

I'm not claiming it's ideal, but it fixes a bug today.

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Jonathan Wakely via gcc-patches


On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote:

On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.


I think the real bug is that char_traits::int_type is just 
plain wrong.  It has to be a signed integer,


Why does it have to be signed?

and capable of 
representing values in the range 0 .. 65535.  char_traits 
has a similar problem.  char_traits should be fine on glibc 
because WEOF is reserved, something that is probably not the case for 
char32_t.


I think there are 32-bit values which are not valid UTF-32 code
points, including char32_t(-1) which we use for EOF.

[C++ Patch Ping] Re: [C++ Patch/RFC] PR 80145

2017-05-08 Thread Paolo Carlini


Hi,

gently pinging this...

On 23/03/2017 20:07, Paolo Carlini wrote:

Hi,

this ICE on invalid code isn't a regression, thus a patch probably 
doesn't qualify for Stage 4, but IMHO I made good progress on it and 
I'm sending what I have now anyway... The ICE happens during error 
recovery after a sensible diagnostic for the first declaration in:


auto* foo() { return 0; }
auto* foo();

After the error, finish_function does:

apply_deduced_return_type (fndecl, void_type_node);
fntype = TREE_TYPE (fndecl);

which then is inconsistent with the auto* return type of the second 
declaration and leads to an ICE in merge_types (which duplicate_decls 
thought was safe to call because types_match is true (evidently: 
decls_match uses fndecl_declared_return_type)). Thus, in terms of 
error recovery, I think that in cases like the one at issue it makes 
sense not to replace auto* after the error and leave the return type 
untouched: certainly the below passes testing and avoids ICEing on the 
testcase at issue and a variant of it.

Thanks,
Paolo.

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Florian Weimer via gcc-patches


On 05/08/2017 12:24 PM, Jonathan Wakely wrote:

On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote:

On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.


I think the real bug is that char_traits::int_type is just 
plain wrong.  It has to be a signed integer,


Why does it have to be signed?


Hmm.  Maybe it's not strictly required.  int_type(-1) as a distinct 
value is likely sufficient.


and capable of representing values in the range 0 .. 65535.  
char_traits has a similar problem.  char_traits 
should be fine on glibc because WEOF is reserved, something that is 
probably not the case for char32_t.


I think there are 32-bit values which are not valid UTF-32 code
points, including char32_t(-1) which we use for EOF.


I'm not sure if char32_t is restricted to UTF-32 codepoints (the 
standard does not say, I think).  But even UCS-4 is 31-bit only, so 
maybe the problem does not arise there.


Thanks,
Florian

Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr

2017-05-08 Thread Trevor Saunders

On Mon, May 08, 2017 at 10:28:13AM +0200, Marc Glisse wrote:
> On Sun, 7 May 2017, Trevor Saunders wrote:
> 
> > On Sun, May 07, 2017 at 07:32:48PM +0200, Marc Glisse wrote:
> > > On Sun, 7 May 2017, tbsaunde+...@tbsaunde.org wrote:
> > > 
> > > > This is a start at warning about various resource allocation issues 
> > > > that can be
> > > > improved.  Currently it only warns about functions that call malloc and 
> > > > then
> > > > always pass the resulting pointer to free().
> > > > 
> > > > It should be pretty simple to extend this to new/delete and 
> > > > new[]/delete[], as
> > > > well as checking that in simple cases the pairing is correct.  However 
> > > > it
> > > > wasn't obvious to me how to tell if a function call is to a allocating 
> > > > operator
> > > > new.  We probably don't want to warn about placement new since 
> > > > complicated
> > > > things can be happening there.  There is the DECL_IS_OPERATOR_NEW() but 
> > > > that
> > > > doesn't seem to cover class specific operator new overloads, and maybe 
> > > > even
> > > > custom ones at global scope?
> > > > 
> > > > Other things that may be reasonable to try and include would be open / 
> > > > close
> > > > and locks.
> > > > 
> > > > It might also be good to warn about functions that take or return unique
> > > > ownership of resources, but I'm not quiet sure how to handle functions 
> > > > that
> > > > allocate or deallocate shared resources.
> > > 
> > > Interesting.
> > > 
> > > 1) How do you avoid warning for code that already uses unique_ptr? After
> > > inlining, it will look just like code calling new, and delete on all 
> > > paths.
> > 
> > Currently by running early enough that no inline has happened.  I'm not
> > sure how valuable it would be to try and find things post inlining
> > instead.
> 
> Ah, right, for the warning it indeed makes sense to be pre-inlining, as it
> isn't clear you could use unique_ptr if the call to free was not directly in
> this function.

well, you probably could, you'd just need to pass it around some.

> > > Well, currently you only handle malloc/free, but if you provide an inline
> > > implementation of new/delete that forwards to malloc/free, the issue 
> > > should
> > > already appear. Or maybe the pass is very early? But then the compiler 
> > > will
> > > likely seldom notice that the argument to free is actually the return of
> > > malloc.
> > 
> > I wonder how true this last bit is, sure we could find some more things
> > after more optimization, but its not obvious to me that it would be so
> > much to really be worth it.  Looking at gcc itself there's plenty of
> > places where this should already be enough with tracking places where
> > the pointer is coppied from one ssa name to another.  I feel like the
> > big areas of improvement are warning about functions that take or return
> > ownership of things.
> 
> I fear that there are many cases where you need more PRE / DOM / SRA / ...
> for the SSA_NAME to end up in a place where malloc and free match, but if
> you already detect some cases, that's good.

So, I've been using a hacked up version of this to find places to use
auto_bitmap.  I ended up following places where one ssa name is assigned
to another, and moving this pass after the ccp within
all_early_optimizations.  The pass location change was because at the
earlier location there was places where the ssa name from the allocation
was assigned to a vAR_DECL, that was ssa'd away at the later point.  So
it may actually be that optimization helps more than I thought, but I
couldn't see an immediate reason we needed to be still using variables
instead of ssa names in the one spot I looked at.

> > > 2) In some sense, we would want to warn when some path is missing the call
> > > to free, but it seems very hard to distinguish the cases where it is
> > > accidentally missing from those where it is done on purpose. Maybe detect
> > > cases where the only paths missing free are through exceptions?
> > 
> > I didn't think about exceptions, but I'd been thinking of trying to warn
> > about variables we prove don't escape in alias analysis, but that would
> > probably be a small set of all allocations.
> > 
> > > 3) This seems very similar to the conditions needed to transform a call to
> > > malloc into a stack allocation. I was experimenting with this some years
> > > ago, see for instance
> > > https://gcc.gnu.org/ml/gcc-patches/2013-11/msg01049.html (it is probably 
> > > not
> > > the most recent message on the subject, since the main code moved to CCP
> > > afterwards 
> > > http://geometrica.saclay.inria.fr/team/Marc.Glisse/tmp/heapstack,
> > > but I can't locate more recent messages at the moment, and it was more a
> > > prototype than a ready patch). Do you think the infrastructure could be
> > > shared between your warning and such an optimization? Or are the goals
> > > likely to diverge too soon for sharing to be practical?
> > 
> > yeah, some of the issues are certain

Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

2017-05-08 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 24/04/17 10:37, Kyrill Tkachov wrote:

Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html

Thanks,
Kyrill

On 28/02/17 12:29, Kyrill Tkachov wrote:

Hi all,

For the testcase in this patch we currently generate:
foo:
mov w1, 0
ldaxr   w2, [x0]
cmp w2, 3
bne .L2
stxrw3, w1, [x0]
cmp w3, 0
.L2:
csetw0, eq
ret

Note that the STXR could have been storing the WZR register instead of moving 
zero into w1.
This is due to overly strict predicates and constraints in the store exclusive 
pattern and the
atomic compare exchange expanders and splitters.
This simple patch fixes that in the patterns concerned and with it we can 
generate:
foo:
ldaxr   w1, [x0]
cmp w1, 3
bne .L2
stxrw2, wzr, [x0]
cmp w2, 0
.L2:
csetw0, eq
ret


Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

Thanks,
Kyrill

2017-02-28  Kyrylo Tkachov  

* config/aarch64/atomics.md (atomic_compare_and_swap expander):
Use aarch64_reg_or_zero predicate for operand 4.
(aarch64_compare_and_swap define_insn_and_split):
Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
(aarch64_store_exclusive): Likewise for operand 2.

2017-02-28  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.

Re: [PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero

2017-05-08 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 24/04/17 10:38, Kyrill Tkachov wrote:

Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html

Thanks,
Kyrill

On 08/03/17 16:35, Kyrill Tkachov wrote:

Hi all,

For the testcase in this patch where the value of x is zero we currently 
generate:
foo:
mov w1, 4
.L2:
ldaxr   w2, [x0]
cmp w2, 0
bne .L3
stxrw3, w1, [x0]
cbnzw3, .L2
.L3:
csetw0, eq
ret

We currently cannot merge the cmp and b.ne inside the loop into a cbnz because 
we need
the condition flags set for the return value of the function (i.e. the cset at 
the end).
But if we re-jig the sequence in that case we can generate a tighter loop:
foo:
mov w1, 4
.L2:
ldaxr   w2, [x0]
cbnzw2, .L3
stxrw3, w1, [x0]
cbnzw3, .L2
.L3:
cmp w2, 0
csetw0, eq
ret

So we add an explicit compare after the loop and inside the loop we use the 
fact that
we're comparing against zero to emit a CBNZ. This means we may re-do the 
comparison twice
(once inside the CBNZ, once at the CMP at the end), but there is now less code 
inside the loop.

I've seen this sequence appear in glibc locking code so maybe it's worth adding 
the extra bit
of complexity to the compare-exchange splitter to catch this case.

Bootstrapped and tested on aarch64-none-linux-gnu. In previous iterations of 
the patch where
I had gotten some logic wrong it would cause miscompiles of libgomp leading to 
timeouts in its
testsuite but this version passes everything cleanly.

Ok for GCC 8? (I know it's early, but might as well get it out in case someone 
wants to try it out)

Thanks,
Kyrill

2017-03-08  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
Emit CBNZ inside loop when doing a strong exchange and comparing
against zero.  Generate the CC flags after the loop.

2017-03-08  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.

Re: [PATCH] PR libstdc++/80624 satisfy invariant for char_traits::eof()

2017-05-08 Thread Jonathan Wakely via gcc-patches


On 08/05/17 12:52 +0200, Florian Weimer via libstdc++ wrote:

On 05/08/2017 12:24 PM, Jonathan Wakely wrote:

On 08/05/17 11:53 +0200, Florian Weimer via libstdc++ wrote:

On 05/05/2017 07:05 PM, Jonathan Wakely wrote:

As discussed at http://stackoverflow.com/q/43769773/981959 (and kinda
hinted at by http://wg21.link/lwg1200) there's a problem with
char_traits::eof() because it returns int_type(-1) which is
the same value as u'\u', a valid UTF-16 code point.


I think the real bug is that char_traits::int_type is 
just plain wrong.  It has to be a signed integer,


Why does it have to be signed?


Hmm.  Maybe it's not strictly required.  int_type(-1) as a distinct 
value is likely sufficient.


Agreed.

and capable of representing values in the range 0 .. 65535.  
char_traits has a similar problem.  char_traits 
should be fine on glibc because WEOF is reserved, something that 
is probably not the case for char32_t.


I think there are 32-bit values which are not valid UTF-32 code
points, including char32_t(-1) which we use for EOF.


I'm not sure if char32_t is restricted to UTF-32 codepoints (the 
standard does not say, I think).  But even UCS-4 is 31-bit only, so 
maybe the problem does not arise there.


It's not really clear what the encoding of char32_t is (see
http://talesofcpp.fusionfenix.com/post-10/episode-seven-one-char-to-rule-them-all
for a good analysis) but whether it's UCS-4 or UTF-32, U+ is
not in the universal character set, so we can use 0x for
char_traits::eof().

So I think only char_traits has this problem.

Re: [committed] Get rid of macros for diagnostic_report_current_module

2017-05-08 Thread Nathan Sidwell


On 05/05/2017 05:35 PM, David Malcolm wrote:


Committed to trunk as r247664.

gcc/ChangeLog:
* diagnostic.c (last_module_changed_p): New function.
(set_last_module): New function.


Given that C++ modules-ts is a thing, I wonder if there's a better name 
for these functions?  that might be less confusing down the road.


nathan

--
Nathan Sidwell

[C++ Patch] PR 80186 ("ICE on C++ code with invalid constructor...")

2017-05-08 Thread Paolo Carlini


Hi,

in order to avoid this error recovery issue I think we want to check the 
return value of grok_ctor_properties, as we do in decl.c, for its only 
other use. Tested x86_64-linux.


Thanks, Paolo.

//


/cp
2017-05-08  Paolo Carlini  

PR c++/80186
* pt.c (tsubst_decl): Early return error_mark_node if
grok_ctor_properties returns false.

/testsuite
2017-05-08  Paolo Carlini  

PR c++/80186
* g++.dg/template/crash126.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 247733)
+++ cp/pt.c (working copy)
@@ -12407,8 +12407,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com
if (DECL_CONSTRUCTOR_P (r) || DECL_DESTRUCTOR_P (r))
  {
maybe_retrofit_in_chrg (r);
-   if (DECL_CONSTRUCTOR_P (r))
- grok_ctor_properties (ctx, r);
+   if (DECL_CONSTRUCTOR_P (r) && !grok_ctor_properties (ctx, r))
+ RETURN (error_mark_node);
/* If this is an instantiation of a member template, clone it.
   If it isn't, that'll be handled by
   clone_constructors_and_destructors.  */
Index: testsuite/g++.dg/template/crash126.C
===
--- testsuite/g++.dg/template/crash126.C(revision 0)
+++ testsuite/g++.dg/template/crash126.C(working copy)
@@ -0,0 +1,13 @@
+// PR c++/80186
+
+template < class T, class > struct A
+{
+  A ();
+  A (A &);
+  A (A < T, T >);  // { dg-error "invalid constructor" }
+};
+
+void f () 
+{
+  A < int, int > (A < int, int >());  // { dg-error "cannot bind" }
+}

Re: [PATCH 1/2] C++ template type diff printing

2017-05-08 Thread Nathan Sidwell


On 05/04/2017 07:44 PM, David Malcolm wrote:

This patch kit implements two new options to make it easier
to read diagnostics involving mismatched template types:
   -fdiagnostics-show-template-tree and
   -fno-elide-type.

It adds two new formatting codes: %H and %I which are
equivalent to %qT, but are to be used together for type
comparisons e.g.
   "can't convert from %H to %I".


While I can see why one might want %H and %I to imply quoting, it seems 
confusingly inconsistent with the other formatters, which require an 
explicit 'q'.




rather than printing:

   could not convert 'std::map >()'
 from 'std::map >' to 'std::map >'

with -felide-type (the default), it prints:

   could not convert 'std::map >()'
 from 'map<[...],vector>' to 'map<[...],vector>


Neat.

Is '-felide-type' a good name?  Wouldn't something like 
'-fdiagnostic-elide-template-args' be better?




With -fdiagnostics-show-template-tree a tree-like structure of the
template is printed, showing the differences; in this case:


'tree' sounds implementor-speaky,  Perhaps 
'-fdiagnostic-expand-template-args' or something?


Right, bikeshedding out of the way ...


--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -99,7 +99,47 @@ static void cp_diagnostic_starter (diagnostic_context *, 
diagnostic_info *);
  static void cp_print_error_function (diagnostic_context *, diagnostic_info *);
  



+
+/* Return true iff TYPE_A and TYPE_B are template types that are
+   meaningful to compare.  */
+
+static bool
+comparable_template_types_p (tree type_a, tree type_b)
+{
+  if (TREE_CODE (type_a) != RECORD_TYPE)
+return false;
+  if (TREE_CODE (type_b) != RECORD_TYPE)
+return false;


CLASS_TYPE_P (type_a) etc?



+  int len_max = MAX (len_a, len_b);
+  for (int idx = 0; idx < len_max; idx++)
+{
+  tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : NULL;
+  tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : NULL;
+  if (arg_a == arg_b)


Should this use same_type_p for types?  If direct comparison is correct, 
a comment would help.




+   Only non-default template arguments are printed.  */
+
+static void
+print_template_tree_comparison (pretty_printer *pp, tree type_a, tree type_b,



+  for (int idx = 0; idx < len_max; idx++)
+{
+  tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) : NULL;
+  tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) : NULL;
+  if (arg_a == arg_b)


Same question.  If these have to be comparable template type, when can 
len_a and len_b be different?





+   This is called in phase 2 of pp_format, when it is accumulating
+   a series of formatted chunks.  We stash the location of the chunk
+   we're meant to have written to, so that we can write to it in the
+   m_format_postprocessor hook.  */
+
+static void
+defer_half_of_type_diff (pretty_printer *pp, char spec, tree type,


If this is called 'phase 2'  why not name the function for that, rather 
than 'half'?



+case 'H':
+case 'I':
+  {
+   defer_half_of_type_diff (pp, *spec, next_tree, buffer_ptr, verbose);


Why not tell defer_half_of_type_diff whether it's doing the %H bit or 
the %I bit directly, rather than have it also check for H and I?


I've not looked more than cursorily at the non C++ bits.

nathan

--
Nathan Sidwell

Re: [PATCH GCC8][11/33]New interfaces for tree affine

2017-05-08 Thread Richard Biener via gcc-patches

On Thu, May 4, 2017 at 5:21 PM, Bin.Cheng  wrote:
> On Mon, Apr 24, 2017 at 11:43 AM, Richard Biener
>  wrote:
>> On Tue, Apr 18, 2017 at 12:43 PM, Bin Cheng  wrote:
>>> Hi,
>>> This patch adds three simple interfaces for tree affine which will be used 
>>> in
>>> cost computation later.
>>>
>>> Is it OK?
>>
>>
>> +static inline tree
>> +aff_combination_type (aff_tree *aff)
>>
>> misses a function comment.  Please do not introduce new 'static inline'
>> function in headers but instead use plain 'inline'.
>>
>> +/* Return true if AFF is simple enough.  */
>> +static inline bool
>> +aff_combination_simple_p (aff_tree *aff)
>> +{
>>
>> what is "simple"?  Based on that find a better name.
>> "singleton"?  But aff_combination_const_p isn't
>> simple_p (for whatever reason).
> Patch updated.  The one (13th) depending on this one is updated too.

Ok with

+inline bool
+aff_combination_singleton_var_p (aff_tree *aff)
+{
+  gcc_assert (aff != NULL);

this assert removed (aff->n will ICE anyway if NULL)

+  return (aff->n == 1
+ && aff->offset == 0
+ && (aff->elts[0].coef == 1 || aff->elts[0].coef == -1));

please adjust the comment to say "one (negated) singleton variable".

Thanks,
Richard.

> Thanks,
> bin
>>
>> Richard.
>>
>>
>>
>>> Thanks,
>>> bin
>>> 2017-04-11  Bin Cheng  
>>>
>>> * tree-affine.h (aff_combination_type): New interface.
>>> (aff_combination_const_p, aff_combination_simple_p): New interfaces.

Re: [PATCH GCC8][28/33]Don't count non-interger PHIs for register pressure

2017-05-08 Thread Richard Biener via gcc-patches

On Thu, May 4, 2017 at 5:33 PM, Bin.Cheng  wrote:
> On Wed, Apr 26, 2017 at 3:32 PM, Bin.Cheng  wrote:
>> On Wed, Apr 26, 2017 at 3:23 PM, Richard Biener
>>  wrote:
>>> On Wed, Apr 26, 2017 at 3:37 PM, Bin.Cheng  wrote:
 On Wed, Apr 26, 2017 at 2:32 PM, Richard Biener
  wrote:
> On Tue, Apr 18, 2017 at 12:52 PM, Bin Cheng  wrote:
>> Hi,
>> Given only integer variables are meaningful for register pressure 
>> estimation in IVOPTs,
>> this patch skips non-integer type PHIs when counting register pressure.
>> Is it OK?
>
> Huh.  I suppose it only makes a difference because you are ignoring
> POINTER_TYPE_P
> IVs?  At least I would be surprised if get_iv returns true for float
> or vector PHIs (yeah, see
> early out in get_iv)?  So why exclude POINTER_TYPE_P IVs?
 Hmm, but if get_iv returns non-NULL, the phi won't be counted because
 loop is continued?  Actually, all IV and invariants are skipped by
 checking get_iv, so this is only to skip floating point phis.
>>>
>>> Err, but AFAICS get_iv will return non-NULL for POINTER_TYPE_P IVs
>>> which you then skip by your added
>>>
>>> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (op)))
>>> +   continue;
>>>
>>> thus float IVs are always skipped by means if get_iv returning NULL.
>>>
>>> Oh, the get_iv check continues for non-NULL result ... so it makes sense.
>>> But still, why exclude POINTER_TYPE_P non-IV ops?
>> POINTER_TYPE_P is simply an overlook,  will update patch.
> Here is updated version picking up POINTER_TYPE_P.

Ok.

Richard.

> Thanks,
> bin
>>
>> Thanks,
>> bin
>>>
>>> Richard.
>>>
 Thanks,
 bin
>
> Richard.
>
>> Thanks,
>> bin
>>
>> 2017-04-11  Bin Cheng  
>>
>> * tree-ssa-loop-ivopts.c (determine_set_costs): Skip non-interger
>> when counting register pressure.
>>

Re: [PATCH GCC8][17/33]Treat complex cand step as invriant expression

2017-05-08 Thread Richard Biener via gcc-patches

On Thu, May 4, 2017 at 7:55 PM, Bin.Cheng  wrote:
> On Wed, May 3, 2017 at 2:43 PM, Richard Biener
>  wrote:
>> On Tue, Apr 18, 2017 at 12:46 PM, Bin Cheng  wrote:
>>> Hi,
>>> We generally need to compute cand step in loop preheader and use it in loop 
>>> body.
>>> Unless it's an SSA_NAME of constant integer, an invariant expression is 
>>> needed.
>>
>> I'm confused as to what this patch does.  Comments talk about "Handle step 
>> as"
>> but then we print "Depend on inv...".  And we share bitmaps, well it seems
>>
>> + find_inv_vars (data, &step, &cand->inv_vars);
>> +
>> + iv_inv_expr_ent *inv_expr = get_loop_invariant_expr (data, step);
>> + /* Share bitmap between inv_vars and inv_exprs for cand.  */
>> + if (inv_expr != NULL)
>> +   {
>> + cand->inv_exprs = cand->inv_vars;
>> + cand->inv_vars = NULL;
>> + if (cand->inv_exprs)
>> +   bitmap_clear (cand->inv_exprs);
>> + else
>> +   cand->inv_exprs = BITMAP_ALLOC (NULL);
>> +
>> + bitmap_set_bit (cand->inv_exprs, inv_expr->id);
>>
>> just shares the bitmap allocation (and leaks cand->inv_exprs?).
>>
>> Note that generally it might be cheaper to use bitmap_head instead of
>> bitmap in the various structures (and then bitmap_initialize ()), this
>> saves one indirection.
>>
>> Anyway, the relation between inv_vars and inv_exprs is what confuses me.
>> Maybe it's the same as for cost_pair?  invariants vs. loop invariants?
>> whatever that means...
>>
>> That is, can you expand the comments in cost_pair / iv_cand for inv_vars
>> vs. inv_exprs, esp what "invariant" actually means?
> When we represent use with cand, there will be computation which is
> loop invariant.  The invariant computation is an newly created
> invariant expression and is based on ssa_vars existed before ivopts.
> If the invariant expression is complicated, we handle and call it as
> invariant expression.  We say the cost_pair depends on the inv.exprs.
> If the invariant expression is simple enough, we record all existing
> ssa_vars it based on in inv_vars.  We say the cost_pair depends on the
> inv.vars.   The same words stand for struct iv_cand.  If cand.step is
> simple enough, we simply record the ssa_var it based on in inv_vars,
> otherwise, the step is a new invariant expression which doesn't exist
> before, we record it in cand.inv_exprs.
>
> Add comment for inv_vars/inv_exprs, is this OK?   I noticed there is a
> redundant field cost_pair.inv_expr, I deleted it as obvious in a
> standalone patch.

Thanks for the explanation.

Ok.

Thanks,
Richard.

> Thanks,
> bin
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2017-04-11  Bin Cheng  
>>>
>>> * tree-ssa-loop-ivopts.c (struct iv_cand): New field inv_exprs.
>>> (dump_cand): Support iv_cand.inv_exprs.
>>> (add_candidate_1): Record invariant exprs in iv_cand.inv_exprs
>>> for candidates.
>>> (iv_ca_set_no_cp, iv_ca_set_cp, free_loop_data): Support
>>> iv_cand.inv_exprs.

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Andreas Schwab

On Mai 05 2017, Nathan Sidwell  wrote:

> This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
> {get,set}_namespace_binding with get_namespace_value and set_global_value
> respectively.

I'm getting this regression on ia64:

FAIL: g++.dg/template/virtual3.C  -std=c++11 (test for excess errors)
Excess errors:
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'
/usr/local/gcc/gcc-20170508/gcc/testsuite/g++.dg/template/virtual3.C:11:2: 
error: use of deleted function 'virtual B::~B()'

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

[Testsuite, committed] Fix vector peeling test failures

2017-05-08 Thread Wilco Dijkstra

This fixes a few failures on ARM and AArch64 due to a recent change in
alignment peeling by switching the vector cost model off
(https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html).

Tested on AArch64, ARM and x64 - committed as obvious.

ChangeLog: 
2017-05-08  Wilco Dijkstra  

* testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model.
* gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise.
--

diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c 
b/gcc/testsuite/gcc.dg/vect/vect-44.c
index 
186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6
 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-44.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-44.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-options "-fno-vect-cost-model" } */
 
 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c 
b/gcc/testsuite/gcc.dg/vect/vect-50.c
index 
78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b
 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-50.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-50.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-options "-fno-vect-cost-model" } */
 
 #include 
 #include "tree-vect.h"

[PATCH] More strict-overflow VRP cleanup

2017-05-08 Thread Richard Biener


The following removes more optimization disabling when strict-overflow
sensitive simplifications were done.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-05-08  Richard Biener  

* tree-vrp.c (gimple_assign_nonzero_warnv_p): Rename to ...
(gimple_assign_nonzero): ... this and remove strict_overflow_p
argument.
(gimple_stmt_nonzero_warnv_p): Rename to ...
(gimple_stmt_nonzero_p): ... this and remove strict_overflow_p
argument.
(vrp_stmt_computes_nonzero): Remove strict_overflow_p argument.
(extract_range_basic): Adjust, do not disable propagation on
strict overflow sensitive simplification.
(vrp_visit_cond_stmt): Likewise.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 247733)
+++ gcc/tree-vrp.c  (working copy)
@@ -857,27 +857,28 @@ symbolic_range_based_on_p (value_range *
*STRICT_OVERFLOW_P.*/
 
 static bool
-gimple_assign_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p)
+gimple_assign_nonzero_p (gimple *stmt)
 {
   enum tree_code code = gimple_assign_rhs_code (stmt);
+  bool strict_overflow_p;
   switch (get_gimple_rhs_class (code))
 {
 case GIMPLE_UNARY_RHS:
   return tree_unary_nonzero_warnv_p (gimple_assign_rhs_code (stmt),
 gimple_expr_type (stmt),
 gimple_assign_rhs1 (stmt),
-strict_overflow_p);
+&strict_overflow_p);
 case GIMPLE_BINARY_RHS:
   return tree_binary_nonzero_warnv_p (gimple_assign_rhs_code (stmt),
  gimple_expr_type (stmt),
  gimple_assign_rhs1 (stmt),
  gimple_assign_rhs2 (stmt),
- strict_overflow_p);
+ &strict_overflow_p);
 case GIMPLE_TERNARY_RHS:
   return false;
 case GIMPLE_SINGLE_RHS:
   return tree_single_nonzero_warnv_p (gimple_assign_rhs1 (stmt),
- strict_overflow_p);
+ &strict_overflow_p);
 case GIMPLE_INVALID_RHS:
   gcc_unreachable ();
 default:
@@ -891,12 +892,12 @@ gimple_assign_nonzero_warnv_p (gimple *s
*STRICT_OVERFLOW_P.*/
 
 static bool
-gimple_stmt_nonzero_warnv_p (gimple *stmt, bool *strict_overflow_p)
+gimple_stmt_nonzero_p (gimple *stmt)
 {
   switch (gimple_code (stmt))
 {
 case GIMPLE_ASSIGN:
-  return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p);
+  return gimple_assign_nonzero_p (stmt);
 case GIMPLE_CALL:
   {
tree fndecl = gimple_call_fndecl (stmt);
@@ -934,13 +935,13 @@ gimple_stmt_nonzero_warnv_p (gimple *stm
 }
 }
 
-/* Like tree_expr_nonzero_warnv_p, but this function uses value ranges
+/* Like tree_expr_nonzero_p, but this function uses value ranges
obtained so far.  */
 
 static bool
-vrp_stmt_computes_nonzero (gimple *stmt, bool *strict_overflow_p)
+vrp_stmt_computes_nonzero (gimple *stmt)
 {
-  if (gimple_stmt_nonzero_warnv_p (stmt, strict_overflow_p))
+  if (gimple_stmt_nonzero_p (stmt))
 return true;
 
   /* If we have an expression of the form &X->a, then the expression
@@ -3927,8 +3928,7 @@ extract_range_basic (value_range *vr, gi
   if (INTEGRAL_TYPE_P (type)
   && gimple_stmt_nonnegative_warnv_p (stmt, &sop))
 set_value_range_to_nonnegative (vr, type);
-  else if (vrp_stmt_computes_nonzero (stmt, &sop)
-  && !sop)
+  else if (vrp_stmt_computes_nonzero (stmt))
 set_value_range_to_nonnull (vr, type);
   else
 set_value_range_to_varying (vr);
@@ -7568,7 +7568,6 @@ static void
 vrp_visit_cond_stmt (gcond *stmt, edge *taken_edge_p)
 {
   tree val;
-  bool sop;
 
   *taken_edge_p = NULL;
 
@@ -7634,25 +7633,14 @@ vrp_visit_cond_stmt (gcond *stmt, edge *
  additional checking.  Testing on several code bases (GCC, DLV,
  MICO, TRAMP3D and SPEC2000) showed that doing this results in
  4 more predicates folded in SPEC.  */
-  sop = false;
 
+  bool sop;
   val = vrp_evaluate_conditional_warnv_with_ops (gimple_cond_code (stmt),
 gimple_cond_lhs (stmt),
 gimple_cond_rhs (stmt),
 false, &sop, NULL);
   if (val)
-{
-  if (!sop)
-   *taken_edge_p = find_taken_edge (gimple_bb (stmt), val);
-  else
-   {
- if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file,
-"\nIgnoring predicate evaluation because "
-"it assumes that signed overflow is undefined");
- val = NULL_TREE;
-   }
-}
+*taken_edge_p = find_taken_edge (gimple_bb (stmt)

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Jason Merrill via gcc-patches

On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell  wrote:
> This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
> {get,set}_namespace_binding with get_namespace_value and set_global_value
> respectively.

I'd prefer to stick with the _binding naming, as "value" is more ambiguous.

Jason

Re: [PATCH 1/2] C++ template type diff printing

2017-05-08 Thread Jason Merrill via gcc-patches

On Mon, May 8, 2017 at 7:36 AM, Nathan Sidwell  wrote:

I agree with most of Nathan's comments, but

> 'tree' sounds implementor-speaky,  Perhaps
> '-fdiagnostic-expand-template-args' or something?

I like "tree" here; I read it as referring to the formatting of the
abstract tree data structure, not the GCC internal type.

Jason

Re: [C++ Patch] PR 80186 ("ICE on C++ code with invalid constructor...")

2017-05-08 Thread Jason Merrill via gcc-patches

OK.

On Mon, May 8, 2017 at 7:25 AM, Paolo Carlini  wrote:
> Hi,
>
> in order to avoid this error recovery issue I think we want to check the
> return value of grok_ctor_properties, as we do in decl.c, for its only other
> use. Tested x86_64-linux.
>
> Thanks, Paolo.
>
> //
>
>

Re: [C++ Patch/RFC] PR 80145

2017-05-08 Thread Jason Merrill via gcc-patches

OK.

On Thu, Mar 23, 2017 at 3:07 PM, Paolo Carlini  wrote:
> Hi,
>
> this ICE on invalid code isn't a regression, thus a patch probably doesn't
> qualify for Stage 4, but IMHO I made good progress on it and I'm sending
> what I have now anyway... The ICE happens during error recovery after a
> sensible diagnostic for the first declaration in:
>
> auto* foo() { return 0; }
> auto* foo();
>
> After the error, finish_function does:
>
> apply_deduced_return_type (fndecl, void_type_node);
> fntype = TREE_TYPE (fndecl);
>
> which then is inconsistent with the auto* return type of the second
> declaration and leads to an ICE in merge_types (which duplicate_decls
> thought was safe to call because types_match is true (evidently: decls_match
> uses fndecl_declared_return_type)). Thus, in terms of error recovery, I
> think that in cases like the one at issue it makes sense not to replace
> auto* after the error and leave the return type untouched: certainly the
> below passes testing and avoids ICEing on the testcase at issue and a
> variant of it.
>
> Thanks,
> Paolo.
>
> //
>

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 08:07 AM, Andreas Schwab wrote:

On Mai 05 2017, Nathan Sidwell  wrote:


This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
{get,set}_namespace_binding with get_namespace_value and set_global_value
respectively.


I'm getting this regression on ia64:



Sorry, I'll take a look ...


--
Nathan Sidwell

Re: {PATCH] New C++ warning -Wcatch-value

2017-05-08 Thread Jason Merrill via gcc-patches

On Sun, May 7, 2017 at 8:05 PM, Martin Sebor  wrote:
> On 05/07/2017 02:03 PM, Volker Reichelt wrote:
>>
>> On  2 May, Martin Sebor wrote:
>>>
>>> On 05/01/2017 02:38 AM, Volker Reichelt wrote:

 Hi,

 catching exceptions by value is a bad thing, as it may cause slicing,
 i.e.
 a) a superfluous copy
 b) which is only partial.
 See also
 https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#e15-catch-exceptions-from-a-hierarchy-by-reference

 To warn the user about catch handlers of non-reference type,
 the following patch adds a new C++/ObjC++ warning option
 "-Wcatch-value".
>>>
>>>
>>> I think the problems related to catching exceptions by value
>>> apply to (a subset of) class types but not so much to fundamental
>>> types.  I would expect indiscriminately warning on every type to
>>> be overly restrictive.
>>>
>>> The Enforcement section of the C++ guideline suggests to
>>>
>>>Flag by-value exceptions if their types are part of a hierarchy
>>>(could require whole-program analysis to be perfect).
>>>
>>> The corresponding CERT C++ Coding Standard guideline offers
>>> a similar suggestion here:
>>>
>>>https://www.securecoding.cert.org/confluence/x/TAD5CQ
>>>
>>> so I would suggest to model the warning on that approach (within
>>> limits of a single translation unit, of course).   I.e., warn only
>>> for catching by value objects of non-trivial types, or perhaps even
>>> only polymorphic types?
>>>
>>> Martin
>>
>>
>> I've never seen anybody throw integers in real-world code, so I didn't
>> want to complicate things for this case. But maybe I should only warn
>> about class-types.
>>
>> IMHO it makes sense to warn about non-polymorphic class types
>> (although slicing is not a problem there), because you still have to
>> deal with redundant copies.
>>
>> Another thing would be pointers. I've never seen pointers in catch
>> handlers (except some 'catch (const char*)' which I would consider
>> bad practice). Therefore I'd like to warn about 'catch (const A*)'
>> which might be a typo that should read 'catch (const A&)' instead.
>>
>> Would that be OK?
>
>
> To my knowledge, catch by value of non-polymorphic types (and
> certainly fundamental types) is not a cause of common bugs.
> It's part of the recommended practice to throw by value, catch
> by reference, which is grounded in avoiding the slicing problem.
> It's also sometimes recommended for non-trivial class types to
> avoid creating a copy of the object (which, for non-trivial types,
> may need to allocate resource and could throw).  Otherwise, it's
> not dissimilar to pass-by value vs pass-by-reference (or even
> pass-by-pointer).  Both may be good practices for some types or
> in some situations but neither is necessary to avoid bugs or
> universally applicable to achieve superior performance.
>
> The pointer case is interesting.  In C++ Coding Standards,
> Sutter and Alexandrescu recommend to throw (and catch) smart
> pointers over plain pointers because it obviates having to deal
> with memory management issues.  That's sound advice but it seems
> more like a design guideline than a coding rule aimed at directly
> preventing bugs.  I also think that the memory management bugs
> that it might find might be more easily detected at the throw
> site instead.  E.g., warning on the throw expression below:
>
>   {
> Exception e;
> throw &e;
>   }
>
> or perhaps even on
>
>   {
> throw *new Exception ();
>   }
>
> A more sophisticated (and less restrictive) checker could detect
> and warn on "throw " if it found a catch (T) or catch (T&)
> in the same file and no catch (T*) (but not warn otherwise).
>
> Martin
>
> PS After re-reading some of the coding guidelines on this topic
> it occurs to me that (if your patch doesn't handle this case yet)
> it might be worth considering to enhance it to also warn on
> rethrowing caught polymorphic objects (i.e., warn on
>
>   catch (E &e) { throw e; }
>
> and suggest to use "throw;" instead, for the same reason: to
> help avoid slicing.
>
> PPS  It may be a useful feature to implement some of other ideas
> you mentioned (e.g., throw by value rather than pointer) but it
> feels like a separate and more ambitious project than detecting
> the relatively common and narrow slicing problem.

I agree with Martin's comments.

Jason

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 08:56 AM, Jason Merrill wrote:

On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell  wrote:

This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
{get,set}_namespace_binding with get_namespace_value and set_global_value
respectively.


I'd prefer to stick with the _binding naming, as "value" is more ambiguous.


Ok.  It is getting the 'value' binding, ignoring the 'type' binding, fwiw.

nathan

--
Nathan Sidwell

[PATCH] Small PRE speedup

2017-05-08 Thread Richard Biener


Was sitting in my tree.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-05-08  Richard Biener  

* tree-ssa-pre.c (bitmap_set_and): Avoid bitmap copy.
(bitmap_set_subtract_values): Likewise.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 247733)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -817,19 +818,23 @@ bitmap_set_and (bitmap_set_t dest, bitma
 
   if (dest != orig)
 {
-  bitmap_head temp;
-  bitmap_initialize (&temp, &grand_bitmap_obstack);
-
   bitmap_and_into (&dest->values, &orig->values);
-  bitmap_copy (&temp, &dest->expressions);
-  EXECUTE_IF_SET_IN_BITMAP (&temp, 0, i, bi)
+
+  unsigned int to_clear = -1U;
+  FOR_EACH_EXPR_ID_IN_SET (dest, i, bi)
{
+ if (to_clear != -1U)
+   {
+ bitmap_clear_bit (&dest->expressions, to_clear);
+ to_clear = -1U;
+   }
  pre_expr expr = expression_for_id (i);
  unsigned int value_id = get_expr_value_id (expr);
  if (!bitmap_bit_p (&dest->values, value_id))
-   bitmap_clear_bit (&dest->expressions, i);
+   to_clear = i;
}
-  bitmap_clear (&temp);
+  if (to_clear != -1U)
+   bitmap_clear_bit (&dest->expressions, to_clear);
 }
 }
 
@@ -862,18 +867,20 @@ bitmap_set_subtract_values (bitmap_set_t
 {
   unsigned int i;
   bitmap_iterator bi;
-  bitmap_head temp;
-
-  bitmap_initialize (&temp, &grand_bitmap_obstack);
-
-  bitmap_copy (&temp, &a->expressions);
-  EXECUTE_IF_SET_IN_BITMAP (&temp, 0, i, bi)
+  pre_expr to_remove = NULL;
+  FOR_EACH_EXPR_ID_IN_SET (a, i, bi)
 {
+  if (to_remove)
+   {
+ bitmap_remove_from_set (a, to_remove);
+ to_remove = NULL;
+   }
   pre_expr expr = expression_for_id (i);
   if (bitmap_set_contains_value (b, get_expr_value_id (expr)))
-   bitmap_remove_from_set (a, expr);
+   to_remove = expr;
 }
-  bitmap_clear (&temp);
+  if (to_remove)
+bitmap_remove_from_set (a, to_remove);
 }

Re: [Testsuite, committed] Fix vector peeling test failures

2017-05-08 Thread Richard Biener via gcc-patches

On Mon, May 8, 2017 at 2:41 PM, Wilco Dijkstra  wrote:
> This fixes a few failures on ARM and AArch64 due to a recent change in
> alignment peeling by switching the vector cost model off
> (https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00407.html).
>
> Tested on AArch64, ARM and x64 - committed as obvious.

Thanks.  Note that I'm not sure what -fno-vect-cost-model actually means when
peeling for alignment, so this fix might not "prevail".

Richard.

> ChangeLog:
> 2017-05-08  Wilco Dijkstra  
>
> * testsuite/gcc.dg/vect/vect-44.c: Add -fno-vect-cost-model.
> * gcc/testsuite/gcc.dg/vect/vect-50.c: Likewise.
> --
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c 
> b/gcc/testsuite/gcc.dg/vect/vect-44.c
> index 
> 186f9cfc9e26d6eb53514dec0fac176d696ec578..fbc593572429422c8e527c5e5559c515efd38aa6
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-44.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-44.c
> @@ -1,4 +1,5 @@
>  /* { dg-require-effective-target vect_float } */
> +/* { dg-options "-fno-vect-cost-model" } */
>
>  #include 
>  #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c 
> b/gcc/testsuite/gcc.dg/vect/vect-50.c
> index 
> 78bfd8d3920445fe51c7393a82870ea85f62bb55..0d5febc165ee3bf3b7b595237168d9d4b9604d4b
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-50.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-50.c
> @@ -1,4 +1,5 @@
>  /* { dg-require-effective-target vect_float } */
> +/* { dg-options "-fno-vect-cost-model" } */
>
>  #include 
>  #include "tree-vect.h"

Re: [PATCH] disable -Walloc-size-larger-than and -Wstringop-overflow for non-C front ends (PR 80545)

2017-05-08 Thread Martin Sebor


On 05/04/2017 10:13 PM, Jeff Law wrote:

On 04/28/2017 04:02 PM, Martin Sebor wrote:

The two options were included in -Wall and enabled for all front
ends but only made to be recognized by the driver for the C family
of compilers.  That made it impossible to suppress those warnings
when compiling code for those other front ends (like Fortran).

The attached patch adjusts the warnings so that they are only
enabled for the C family of front ends and not for any others,
as per Richard's suggestion.  (The other solution would have
been to make the warnings available to all front ends.  Since
non-C languages don't have a way of calling the affected
functions -- or do they? -- this is probably not necessary.)

Martin

gcc-80545.diff


PR driver/80545 - option -Wstringop-overflow not recognized by Fortran

gcc/c-family/ChangeLog:

PR driver/80545
* c.opt (-Walloc-size-larger-than, -Wstringop-overflow): Enable
and make available for the C family only.

OK.
jeff


It turns out that this is not the right fix.  I overlooked that
-Wstringop-overflow is meant to be enabled by default and while
removing the Init(2) bit and replacing it with LangEnabledBy (C
ObjC C++ ObjC++, Wall, 2, 0) suppresses the warning in Fortran
it also disables it by default in C/C++ unless -Wall is used.

By my reading of the Option properties part of the GCC Internals
manual there is no way to initialize a warning to on by default
while making it available only in a subset of languages.  The
only way I can think of is to initialize it in the .opt file to
something like -1 and then change it at some point to 2 somewhere
in the C/C++ front ends.  That seems pretty cumbersome.  Am I
missing some trick?

Martin

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 08:07 AM, Andreas Schwab wrote:

On Mai 05 2017, Nathan Sidwell  wrote:


This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
{get,set}_namespace_binding with get_namespace_value and set_global_value
respectively.


I'm getting this regression on ia64:

FAIL: g++.dg/template/virtual3.C  -std=c++11 (test for excess errors)


I'm not seeing this.  On an ia64-linux target (hosted on x86_64-linux) I 
get the attached output, which is the same as I get on an x86_64-linux 
native build.


For the failure mode you describe I'd expect many more test failures to 
occur.  I wonder if there's some uninitialized variable or something?


nathan

--
Nathan Sidwell
virtual3.C:8:8: error: deleted function 'virtual B::~B()'
 struct B : A<0>, A<1>  // { dg-error "deleted|context" }
^
virtual3.C:5:11: error: overriding non-deleted function 'A< >::~A() 
[with int  = 0]'
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^
virtual3.C:8:8: note: 'virtual B::~B()' is implicitly deleted because the 
default definition would be ill-formed:
 struct B : A<0>, A<1>  // { dg-error "deleted|context" }
^
virtual3.C:8:8: error: 'A< >::~A() [with int  = 0]' is 
private within this context
virtual3.C:5:11: note: declared private here
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^
virtual3.C:8:8: error: 'A< >::~A() [with int  = 1]' is 
private within this context
 struct B : A<0>, A<1>  // { dg-error "deleted|context" }
^
virtual3.C:5:11: note: declared private here
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^
virtual3.C:8:8: error: deleted function 'virtual B::~B()'
 struct B : A<0>, A<1>  // { dg-error "deleted|context" }
^
virtual3.C:5:11: error: overriding non-deleted function 'A< >::~A() 
[with int  = 1]'
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^
virtual3.C: In constructor 'B::B()':
virtual3.C:10:7: error: 'A< >::~A() [with int  = 0]' is 
private within this context
   B() {}   // { dg-error "context" }
   ^
virtual3.C:5:11: note: declared private here
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^
virtual3.C:10:7: error: 'A< >::~A() [with int  = 1]' is 
private within this context
   B() {}   // { dg-error "context" }
   ^
virtual3.C:5:11: note: declared private here
   virtual ~A();   // { dg-message "non-deleted|private" }
   ^

Re: [AArch64] Tighten move constraints for symbolic operands

2017-05-08 Thread Richard Earnshaw (lists)

On 31/03/17 09:34, Richard Sandiford wrote:
> The movsi and movdi constraints allowed the source to be any
> absolute symbolic expression ("S").  That's OK for operands that
> have already been vetted by the aarch64_mov_operand predicate but
> causes problems if the register allocator substitutes an equivalence
> (the usual "the constraints can't accept more than the predicates"
> restriction).
> 
> Although all other uses of "S" in the backend are redundant and could
> in principle be removed, "S" itself is a publicly-documented constraint
> and so we'd have to keep its definition.  This patch therefore adds a
> new "Usa" constraint for legitimate absolute address operands.
> 
> I saw this for a large testcase in which an equivalence was used
> for a label_ref jump table.  It's not something that can be cut
> down easily or that would give a robust regression test.
> 
> I don't think this is a regression, so maybe we don't want it for GCC 7.
> 
> Tested on aarch64-linux-gnu.  OK to install?
> 

OK.

R.

> Thanks,
> Richard
> 
> 
> gcc/
>   * config/aarch64/constraints.md (Usa): New constraint.
>   * config/aarch64/aarch64.md (*movsi_aarch64, *movdi_aarch64): Use it.
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 3717edf..260bd64 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1074,7 +1074,7 @@
>  
>  (define_insn_and_split "*movsi_aarch64"
>[(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r ,r,*w,m,  
> m,r,r  ,*w, r,*w")
> - (match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Dv,m, 
> m,rZ,*w,S,Ush,rZ,*w,*w"))]
> + (match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Dv,m, 
> m,rZ,*w,Usa,Ush,rZ,*w,*w"))]
>"(register_operand (operands[0], SImode)
>  || aarch64_reg_or_zero (operands[1], SImode))"
>"@
> @@ -1108,7 +1108,7 @@
>  
>  (define_insn_and_split "*movdi_aarch64"
>[(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r ,r,*w,m,  
> m,r,r,  *w, r,*w,w")
> - (match_operand:DI 1 "aarch64_mov_operand"  " r,r,k,N,n,Dv,m, 
> m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))]
> + (match_operand:DI 1 "aarch64_mov_operand"  " r,r,k,N,n,Dv,m, 
> m,rZ,*w,Usa,Ush,rZ,*w,*w,Dd"))]
>"(register_operand (operands[0], DImode)
>  || aarch64_reg_or_zero (operands[1], DImode))"
>"@
> diff --git a/gcc/config/aarch64/constraints.md 
> b/gcc/config/aarch64/constraints.md
> index b77e096..5852a42 100644
> --- a/gcc/config/aarch64/constraints.md
> +++ b/gcc/config/aarch64/constraints.md
> @@ -104,6 +104,14 @@
>(and (match_code "high")
> (match_test "aarch64_valid_symref (XEXP (op, 0), GET_MODE (XEXP (op, 
> 0)))")))
>  
> +(define_constraint "Usa"
> +  "@internal
> +   A constraint that matches an absolute symbolic address that can be
> +   loaded by a single ADR."
> +  (and (match_code "const,symbol_ref,label_ref")
> +   (match_test "aarch64_symbolic_address_p (op)")
> +   (match_test "aarch64_mov_operand_p (op, GET_MODE (op))")))
> +
>  (define_constraint "Uss"
>"@internal
>A constraint that matches an immediate shift constant in SImode."
>

[PATCH, GCC/ARM, FYI] Define TM_MULTILIB_CONFIG for ARM multilib

2017-05-08 Thread Thomas Preudhomme


Hi,

TM_MULTILIB_CONFIG is not set in config.gcc when building with multilib
for arm targets, leading to config/arm/t-multilib not including any of
the files (t-aprofile and t-rmprofile) definining the architecture and
FPU to build multilib for. This patch fixes that by setting
TM_MULTILIB_CONFIG to with_multilib_list's value after it has been
checked. It also fix a trailing whitespace issue.


ChangeLog entry is as follows:

*** gcc/ChangeLog **

2017-05-08  Thomas Preud'homme  

* config.gcc (arm*-*-*): Set TM_MULTILIB_CONFIG from
with_multilib_list after it has been checked.


Committing as obvious.

Best regards,

Thomas
diff --git a/gcc/config.gcc b/gcc/config.gcc
index fd7caebf0d155df0b25cc6a8374af57555707fc3..e8aaf2df9d709bf9d95f0a249c79f5813519d51f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3803,7 +3803,7 @@ case "${target}" in
 		;;
 	esac
 done
-			fi
+			fi
 
 			if test "x${tmake_profile_file}" != x ; then
 # arm/t-aprofile and arm/t-rmprofile are only
@@ -3820,6 +3820,7 @@ case "${target}" in
 fi
 
 tmake_file="${tmake_file} ${tmake_profile_file}"
+TM_MULTILIB_CONFIG="$with_multilib_list"
 			fi
 		fi
 		;;

Re: [PATCH 1/2] Automatic context save/restore for regular interrupts.

2017-05-08 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-05-05 13:02:43 
+0200]:

> The AUX_IRQ_CTRL register controls the behavior of automated register
> save and restore or prologue and epilogue sequences during a non-fast
> interrupt entry and exit, and context save and restore instructions.
> 
> A user passes to the compiler the configuration of the AUX_IRQ_CTRL
> register via mirq-ctrl-saved option.  This option, specifies
> gneral-purposes registers that the processor saves/restores on
> interrupt entry and exit, and it is only valid for ARC EM and ARC HS
> cores.
> 
> gcc/
> 2017-05-05  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (irq_ctrl_saved): New variable.
>   (ARC_AUTOBLINK_IRQ_P): Define.
>   (ARC_AUTOFP_IRQ_P): Likewise.
>   (ARC_AUTO_IRQ_P): Likewise.
>   (irq_range): New function.
>   (arc_must_save_register): Likewise.
>   (arc_must_save_return_addr): Likewise.
>   (arc_dwarf_emit_irq_save_regs): Likewise.
>   (arc_override_options): Handle deferred options.
>   (MUST_SAVE_REGISTER): Deleted, replaced by arc_must_save_register.
>   (MUST_SAVE_RETURN_ADDR): Deleted, replaced by
>   arc_must_save_return_addr.
>   (arc_compute_frame_size): Handle automated save and restore of
>   registers.
>   (arc_expand_prologue): Likewise.
>   (arc_expand_epilogue): Likewise.
>   * config/arc/arc.md (stack_irq_dwarf): New unspec instruction.
>   * config/arc/arc.opt (mirq-ctrl-saved): New option.
>   * doc/invoke.texi (mirq-ctrl-saved): Document option.
>   * testsuite/gcc.target/arc/interrupt-5.c: Newfile.
>   * testsuite/gcc.target/arc/interrupt-6.c: Likewise.
>   * testsuite/gcc.target/arc/interrupt-7.c: Likewise.
>   * testsuite/gcc.target/arc/interrupt-8.c: Likewise.
>   * testsuite/gcc.target/arc/interrupt-9.c: Likewise.
> ---
>  gcc/config/arc/arc.c   | 329 
> ++---
>  gcc/config/arc/arc.md  |   8 +
>  gcc/config/arc/arc.opt |   4 +
>  gcc/doc/invoke.texi|  11 +-
>  gcc/testsuite/gcc.target/arc/interrupt-5.c |  19 ++
>  gcc/testsuite/gcc.target/arc/interrupt-6.c |  22 ++
>  gcc/testsuite/gcc.target/arc/interrupt-7.c |  16 ++
>  gcc/testsuite/gcc.target/arc/interrupt-8.c |  27 +++
>  gcc/testsuite/gcc.target/arc/interrupt-9.c |  17 ++
>  9 files changed, 421 insertions(+), 32 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-5.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-6.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-7.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-8.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-9.c
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 4574481..a61faef 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "builtins.h"
>  #include "rtl-iter.h"
>  #include "alias.h"
> +#include "opts.h"
>  
>  /* Which cpu we're compiling for (ARC600, ARC601, ARC700).  */
>  static char arc_cpu_name[10] = "";
> @@ -111,6 +112,29 @@ struct GTY (()) arc_ccfsm
>int target_label;
>  };
>  
> +/* Status of the IRQ_CTRL_AUX register.  */
> +typedef struct irq_ctrl_saved_t
> +{
> +  short irq_save_last_reg;  /* Last register number used by
> +IRQ_CTRL_SAVED aux_reg.  */
> +  bool  irq_save_blink; /* True if BLINK is automatically
> +saved.  */
> +  bool  irq_save_lpcount;   /* True if LPCOUNT is automatically
> +saved.  */
> +} irq_ctrl_saved_t;

I'm pretty sure GNU style is to put the comments above each entry in
the struct, such as:

 /* Last register number used by IRQ_CTRL_SAVED aux_reg.  */
 short irq_save_last_reg;

I did have a quick look though GCC and couldn't see many examples of
comments to the right.

> +static irq_ctrl_saved_t irq_ctrl_saved;
> +
> +#define ARC_AUTOBLINK_IRQ_P(FNTYPE)  \
> +  (ARC_INTERRUPT_P (FNTYPE) && irq_ctrl_saved.irq_save_blink)
> +
> +#define ARC_AUTOFP_IRQ_P(FNTYPE) \
> +  (ARC_INTERRUPT_P (FNTYPE) && (irq_ctrl_saved.irq_save_last_reg > 26))
> +
> +#define ARC_AUTO_IRQ_P(FNTYPE)   \
> +  (ARC_INTERRUPT_P (FNTYPE)  \
> +   && (irq_ctrl_saved.irq_save_blink \
> +   || (irq_ctrl_saved.irq_save_last_reg >= 0)))
> +
>  #define arc_ccfsm_current cfun->machine->ccfsm_current
>  
>  #define ARC_CCFSM_BRANCH_DELETED_P(STATE) \
> @@ -806,11 +830,110 @@ arc_init (void)
>  }
>  }
>  
> +/* Parse -mirq-ctrl-saved= option string.  Registers may be specified
> +   individually, or as ranges such as "r0-r3".  Registers accepted are
> +   r0 through r31 and lp_count.  Registers and ranges must be
> +   comma-separated.  */

The comment seem

[PATCH, rs6000] Add x86 instrinsic headers to GCC PPC64LE taget

2017-05-08 Thread Steven Munroe

A common issue in porting applications and packages is that someone may
have forgotten that there is more than one hardware platform. 

A specific example is applications using Intel x86 intrinsic functions
without appropriate conditional compile guards. Another example is a
developer tasked to port a large volume of code containing important
functions "optimized" with Intel x86 intrinsics, but without the skill
or time to perform the same optimization for another platform. Often the
developer who wrote the original optimization has moved on and those
left to maintain the application / package lack understanding of the
original x86 intrinsic code or design.

For PowerPC this can be acute especially for HPC vector SIMD codes. The
PowerISA (as implemented for POWER and OpenPOWER servers) has extensive
vector hardware facilities and GCC proves a large set of vector
intrinsics. Thus I would like to restrict this support to PowerPC
targets that support VMX/VSX and PowerISA-2.07 (power8) and later.

But the difference in (intrinsic) spelling alone is enough stop many
application developers in their tracks.

So I propose to submit a series of patches to implement the PowerPC64LE
equivalent of a useful subset of the x86 intrinsics. The final size and
usefulness of this effort is to be determined. The proposal is to
incrementally port intrinsic header files from the ./config/i386 tree to
the ./config/rs6000 tree. This naturally provides the same header
structure and intrinsic names which will simplify code porting.

It seems natural to work from the bottom (oldest) up. For example
starting with mmintrin.h and working our way up the following headers:

smmintrin.h(SSE4.1)  includes tmmintrin,h
tmmintrin.h(SSSE3)   includes pmmintrin.h
pmmintrin.h(SSE3)includes emmintrin,h
emmintrin.h(SSE2)includes xmmintrin.h
xmmintrin.h(SSE) includes mmintrin.h and mm_malloc.h
mmintrin.h (MMX)

There is a smattering of non-vector intrinsics in common use.
Like the Bit Manipulation Instructions (BMI & BMI2).

bmiintrin.h
bmi2intrin.h
x86intrin.h (collector includes BMI headers and many others)

The older intrinsic (BMI/MMX/SSE) instructions have been integrated into
GCC and many of the intrinsic implementations are simple C code or GCC
built-ins. The remaining intrinsic functions are implemented as platform
specific builtins (__builtin_ia32_*) and need to be mapped to equivalent
PowerPC builtin or vector intrinsic from altivec.h is required.

Of course as part of this process we will port as many of the
corresponding DejaGnu tests from gcc/testsuite/gcc.target/i386/ to
gcc/testsuite/gcc.target/powerpc/ as appropriate. So far the dg-do run
tests only require minor source changes, mostly to the platform specific
dg-* directives. A few dg-do compile tests are needed to insure we are
getting the expected folding/Common subexpression elimination (CSE) to
generate the optimum sequence for PowerPC.

To get the ball rolling I include the BMI intrinsics ported to PowerPC
for review as they are reasonable size (31 intrinsic implementations).

[gcc]

2017-05-04  Steven Munroe  

* config.gcc (powerpc*-*-*): Add bmi2intrin.h, bmiintrin.h,
and x86intrin.h
* config/rs6000/bmiintrin.h: New file.
* config/rs6000/bmi2intrin.h: New file.
* config/rs6000/x86intrin.h: New file.

[gcc/testsuite]

2017-05-04  Steven Munroe  

* gcc.target/powerpc/bmi-andn-1.c: New file
* gcc.target/powerpc/bmi-andn-2.c: New file.
* gcc.target/powerpc/bmi-bextr-1.c: New file.
* gcc.target/powerpc/bmi-bextr-2.c: New file.
* gcc.target/powerpc/bmi-bextr-4.c: New file.
* gcc.target/powerpc/bmi-bextr-5.c: New file.
* gcc.target/powerpc/bmi-blsi-1.c: New file.
* gcc.target/powerpc/bmi-blsi-2.c: New file.
* gcc.target/powerpc/bmi-blsmsk-1.c: new file.
* gcc.target/powerpc/bmi-blsmsk-2.c: New file.
* gcc.target/powerpc/bmi-blsr-1.c: New file.
* gcc.target/powerpc/bmi-blsr-2.c: New File.
* gcc.target/powerpc/bmi-check.h: New File.
* gcc.target/powerpc/bmi-tzcnt-1.c: new file.
* gcc.target/powerpc/bmi-tzcnt-2.c: New file.
* gcc.target/powerpc/bmi2-bzhi32-1.c: New file.
* gcc.target/powerpc/bmi2-bzhi64-1.c: New file.
* gcc.target/powerpc/bmi2-bzhi64-1a.c: New file.
* gcc.target/powerpc/bmi2-check.h: New file.
* gcc.target/powerpc/bmi2-mulx32-1.c: New file.
* gcc.target/powerpc/bmi2-mulx32-2.c: New file.
* gcc.target/powerpc/bmi2-mulx64-1.c: New file.
* gcc.target/powerpc/bmi2-mulx64-2.c: New file.
* gcc.target/powerpc/bmi2-pdep32-1.c: New file.
* gcc.target/powerpc/bmi2-pdep64-1.c: New file.
* gcc.target/powerpc/bmi2-pext32-1.c: New File.
* gcc.target/powerpc/bmi2-pext64-1.c: New file.
* gcc.target/powerpc/bmi2-pext64-1a.c: New File.

Index: gcc/testsuite/gcc.target/powerpc/bmi-blsi-

Re: [PATCH 2/2] Fast interrupts support.

2017-05-08 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-05-05 13:02:44 
+0200]:

> When a processor enters a fast interrupts handler, and duplicate
> register banks are configured, the processor saves the user context by
> saving the registers in the main register bank to these additional
> registers in the duplicate register bank.  In this fast interrupt
> context, when you specify the rgf_banked_regs option,the compiler does
> not save the registers duplicated in the additional register bank are
> not saved.
> 
> gcc/
> 2016-10-04  Claudiu Zissulescu  
>   Andrew Burgess  
> 
>   * config/arc/arc.c (ARC_AUTOBLINK_IRQ_P): Consider fast interrupts
>   case also.
>   (ARC_AUTOFP_IRQ_P): Likewise.
>   (ARC_AUTO_IRQ_P): Likewise.
>   (rgf_banked_register_count): New variable.
>   (parse_mrgf_banked_regs_option): New function.
>   (arc_override_options): Handle rgf_banked_regs option.
>   (arc_handle_interrupt_attribute): Add firq option.
>   (arc_compute_function_type): Return fast irq type when required.
>   (arc_must_save_register): Handle fast interrupts.
>   (arc_expand_prologue): Do not emit dwarf info for fast interrupts.
>   (arc_return_address_regs): Update.
>   * config/arc/arc.h (arc_return_address_regs): Update.
>   (arc_function_type): Add fast interrupt type.
>   (ARC_INTERRUPT_P): Update.
>   (RC_FAST_INTERRUPT_P): Define.
>   * config/arc/arc.md (simple_return): Update for fast interrupts.
>   (p_return_i): Likewise.
>   * config/arc/arc.opt (mrgf-banked-regs): New option.
>   * doc/invoke.texi (mrgf-banked-regs): Document.
>   * testsuite/gcc.target/arc/firq-1.c: New file.
>   * testsuite/gcc.target/arc/firq-2.c: Likewise.
>   * testsuite/gcc.target/arc/firq-3.c: Likewise.
>   * testsuite/gcc.target/arc/firq-4.c: Likewise.
>   * testsuite/gcc.target/arc/firq-5.c: Likewise.
>   * testsuite/gcc.target/arc/firq-6.c: Likewise.

Looks fine to me,

Thanks,
Andrew



> ---
>  gcc/config/arc/arc.c  | 106 
> +++---
>  gcc/config/arc/arc.h  |  13 +++--
>  gcc/config/arc/arc.md |   9 ++-
>  gcc/config/arc/arc.opt|   4 ++
>  gcc/doc/invoke.texi   |  10 
>  gcc/testsuite/gcc.target/arc/firq-1.c |  27 +
>  gcc/testsuite/gcc.target/arc/firq-2.c |  31 ++
>  gcc/testsuite/gcc.target/arc/firq-3.c |  40 +
>  gcc/testsuite/gcc.target/arc/firq-4.c |  31 ++
>  gcc/testsuite/gcc.target/arc/firq-5.c |  15 +
>  gcc/testsuite/gcc.target/arc/firq-6.c |  21 +++
>  11 files changed, 277 insertions(+), 30 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-4.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-5.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/firq-6.c
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index a61faef..a0cd597 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -125,16 +125,25 @@ typedef struct irq_ctrl_saved_t
>  static irq_ctrl_saved_t irq_ctrl_saved;
>  
>  #define ARC_AUTOBLINK_IRQ_P(FNTYPE)  \
> -  (ARC_INTERRUPT_P (FNTYPE) && irq_ctrl_saved.irq_save_blink)
> -
> -#define ARC_AUTOFP_IRQ_P(FNTYPE) \
> -  (ARC_INTERRUPT_P (FNTYPE) && (irq_ctrl_saved.irq_save_last_reg > 26))
> -
> -#define ARC_AUTO_IRQ_P(FNTYPE)   \
> -  (ARC_INTERRUPT_P (FNTYPE)  \
> -   && (irq_ctrl_saved.irq_save_blink \
> +  ((ARC_INTERRUPT_P (FNTYPE) \
> +&& irq_ctrl_saved.irq_save_blink)\
> +   || (ARC_FAST_INTERRUPT_P (FNTYPE) \
> +   && rgf_banked_register_count > 8))
> +
> +#define ARC_AUTOFP_IRQ_P(FNTYPE) \
> +  ((ARC_INTERRUPT_P (FNTYPE) \
> +&& (irq_ctrl_saved.irq_save_last_reg > 26))  \
> +  || (ARC_FAST_INTERRUPT_P (FNTYPE)  \
> +  && rgf_banked_register_count > 8))
> +
> +#define ARC_AUTO_IRQ_P(FNTYPE)   \
> +  (ARC_INTERRUPT_P (FNTYPE) && !ARC_FAST_INTERRUPT_P (FNTYPE)\
> +   && (irq_ctrl_saved.irq_save_blink \
> || (irq_ctrl_saved.irq_save_last_reg >= 0)))
>  
> +/* Number of registers in second bank for FIRQ support.  */
> +static int rgf_banked_register_count;
> +
>  #define arc_ccfsm_current cfun->machine->ccfsm_current
>  
>  #define ARC_CCFSM_BRANCH_DELETED_P(STATE) \
> @@ -924,6 +933,27 @@ irq_range (const char *cstr)
>irq_ctrl_saved.irq_save_lpcount  = (lpcount == 60);
>  }
>  
> +/* Parse -mrgf-banked-regs=NUM option strin

Re: [PATCH 3/3] [ARC] Add support for advanced mpy/mac instructions.

2017-05-08 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-04-25 15:03:44 
+0200]:

> Add support for MAC and DMPY instructions. ARCv2 Only.
> 
> gcc/
> 2016-12-08  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (arc_conditional_register_usage): Handle ACCL,
>   ACCH registers.
>   * config/arc/arc.md (mulsidi3): Use advanced mpy instructions when
>   available.
>   (umulsidi3): Likewise.
>   (mulsidi3_700): Disable this pattern when we have advanced mpy
>   instructions.
>   (umulsidi3_700): Likewise.
>   (maddsidi4): New pattern.
>   (macd, mac, mac_r, umaddsidi4, macdu, macu, macu_r): Likewise.
>   (mpyd_arcv2hs, mpyd_imm_arcv2hs, mpydu_arcv2hs): Likewise.
>   (mpydu_imm_arcv2hs): Likewise.
>   * config/arc/predicates.md (accl_operand): New predicate.

Looks good, but is it not possible to write some tests?

Thanks,
Andrew




> ---
>  gcc/config/arc/arc.c |   5 +-
>  gcc/config/arc/arc.md| 266 
> ++-
>  gcc/config/arc/predicates.md |   5 +
>  3 files changed, 272 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 170b8dd..238244b 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -1789,8 +1789,9 @@ arc_conditional_register_usage (void)
>arc_regno_reg_class[PROGRAM_COUNTER_REGNO] = GENERAL_REGS;
>  
>/*ARCV2 Accumulator.  */
> -  if (TARGET_V2
> -  && (TARGET_FP_DP_FUSED || TARGET_FP_SP_FUSED))
> +  if ((TARGET_V2
> +   && (TARGET_FP_DP_FUSED || TARGET_FP_SP_FUSED))
> +  || TARGET_PLUS_DMPY)
>{
>  arc_regno_reg_class[ACCL_REGNO] = WRITABLE_CORE_REGS;
>  arc_regno_reg_class[ACCH_REGNO] = WRITABLE_CORE_REGS;
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index ffab390..db5867c 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -2157,6 +2157,18 @@
>"TARGET_ANY_MPY"
>  "
>  {
> +  if (TARGET_PLUS_MACD)
> +{
> + if (CONST_INT_P (operands[2]))
> +   {
> + emit_insn (gen_mpyd_imm_arcv2hs (operands[0], operands[1], 
> operands[2]));
> +   }
> + else
> +   {
> + emit_insn (gen_mpyd_arcv2hs (operands[0], operands[1], operands[2]));
> +   }
> + DONE;
> +}
>if (TARGET_MPY)
>  {
>operands[2] = force_reg (SImode, operands[2]);
> @@ -2237,7 +2249,7 @@
>[(set (match_operand:DI 0 "register_operand" "=&r")
>   (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%c"))
>(sign_extend:DI (match_operand:SI 2 "extend_operand" "cL"]
> -  "TARGET_MPY"
> +  "TARGET_MPY && !TARGET_PLUS_MACD"
>"#"
>"&& reload_completed"
>[(const_int 0)]
> @@ -2390,6 +2402,18 @@
>(zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""]
>""
>  {
> +  if (TARGET_PLUS_MACD)
> +{
> + if (CONST_INT_P (operands[2]))
> +   {
> + emit_insn (gen_mpydu_imm_arcv2hs (operands[0], operands[1], 
> operands[2]));
> +   }
> + else
> +   {
> + emit_insn (gen_mpydu_arcv2hs (operands[0], operands[1], operands[2]));
> +   }
> + DONE;
> +}
>if (TARGET_MPY)
>  {
>operands[2] = force_reg (SImode, operands[2]);
> @@ -2480,7 +2504,7 @@
>[(set (match_operand:DI 0 "dest_reg_operand" "=&r")
>   (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%c"))
>(zero_extend:DI (match_operand:SI 2 "extend_operand" "cL"]
> -  "TARGET_MPY"
> +  "TARGET_MPY && !TARGET_PLUS_MACD"
>"#"
>"reload_completed"
>[(const_int 0)]
> @@ -6215,6 +6239,244 @@
>""
>[(set_attr "length" "0")])
>  
> +;; MAC and DMPY instructions
> +(define_insn_and_split "maddsidi4"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (plus:DI
> +  (mult:DI
> +   (sign_extend:DI (match_operand:SI 1 "register_operand" "%r"))
> +   (sign_extend:DI (match_operand:SI 2 "extend_operand" "ri")))
> +  (match_operand:DI 3 "register_operand" "r")))]
> +  "TARGET_PLUS_DMPY"
> +  "#"
> +  "TARGET_PLUS_DMPY && reload_completed"
> +  [(const_int 0)]
> +  "{
> +   rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST);
> +   emit_move_insn (acc_reg, operands[3]);
> +   if (TARGET_PLUS_MACD)
> + emit_insn (gen_macd (operands[0], operands[1], operands[2]));
> +   else
> + {
> +  emit_insn (gen_mac (operands[1], operands[2]));
> +  emit_move_insn (operands[0], acc_reg);
> + }
> +   DONE;
> +   }"
> +  [(set_attr "type" "multi")
> +   (set_attr "length" "36")])
> +
> +(define_insn "macd"
> +  [(set (match_operand:DI 0 "even_register_operand" "=Rcr,r,r")
> + (plus:DI
> +  (mult:DI
> +   (sign_extend:DI (match_operand:SI 1 "register_operand" "%0,c,c"))
> +   (sign_extend:DI (match_operand:SI 2 "extend_operand" " c,cI,Cal")))
> +  (reg:DI ARCV2_ACC)))
> +   (set (reg:DI ARCV2_ACC)
> + (plus:DI
> +  (mult:DI (sign_extend:DI (match_dup 1))
> +   (sign_extend:DI (match_dup 2)))
> +  (

[PATCH] Clarify mt_allocator documentation w.r.t deallocation

2017-05-08 Thread Jonathan Wakely


The interwebs seem confused by these docs, and think it means that
mt_allocator just leaks memory instead of deallocating it. The
interwebs are pretty dumb sometimes.

* doc/xml/manual/mt_allocator.xml: Clarify deallocation behaviour.
* doc/html/*: Regenerate.

Committed to trunk.

commit 4d46264542fef99c1eaf8d10e996c741f109ba44
Author: Jonathan Wakely 
Date:   Mon May 8 16:04:53 2017 +0100

Clarify mt_allocator documentation w.r.t deallocation

* doc/xml/manual/mt_allocator.xml: Clarify deallocation behaviour.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/mt_allocator.xml 
b/libstdc++-v3/doc/xml/manual/mt_allocator.xml
index 12fe2ee..3254bf8 100644
--- a/libstdc++-v3/doc/xml/manual/mt_allocator.xml
+++ b/libstdc++-v3/doc/xml/manual/mt_allocator.xml
@@ -281,7 +281,8 @@ The _S_initialize() function:
 
 
  Notes about deallocation. This allocator does not explicitly
-release memory. Because of this, memory debugging programs like
+release memory back to the OS, but keeps its own freelists instead.
+Because of this, memory debugging programs like
 valgrind or purify may notice leaks: sorry about this
 inconvenience. Operating systems will reclaim allocated memory at
 program termination anyway. If sidestepping this kind of noise is

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Andreas Schwab

On Mai 08 2017, Nathan Sidwell  wrote:

> On 05/08/2017 08:07 AM, Andreas Schwab wrote:
>> On Mai 05 2017, Nathan Sidwell  wrote:
>>
>>> This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
>>> {get,set}_namespace_binding with get_namespace_value and set_global_value
>>> respectively.
>>
>> I'm getting this regression on ia64:
>>
>> FAIL: g++.dg/template/virtual3.C  -std=c++11 (test for excess errors)
>
> I'm not seeing this.  On an ia64-linux target (hosted on x86_64-linux) I
> get the attached output, which is the same as I get on an x86_64-linux
> native build.

g++.dg/cpp0x/defaulted34.C has the same problem.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

MinGW compilation warnings in libiberty's include/environ.h

2017-05-08 Thread Eli Zaretskii

When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I
see the following warning:

 gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. 
-I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes 
-pedantic  -D_GNU_SOURCE ./setenv.c -o setenv.o
 In file included from ./setenv.c:64:0:
 ./../include/environ.h:30:1: warning: function declaration isn't a 
prototype [-Wstrict-prototypes]
  extern char **environ;
  ^

This was already reported 4 years ago, here:

  https://gcc.gnu.org/ml/gcc-patches/2013-03/msg00471.html

and was solved back then.  But it looks like the offending code was
copied to include/environ.h without the fix, and the warning is thus
back.

The problem is with this code in environ.h:

  #ifndef HAVE_ENVIRON_DECL
  #  ifdef __APPLE__
  # include 
  # define environ (*_NSGetEnviron ())
  #  else
  extern char **environ;
  #  endif
  #  define HAVE_ENVIRON_DECL
  #endif

which causes the MinGW compiler to see the declaration of environ,
whereas MinGW's stdlib.h has this:

  #ifdef __MSVCRT__
  # define _environ  (*__p__environ())
  extern _CRTIMP __cdecl __MINGW_NOTHROW  char ***__p__environ(void);
  # define _wenviron  (*__p__wenviron())
  extern _CRTIMP __cdecl __MINGW_NOTHROW  wchar_t ***__p__wenviron(void);

  #else  /* ! __MSVCRT__ */
  # ifndef __DECLSPEC_SUPPORTED
  # define _environ (*_imp___environ_dll)
  extern char ***_imp___environ_dll;

  # else  /* __DECLSPEC_SUPPORTED */
  # define _environ  _environ_dll
  __MINGW_IMPORT char ** _environ_dll;
  # endif  /* __DECLSPEC_SUPPORTED */
  #endif  /* ! __MSVCRT__ */

  #define environ _environ

Can this be fixed again, please?  The solution, as back then, is this:

  #ifndef HAVE_ENVIRON_DECL
  #  ifdef __APPLE__
  # include 
  # define environ (*_NSGetEnviron ())
  #  else
  # ifndef environ
  extern char **environ;
  # endif
  #  endif
  #  define HAVE_ENVIRON_DECL
  #endif

Thanks.

MinGW compilation warnings in libiberty's waitpid.c

2017-05-08 Thread Eli Zaretskii

When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I
see the following warning:

 gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. 
-I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes 
-pedantic  -D_GNU_SOURCE ./waitpid.c -o waitpid.o
 ./waitpid.c: In function 'waitpid':
 ./waitpid.c:31:18: warning: implicit declaration of function 'wait' 
[-Wimplicit-function-declaration]
int wpid = wait(stat_loc);
   ^

The file waitpid.c should not be built on MinGW, as it is not needed
on Windows, and will not work if the function is called (because
there's no 'wait' function on MS-Windows).

MinGW compilation warnings in libiberty's xstrndup.c

2017-05-08 Thread Eli Zaretskii

When compiling libiberty (as part of GDB) with MinGW on MS-Windows, I
see the following warning:

 gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. 
-I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes 
-pedantic  -D_GNU_SOURCE ./xstrndup.c -o xstrndup.o
 ./xstrndup.c: In function 'xstrndup':
 ./xstrndup.c:51:16: warning: implicit declaration of function 'strnlen' 
[-Wimplicit-function-declaration]
size_t len = strnlen (s, n);
 ^

This happens because libiberty.h uses incorrect guards for the
prototype of strnlen:

  #if defined (HAVE_DECL_STRNLEN) && !HAVE_DECL_STRNLEN
  extern size_t strnlen (const char *, size_t);
  #endif

It should use HAVE_STRNLEN instead, because that's the only
strnlen-related macro defined in config.g when strnlen is probed by
the configure script.

Thanks.

[CHKP] Attempt to fix PR79765 (multiversioning and instrumentation)

2017-05-08 Thread Alexander Ivchenko

Hi,

I'm trying to fix the problem with function multiversioning and MPX
instrumentation (PR79765) and I face several issues. I would
appreciate your advice:


The first problem that arises is that multiversioning tries to make
versions out of thunks, which do not have bodies. This is fixed with
this patch:

diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
index c7fc3a0..5de11b3 100644
--- a/gcc/ipa-chkp.c
+++ b/gcc/ipa-chkp.c
@@ -718,6 +718,9 @@ chkp_produce_thunks (bool early)
 0, CGRAPH_FREQ_BASE);
  node->create_reference (node->instrumented_version,
   IPA_REF_CHKP, NULL);
+ DECL_ATTRIBUTES (node->decl)
+   = remove_attribute ("target_clones",
+   DECL_ATTRIBUTES (node->decl));
  /* Thunk shouldn't be a cdtor.  */
  DECL_STATIC_CONSTRUCTOR (node->decl) = 0;
  DECL_STATIC_DESTRUCTOR (node->decl) = 0;

However, that is not all that is needed to be fixed. Consider case:

 __attribute__((target_clones("arch=core-avx2", "default")))
int test_fun()
 {
  return test_fun();
 }

Now, in multiple_target.c: create_dispatcher_calls  we are replacing
the call to test_fun with a call to the ifunc resolvler:

(rr) p debug_function (node->decl, TDF_VOPS)
__attribute__((target ("default"), target_clones ("arch=core-avx2",
"default")))
test_fun ()
{
   [100.00%]:
  _2 = test_fun ();
  # VUSE <.MEM_1(D)>
  return _2;
}

(rr) p e
$103 =
(cgraph_edge *) 0x7f658d121340
  (rr) p e->callee

 $104 = 
(rr) p e->caller

 $105 = 
(rr) p inode
$106 =


We do..

 e->redirect_callee (inode);

e->redirect_call_stmt_to_callee ();

And after we have:

(rr) p debug_function (node->decl, TDF_VOPS)

__attribute__((target ("default"), target_clones ("arch=core-avx2",
"default")))   test_fun ()

 {
   
[100.00%]:
  # .MEM = VDEF <.MEM>
   _2 = test_fun.ifunc ();
  # VUSE <.MEM_1(D)>
 return _2;

  }

For MPX-instrumented code the outcome of the same procedure is different:

(rr) p e->caller
$5 = 
(rr) p e->callee
$6 = 
(rr) p inode
$7 = 

(rr) p debug_function (node->decl, TDF_VOPS)
__attribute__((target ("default"), chkp instrumented, target_clones
("arch=core-avx2", "default")))
test_fun.chkp ()
{
   [100.00%]:
  _2 = test_fun.chkp ();
  # VUSE <.MEM_1(D)>
  return _2;
}

-- We don't have   # .MEM = VDEF <.MEM>  part and hence, we hit
internal compiler error in verify_ssa, which fails. The reason for
that is that in  redirect_call_stmt_to_callee for instrumented code
the path is different and update_stmt_fn is not called:

if (e->callee->clone.combined_args_to_skip
 || skip_bounds)
{..}
else
{
..
update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt);
}

Calling this function after if-then-else fixes the issue:

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index e505b10..3dde20d 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1527,8 +1527,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 {
   new_stmt = e->call_stmt;
   gimple_call_set_fndecl (new_stmt, e->callee->decl);
-  update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt);
 }
+  update_stmt_fn (DECL_STRUCT_FUNCTION (e->caller->decl), new_stmt);

   /* If changing the call to __cxa_pure_virtual or similar noreturn function,
  adjust gimple_call_fntype too.  */

This call makes sense to me, but is it a correct thing to do?

And finally, the third point. In the same if-then-else, in the "if"
part there is a call to "gsi_replace", which uses "cfun", but
targetclone is ipa pass and hence has cfun defined to NULL. So, is it
a bug at all to use "gsi_replace" in ipa pass?

thanks in advance,
Alexander

Re: [PATCH, RFC] warn about raw pointers that can be unique_ptr

2017-05-08 Thread Joseph Myers

Note that the new option will need documenting in invoke.texi for any 
patch version actually proposed for inclusion.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 08:56 AM, Jason Merrill wrote:

On Fri, May 5, 2017 at 4:07 PM, Nathan Sidwell  wrote:

This cleanup patch kills IDENTIFIER_NAMESPACE_VALUE and replaces
{get,set}_namespace_binding with get_namespace_value and set_global_value
respectively.


I'd prefer to stick with the _binding naming, as "value" is more ambiguous.


I've reverted that bit of the change.

nathan

--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

	Revert _binding -> _value change.
	* name-lookup.h (get_namespace_value, set_global_value): Rename to ...
	(get_namespace_binding, set_global_binding): ... these.
	* name-lookup.c (get_namespace_value, set_global_value): Rename to ...
	(get_namespace_binding, set_global_binding): ... these.
	(arg_assoc_namespace, pushdecl_maybe_friend_1,
	check_for_out_of_scope_variable, push_overloaded_decl_1,
	lookup_name_innermost_nonclass_level, push_namespace): Adjust.
	* cp-tree.h (IDENTIFIER_GLOBAL_VALUE,
	SET_IDENTIFIER_GLOBAL_VALUE): Adjust.
	* decl.c (poplevel): Adjust.
	* pt.c (make_constrained_auto): Likewise.

Index: cp-tree.h
===
--- cp-tree.h	(revision 247739)
+++ cp-tree.h	(working copy)
@@ -554,9 +554,9 @@ struct GTY(()) ptrmem_cst {
 typedef struct ptrmem_cst * ptrmem_cst_t;
 
 #define IDENTIFIER_GLOBAL_VALUE(NODE) \
-  get_namespace_value (NULL_TREE, (NODE))
+  get_namespace_binding (NULL_TREE, (NODE))
 #define SET_IDENTIFIER_GLOBAL_VALUE(NODE, VAL) \
-  set_global_value ((NODE), (VAL))
+  set_global_binding ((NODE), (VAL))
 
 #define CLEANUP_P(NODE)		TREE_LANG_FLAG_0 (TRY_BLOCK_CHECK (NODE))
 
Index: decl.c
===
--- decl.c	(revision 247739)
+++ decl.c	(working copy)
@@ -693,7 +693,7 @@ poplevel (int keep, int reverse, int fun
 	   /*class_p=*/true);
 	  tree ns_binding = NULL_TREE;
 	  if (!ob)
-	ns_binding = get_namespace_value (current_namespace, name);
+	ns_binding = get_namespace_binding (current_namespace, name);
 
 	  if (ob && ob->scope == current_binding_level->level_chain)
 	/* We have something like:
Index: name-lookup.c
===
--- name-lookup.c	(revision 247739)
+++ name-lookup.c	(working copy)
@@ -215,7 +215,7 @@ arg_assoc_namespace (struct arg_lookup *
   if (arg_assoc_namespace (k, TREE_PURPOSE (value)))
 	return true;
 
-  value = get_namespace_value (scope, k->name);
+  value = get_namespace_binding (scope, k->name);
   if (!value)
 return false;
 
@@ -1250,7 +1250,7 @@ pushdecl_maybe_friend_1 (tree x, bool is
   /* In case this decl was explicitly namespace-qualified, look it
 	 up in its namespace context.  */
   if (DECL_NAMESPACE_SCOPE_P (x) && namespace_bindings_p ())
-	t = get_namespace_value (DECL_CONTEXT (x), name);
+	t = get_namespace_binding (DECL_CONTEXT (x), name);
   else
 	t = lookup_name_innermost_nonclass_level (name);
 
@@ -1267,7 +1267,7 @@ pushdecl_maybe_friend_1 (tree x, bool is
 	  t = innermost_non_namespace_value (name);
 	  /* Or in the innermost namespace.  */
 	  if (! t)
-	t = get_namespace_value (DECL_CONTEXT (x), name);
+	t = get_namespace_binding (DECL_CONTEXT (x), name);
 	  /* Does it have linkage?  Note that if this isn't a DECL, it's an
 	 OVERLOAD, which is OK.  */
 	  if (t && DECL_P (t) && ! (TREE_STATIC (t) || DECL_EXTERNAL (t)))
@@ -1521,7 +1521,7 @@ pushdecl_maybe_friend_1 (tree x, bool is
 	{
 	  tree decl;
 
-	  decl = get_namespace_value (current_namespace, name);
+	  decl = get_namespace_binding (current_namespace, name);
 	  if (decl && TREE_CODE (decl) == OVERLOAD)
 	decl = OVL_FUNCTION (decl);
 
@@ -1568,7 +1568,7 @@ pushdecl_maybe_friend_1 (tree x, bool is
   else
 	{
 	  /* Here to install a non-global value.  */
-	  tree oldglobal = get_namespace_value (current_namespace, name);
+	  tree oldglobal = get_namespace_binding (current_namespace, name);
 	  tree oldlocal = NULL_TREE;
 	  cp_binding_level *oldscope = NULL;
 	  cxx_binding *oldbinding = outer_binding (name, NULL, true);
@@ -1608,7 +1608,7 @@ pushdecl_maybe_friend_1 (tree x, bool is
 
 	  if (oldlocal == NULL_TREE)
 		oldlocal
-		  = get_namespace_value (current_namespace, DECL_NAME (d));
+		  = get_namespace_binding (current_namespace, DECL_NAME (d));
 	}
 
 	  /* If this is an extern function declaration, see if we
@@ -1972,7 +1972,7 @@ check_for_out_of_scope_variable (tree de
 shadowed = DECL_HAS_SHADOWED_FOR_VAR_P (shadowed)
   ? DECL_SHADOWED_FOR_VAR (shadowed) : NULL_TREE;
   if (!shadowed)
-shadowed = get_namespace_value (current_namespace, DECL_NAME (decl));
+shadowed = get_namespace_binding (current_namespace, DECL_NAME (decl));
   if (shadowed)
 {
   if (!DECL_ERROR_REPORTED (decl))
@@ -2934,7 +2934,7 @@ push_overloaded_decl_1 (tree decl, int f
   int doing_global = (namespace_bindings_p () || !(flags & PUSH_LOCAL));
 
   if (doing_global)
-old = get_namespace_value (DECL_

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-08 Thread Jeff Law


On 05/08/2017 01:25 AM, Richard Biener via gcc-patches wrote:



ps. An astute observer might note that improving the effectiveness of VRP
jump threading seems counterproductive since I've stated I want to remove
VRP jump threading.  These improvements don't significantly change how I was
planning to do that or the infrastructure we're going to rely upon to make
that change.  All that changes is where we get the information from --
ASSERT_EXPRs vs querying a new API that generates the same information as
needed.


That API is already there.  It's called register_edge_assert_for (it might need
a wrapper to be most useful and also needs to be exported / moved from VRP
to somewhere else).  Both VRP and EVRP use it to "compute" ASSERT_EXPRs
As I mentioned in an earlier message, this has a lot of similarity with 
what Andrew has been doing.  Furthermore, parts of this mimick code I've 
written in DOM and code we need for the backward threader to make it 
strong enough to eliminate the VRP threading code.  I'd really like to 
be working on unifying all that work within the next few days (as 
originally hoping for today, but slightly side-tracked last week).





So I'm not sure if changing VRP with your patches is a good thing when you
could have used the new API in the first place ...
I don't see that the changes to date around 78496 change things 
significantly in regards to the immediate plans to remove ASSERT_EXPR. 
The work around 78496 raises the bar in terms of what information we 
need to be able to extract, but that IMHO, is a fine thing to do ;-)



However, the existence of register_edge_assert_for does change how I'm 
looking at the next issue for 78496 as well as how to tackle a host of 
related issues.  It may end up being the case that we stop 78496 work 
after patch #2, work on ASSERT_EXPR removal, then re-eval 78496.


Jeff

Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-08 Thread Jeff Law


On 05/08/2017 01:32 AM, Richard Biener wrote:


Note that I tried last stage3 (it ended up being too late) to get rid
of ASSERT_EXPRs
doing substitute-and-fold itself (basically copy-propagate them out at
this point rather
than as a separate thing later).  This is because the ASSERT_EXPR uses interfere
with the single_use checks in match.pd patterns and thus are actually harmful.

The barrier I ran into was the ASSERT_EXPR use by the threader ... so
now you're making
us rely even more on those :/
Perhaps, but it's a short term thing -- Andrew and I want to get rid of 
ASSERT_EXPRs too.  I really think we all share that common goal.


I can certainly see how they muck up the single_use checks.  They get in 
the way of other things as well by hiding equivalency information.


jeff

Re: [RFC] S/390: Alignment peeling prolog generation

2017-05-08 Thread Robin Dapp

> So the new part is the last point?  There's a lot of refactoring in
3/3 that
> makes it hard to see what is actually changed ...  you need to resist
> in doing this, it makes review very hard.

The new part is actually spread across the three last "-"s.  Attached is
a new version of [3/3] split up into two patches with hopefully less
blending of refactoring and new functionality.

[3/4] Computes full costs when peeling for unknown alignment, uses
either read or write and compares the better one with the peeling costs
for known alignment.  If the peeling for unknown alignment "aligns" more
than twice the number of datarefs, it is preferred over the peeling for
known alignment.

[4/4] Computes the costs for no peeling and compares them with the costs
of the best peeling so far.  If it is not more expensive, no peeling
will be performed.

> I think it's always best to align a ref with known alignment as that
simplifies
> conditions and allows followup optimizations (unrolling of the
> prologue / epilogue).
> I think for this it's better to also compute full costs rather than
relying on
> sth as simple as "number of same aligned refs".
>
> Does the code ever end up misaligning a previously known aligned ref?

The following case used to get aligned via the known alignment of dd but
would not anymore since peeling for unknown alignment aligns two
accesses.  I guess the determining factor is still up for scrutiny and
should probably > 2.  Still, on e.g. s390x no peeling is performed due
to costs.

void foo(int *restrict a, int *restrict b, int *restrict c, int
*restrict d, unsigned int n)
{
  int *restrict dd = __builtin_assume_aligned (d, 8);
  for (unsigned int i = 0; i < n; i++)
{
  b[i] = b[i] + a[i];
  c[i] = c[i] + b[i];
  dd[i] = a[i];
}
}

Regards
 Robin

[PATCH 3/4] Vect peeling cost model

2017-05-08 Thread Robin Dapp

gcc/ChangeLog:

2017-05-08  Robin Dapp  

* tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling):
Return peel info.
(vect_enhance_data_refs_alignment):
Compute full costs when peeling for unknown alignment, compare
to costs for peeling for known alignment and choose the cheaper
one.
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 7b68582..786f826 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1342,7 +1342,7 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot,
choosing an option with the lowest cost (if cost model is enabled) or the
option that aligns as many accesses as possible.  */
 
-static struct data_reference *
+static struct _vect_peel_extended_info
 vect_peeling_hash_choose_best_peeling (hash_table *peeling_htab,
    loop_vec_info loop_vinfo,
unsigned int *npeel,
@@ -1369,7 +1369,7 @@ vect_peeling_hash_choose_best_peeling (hash_table *peeling_hta
 
*npeel = res.peel_info.npeel;
*body_cost_vec = res.body_cost_vec;
-   return res.peel_info.dr;
+   return res;
 }
 
 /* Return true if the new peeling NPEEL is supported.  */
@@ -1520,6 +1520,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   enum dr_alignment_support supportable_dr_alignment;
   struct data_reference *dr0 = NULL, *first_store = NULL;
   struct data_reference *dr;
+  struct data_reference *dr0_known_align = NULL;
   unsigned int i, j;
   bool do_peeling = false;
   bool do_versioning = false;
@@ -1527,7 +1528,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   gimple *stmt;
   stmt_vec_info stmt_info;
   unsigned int npeel = 0;
-  bool all_misalignments_unknown = true;
+  bool one_misalignment_known = false;
+  bool one_misalignment_unknown = false;
   unsigned int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   unsigned possible_npeel_number = 1;
   tree vectype;
@@ -1652,11 +1654,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   npeel_tmp += nelements;
 }
 
-  all_misalignments_unknown = false;
-  /* Data-ref that was chosen for the case that all the
- misalignments are unknown is not relevant anymore, since we
- have a data-ref with known alignment.  */
-  dr0 = NULL;
+	  one_misalignment_known = true;
 }
   else
 {
@@ -1664,35 +1662,32 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
  peeling for data-ref that has the maximum number of data-refs
  with the same alignment, unless the target prefers to align
  stores over load.  */
-  if (all_misalignments_unknown)
-{
-		  unsigned same_align_drs
-		= STMT_VINFO_SAME_ALIGN_REFS (stmt_info).length ();
-  if (!dr0
-		  || same_align_drs_max < same_align_drs)
-{
-  same_align_drs_max = same_align_drs;
-  dr0 = dr;
-}
-		  /* For data-refs with the same number of related
-		 accesses prefer the one where the misalign
-		 computation will be invariant in the outermost loop.  */
-		  else if (same_align_drs_max == same_align_drs)
-		{
-		  struct loop *ivloop0, *ivloop;
-		  ivloop0 = outermost_invariant_loop_for_expr
-			  (loop, DR_BASE_ADDRESS (dr0));
-		  ivloop = outermost_invariant_loop_for_expr
-			  (loop, DR_BASE_ADDRESS (dr));
-		  if ((ivloop && !ivloop0)
-			  || (ivloop && ivloop0
-			  && flow_loop_nested_p (ivloop, ivloop0)))
-			dr0 = dr;
-		}
+	  unsigned same_align_drs
+		= STMT_VINFO_SAME_ALIGN_REFS (stmt_info).length ();
+	  if (!dr0
+		  || same_align_drs_max < same_align_drs)
+		{
+		  same_align_drs_max = same_align_drs;
+		  dr0 = dr;
+		}
+	  /* For data-refs with the same number of related
+		 accesses prefer the one where the misalign
+		 computation will be invariant in the outermost loop.  */
+	  else if (same_align_drs_max == same_align_drs)
+		{
+		  struct loop *ivloop0, *ivloop;
+		  ivloop0 = outermost_invariant_loop_for_expr
+		(loop, DR_BASE_ADDRESS (dr0));
+		  ivloop = outermost_invariant_loop_for_expr
+		(loop, DR_BASE_ADDRESS (dr));
+		  if ((ivloop && !ivloop0)
+		  || (ivloop && ivloop0
+			  && flow_loop_nested_p (ivloop, ivloop0)))
+		dr0 = dr;
+		}
 
-  if (!first_store && DR_IS_WRITE (dr))
-first_store = dr;
-}
+	  if (!first_store && DR_IS_WRITE (dr))
+		first_store = dr;
 
   /* If there are both known and unknown misaligned accesses in the
  loop, we choose peeling amount according to the known
@@ -1703,6 +1698,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   if (!first_store && DR_IS_WRITE (dr))

[PATCH 4/4] Vect peeling cost model

2017-05-08 Thread Robin Dapp

gcc/ChangeLog:

2017-05-08  Robin Dapp  

* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
Remove unused variable.
(vect_enhance_data_refs_alignment):
Compare best peelings costs to doing no peeling and choose no
peeling if equal.
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 786f826..67d2f57 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1293,7 +1293,7 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot,
 {
   vect_peel_info elem = *slot;
   int dummy;
-  unsigned int inside_cost = 0, outside_cost = 0, i;
+  unsigned int inside_cost = 0, outside_cost = 0;
   gimple *stmt = DR_STMT (elem->dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
@@ -1520,7 +1520,6 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   enum dr_alignment_support supportable_dr_alignment;
   struct data_reference *dr0 = NULL, *first_store = NULL;
   struct data_reference *dr;
-  struct data_reference *dr0_known_align = NULL;
   unsigned int i, j;
   bool do_peeling = false;
   bool do_versioning = false;
@@ -1720,6 +1719,9 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   || loop->inner)
 do_peeling = false;
 
+  struct _vect_peel_extended_info peel_for_known_alignment;
+  struct _vect_peel_extended_info peel_for_unknown_alignment;
+  struct _vect_peel_extended_info best_peel;
   unsigned int unknown_align_inside_cost = UINT_MAX;
   unsigned int unknown_align_outside_cost = UINT_MAX;
   unsigned int unknown_align_count = 0;
@@ -1731,74 +1733,72 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   /* Check if the target requires to prefer stores over loads, i.e., if
  misaligned stores are more expensive than misaligned loads (taking
  drs with same alignment into account).  */
-  if (first_store && DR_IS_READ (dr0))
-{
-	  unsigned int load_inside_cost = 0;
-	  unsigned int load_outside_cost = 0;
-	  unsigned int store_inside_cost = 0;
-	  unsigned int store_outside_cost = 0;
-	  stmt_vector_for_cost dummy;
-	  dummy.create (2);
-	  vect_get_peeling_costs_all_drs (dr0,
-	  &load_inside_cost,
-	  &load_outside_cost,
-	  &dummy, vf / 2, vf);
-	  dummy.release ();
-
+  unsigned int load_inside_cost = 0;
+  unsigned int load_outside_cost = 0;
+  unsigned int store_inside_cost = 0;
+  unsigned int store_outside_cost = 0;
+
+  stmt_vector_for_cost dummy;
+  dummy.create (2);
+  vect_get_peeling_costs_all_drs (dr0,
+  &load_inside_cost,
+  &load_outside_cost,
+  &dummy, vf / 2, vf);
+  dummy.release ();
+
+  if (first_store)
+	{
 	  dummy.create (2);
 	  vect_get_peeling_costs_all_drs (first_store,
 	  &store_inside_cost,
 	  &store_outside_cost,
 	  &dummy, vf / 2, vf);
 	  dummy.release ();
+	}
+  else
+	{
+	  store_inside_cost = UINT_MAX;
+	  store_outside_cost = UINT_MAX;
+	}
 
-  if (load_inside_cost > store_inside_cost
-  || (load_inside_cost == store_inside_cost
-		  && load_outside_cost > store_outside_cost))
-	{
-	  dr0 = first_store;
-	  unknown_align_inside_cost = store_inside_cost;
-	  unknown_align_outside_cost = store_outside_cost;
-	}
-	  else
-	{
-	  unknown_align_inside_cost = load_inside_cost;
-	  unknown_align_outside_cost = load_outside_cost;
-	}
-
-	  stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec;
-	  prologue_cost_vec.create (2);
-	  epilogue_cost_vec.create (2);
+  if (load_inside_cost > store_inside_cost
+	  || (load_inside_cost == store_inside_cost
+	  && load_outside_cost > store_outside_cost))
+	{
+	  dr0 = first_store;
+	  unknown_align_inside_cost = store_inside_cost;
+	  unknown_align_outside_cost = store_outside_cost;
+	}
+  else
+	{
+	  unknown_align_inside_cost = load_inside_cost;
+	  unknown_align_outside_cost = load_outside_cost;
+	}
 
-	  int dummy2;
-	  unknown_align_outside_cost += vect_get_known_peeling_cost
-	(loop_vinfo, vf / 2, &dummy2,
-	 &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
-	 &prologue_cost_vec, &epilogue_cost_vec);
+  stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec;
+  prologue_cost_vec.create (2);
+  epilogue_cost_vec.create (2);
 
-	  prologue_cost_vec.release ();
-	  epilogue_cost_vec.release ();
+  int dummy2;
+  unknown_align_outside_cost += vect_get_known_peeling_cost
+	(loop_vinfo, vf / 2, &dummy2,
+	 &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
+	 &prologue_cost_vec, &epilogue_cost_vec);
 
-	  unknown_align_count = 1 + STMT_VINFO_SAME_ALIGN_REFS
-	(vinfo_for_stmt (DR_STMT (dr0))).length ();
-}
+  prologue_cost_vec.release ();
+  epilogue_cost_vec.release ();
 
-  /* In case there are only loads with different unknown misalignments, use
- peeling only if it may help to align

Re: [PATCH] [PR79542][Ada] Fix ICE in dwarf2out.c with nested func. inlining

2017-05-08 Thread Jason Merrill

On Wed, May 3, 2017 at 3:22 PM, Jason Merrill  wrote:
> On Tue, Mar 14, 2017 at 8:24 AM, Pierre-Marie de Rodat
>  wrote:
>> Hello,
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79542 reports an ICE in
>> dwarf2out.c for an Ada testcase built with optimization.
>>
>> This crash happens during the late generation pass because
>> add_gnat_descriptive_type cannot find the type DIE corresponding to some
>> descriptive type after having tried to generate it. This is because the
>> DIE was generated during the early generation pass, but then pruned by
>> the type pruning machinery. So why was it pruned?
>>
>> We are in a situation where we have cloned types (because of inlining,
>> IIUC) whose TYPE_NAME have non-null DECL_ABSTRACT_ORIGIN attributes. As
>> a consequence:
>>
>>   * In modified_type_die, the "handle C typedef types" part calls
>> gen_type_die on the cloned type.
>>
>>   * gen_type_die matches a typedef variant, and then calls gen_decl_die
>> on its TYPE_NAME, which will end up calling gen_typedef_die.
>>
>>   * gen_typedef_die checks decl_ultimate_origin for this TYPE_DECL, and
>> finds one, so it only adds a DW_AT_abstract_origin attribute to the
>> DW_TAG_typedef DIE, but the cloned type itself does not get its own
>> DIE.
>
> That seems like a bug; if gen_typedef_die is going to generate a DIE
> for a cloned typedef, it needs to associate the type with the DIE.
>
>>   * Back in modified_type_die, the call to lookup_type_die on the type
>> passed to gen_type_die returns NULL.
>
>> In the end, whole type trees, i.e. the ones referenced by
>> DECL_ABSTRACT_ORIGIN attributes, are never referenced from type pruning
>> "roots" and are thus pruned. The descriptive type at stake here is one
>> of them, hence the assertion failure.
>>
>> This patch attemps to fix that with what seems to be the most sensible
>> thing to do in my opinion: updating the "handle C typedef types" part in
>> modified_type_die to check decl_ultimate_origin before calling
>> gen_type_die: if that function returns something not null, then we know
>> that gen_type_die/gen_typedef_die will not generate a DIE for the input
>> type, so we try to process the ultimate origin instead.
>
> This soundsn good; the DWARF standard says that we don't need to have
> a die at all for the cloned typedef.
>
>> @@ -12496,6 +12496,18 @@ modified_type_die (tree type, int cv_quals, bool 
>> reverse,
>>
>>if (qualified_type == dtype)
>> {
>> + tree origin
>> +  = TYPE_NAME (qualified_type) == NULL
>> +? NULL
>> +: decl_ultimate_origin (TYPE_NAME (qualified_type));
>
> This is unnecessarily complicated; at this point we already know that
> TYPE_NAME (qualified_type) is non-null and in the variable "name".
>
>> + /* Typedef variants that have an abstract origin don't get their 
>> own
>> +type DIE (see gen_typedef_die), so fall back on the ultimate
>
> gen_typedef_die does create a DIE for them, it just doesn't do it
> properly.  But we could change gen_typedef_die to abort in that case,
> making this comment correct.

Something like this, which also avoids routinely creating DIEs for
local typedefs that will only be pruned away later; this patch doesn't
change the size of .debug_info in cc1plus.
commit 56cce11181d1296e43cb4d603fc8efc6ac2570fa
Author: Jason Merrill 
Date:   Thu May 4 15:00:51 2017 -0400

inline-static

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 6caf598..bf6a65b 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -24355,7 +24355,7 @@ gen_typedef_die (tree decl, dw_die_ref context_die)
   type_die = new_die (DW_TAG_typedef, context_die, decl);
   origin = decl_ultimate_origin (decl);
   if (origin != NULL)
-add_abstract_origin_attribute (type_die, origin);
+gcc_unreachable (), add_abstract_origin_attribute (type_die, origin);
   else
 {
   tree type = TREE_TYPE (decl);
@@ -24858,6 +24858,16 @@ process_scope_var (tree stmt, tree decl, tree origin, 
dw_die_ref context_die)
   else
 die = NULL;
 
+  if ((origin || DECL_ABSTRACT_ORIGIN (decl))
+  && (TREE_CODE (decl_or_origin) == TYPE_DECL
+ || (VAR_P (decl_or_origin) && TREE_STATIC (decl_or_origin
+{
+  origin = decl_ultimate_origin (decl_or_origin);
+  if (decl && VAR_P (decl))
+   equate_decl_number_to_die (decl, lookup_decl_die (origin));
+  return;
+}
+
   if (die != NULL && die->die_parent == NULL)
 add_child_die (context_die, die);
   else if (TREE_CODE (decl_or_origin) == IMPORTED_DECL)

Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch

2017-05-08 Thread Peter Bergner


On 05/03/2017 08:32 AM, Richard Biener wrote:
> As for Bernhards concern I share this -- please intead make the
> interface take either a gimple_seq or a gimple_stmt_iterator
> instead of a basic-block.  That makes it more obvious you
> can't use things like gsi_after_labels.  Also I think it's more
> natural to work backwards as the last stmt in the sequence
> _has_ to be __builtin_unreachable () and thus checking that first
> is the cheapest thing to do given that in most cases it will
> not be __builtin_unreachable () (but a noreturn call or an
> inifinite loop).
>
> Thus, name it gimple_seq_unreachable_p.

So you mean something like the following?


/* Returns true if the sequence of statements STMTS only contains
   a call to __builtin_unreachable ().  */

bool
gimple_seq_unreachable_p (gimple_seq stmts)
{
  gimple_stmt_iterator gsi = gsi_last (stmts);

  if (!gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_UNREACHABLE))
return false;

  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
{
  gimple *stmt = gsi_stmt (gsi);
  if (gimple_code (stmt) != GIMPLE_LABEL
  && !is_gimple_debug (stmt)
  && !gimple_clobber_p (stmt))
  return false;
}
  return true;
}



> On Wed, 26 Apr 2017, Peter Bergner wrote:
>> One difference from the last patch is that I am no longer setting
>> default_label to NULL when we emit a decision tree.  I noticed that
>> the decision tree code seemed to generate slightly better code for
>> some of my unit tests if I left it alone.  This simplified the
>> patch somewhat by removing the changes to emit_case_nodes().
[snip]
>
> Can you do the gimple_unreachable_bb_p check earlier in
> expand_case so it covers the emit_case_decision_tree path as well
> (and verify that works, of course)?  So basically right at
>
>   /* Find the default case target label.  */
>   default_label = jump_target_rtx
>   (CASE_LABEL (gimple_switch_default_label (stmt)));
>   edge default_edge = EDGE_SUCC (bb, 0);
>   int default_prob = default_edge->probability;
>
> handle this case.

That is what the previous patch did, but as I mention above,
we generate slightly better code for some test cases (other
tests seemed to generate the same code) if we don't attempt
to handle the decision tree case.  I'll note that the current
unpatched compiler already knows how to remove unreachable
case statement blocks when we expand to a decision tree.

I can add that code back if you think that it will have a
positive benefit for some test case I haven't tried yet.

Peter

Re: [PATCH 1/2] C++ template type diff printing

2017-05-08 Thread David Malcolm

On Mon, 2017-05-08 at 07:36 -0400, Nathan Sidwell wrote:
> On 05/04/2017 07:44 PM, David Malcolm wrote:
> > This patch kit implements two new options to make it easier
> > to read diagnostics involving mismatched template types:
> >-fdiagnostics-show-template-tree and
> >-fno-elide-type.
> > 
> > It adds two new formatting codes: %H and %I which are
> > equivalent to %qT, but are to be used together for type
> > comparisons e.g.
> >"can't convert from %H to %I".
> 
> While I can see why one might want %H and %I to imply quoting, it
> seems 
> confusingly inconsistent with the other formatters, which require an 
> explicit 'q'.

This an implementation detail leaking through, but I *think* it's
fixable.

The implementation has to jump through some hoops to interact with
pp_format, because the two codes interact with each other: in order to
elide commonality and colorize differences we can't start printing the
%H until we've seen the %I (vice versa should work also).

The 'q' for quoting the next format code happens in the 2nd phase of
 pp_format: if 'q' is present, then a quote (and potentially
colorization) is emitted to the chunk for the item before and after the
item's main content is emitted to the chunk.

To make the interaction of %H and %I work, we have to delay the writing
to the chunk to a new phase between phases 2 and 3 (and stash a pointer
to the chunk we're going to write to).

As written, if one uses '%qH' and '%qI', then phase 2 writes the open
and close quotes to the chunks, and then the delayed handler blows that
away and overwrites the chunk content with (forcibly) quoted content
for the types.

So I think it can work if we add a "needs quoting" flag to the
postprocessing phase, if we need to handle the case where %H and %I
ever appear without 'q' (and have the delayed handling stash that flag,
and do the quoting there).

I'll look at implementing that.

> > rather than printing:
> > 
> >could not convert 'std::map >()'
> >  from 'std::map >' to 'std::map > std::vector >'
> > 
> > with -felide-type (the default), it prints:
> > 
> >could not convert 'std::map >()'
> >  from 'map<[...],vector>' to 'map<[...],vector>
> 
> Neat.
> 
> Is '-felide-type' a good name?  Wouldn't something like 
> '-fdiagnostic-elide-template-args' be better?

Here I'm merely copying clang's option name.
  http://clang.llvm.org/docs/UsersManual.html#cmdoption-fno-elide-type

> > With -fdiagnostics-show-template-tree a tree-like structure of the
> > template is printed, showing the differences; in this case:
> 
> 'tree' sounds implementor-speaky,  Perhaps 
> '-fdiagnostic-expand-template-args' or something?

As Jason noted, this is a user-visible thing that's tree-like, rather
than our "tree" type.

Also, I'm (shamelessly) copying clang's name for this.

> Right, bikeshedding out of the way ...
> 
> > --- a/gcc/cp/error.c
> > +++ b/gcc/cp/error.c
> > @@ -99,7 +99,47 @@ static void cp_diagnostic_starter
> > (diagnostic_context *, diagnostic_info *);
> >   static void cp_print_error_function (diagnostic_context *,
> > diagnostic_info *);
> >   
> 
> > +
> > +/* Return true iff TYPE_A and TYPE_B are template types that are
> > +   meaningful to compare.  */
> > +
> > +static bool
> > +comparable_template_types_p (tree type_a, tree type_b)
> > +{
> > +  if (TREE_CODE (type_a) != RECORD_TYPE)
> > +return false;
> > +  if (TREE_CODE (type_b) != RECORD_TYPE)
> > +return false;
> 
> CLASS_TYPE_P (type_a) etc?

I'll use this in the next version.

> > +  int len_max = MAX (len_a, len_b);
> > +  for (int idx = 0; idx < len_max; idx++)
> > +{
> > +  tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) :
> > NULL;
> > +  tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) :
> > NULL;
> > +  if (arg_a == arg_b)
> 
> Should this use same_type_p for types?  If direct comparison is
> correct, 
> a comment would help.

What about non-type template arguments?  That said, I just tried one,
and it's handled poorly by my patch.  I'll try to address for the next
version.

> > +   Only non-default template arguments are printed.  */
> > +
> > +static void
> > +print_template_tree_comparison (pretty_printer *pp, tree type_a,
> > tree type_b,
> 
> > +  for (int idx = 0; idx < len_max; idx++)
> > +{
> > +  tree arg_a = idx < len_a ? TREE_VEC_ELT (args_a, idx) :
> > NULL;
> > +  tree arg_b = idx < len_b ? TREE_VEC_ELT (args_b, idx) :
> > NULL;
> > +  if (arg_a == arg_b)
> 
> Same question.  If these have to be comparable template type, when
> can 
> len_a and len_b be different?

I guess I was thinking about variadic templates.  I'll have another
look at how these are handled.

> > +   This is called in phase 2 of pp_format, when it is accumulating
> > +   a series of formatted chunks.  We stash the location of the
> > chunk
> > +   we're meant to have written to, so that we can write to it in
> > the
> > +   m_format_postprocessor hook.  */
> > +
> > +static void
> > +def

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Jerry DeLisle

On 05/05/2017 01:31 PM, Thomas Koenig wrote:
> Hello world,
> 
> the attached patch reduces the stack usage by the blocked
> version of matmul for cases where we don't need the full buffer.
> This should improve stack usage.
> 
> Regression-tested.  I also added a stress test (around 3 secs of
> CPU time on my system), it will only run once due to the "dg-do  run"
> hack).
> 
> OK for trunk?
> 

OK, thanks.

Jerry

Re: [C++ PATCH] namespace bindings

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 11:17 AM, Andreas Schwab wrote:


g++.dg/cpp0x/defaulted34.C has the same problem.


I'd managed to flub my xcompiler build.  The code path for 
fn-descriptor-using targets is different, and I'd missed that the 
shadowed outer 'fn' was expected to have been initialized  in that case.


Committed the attached fix.

nathan

--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

	* class.c (build_vtbl_initializer): Don't shadow outer variable
	with static var.

Index: cp/class.c
===
--- cp/class.c	(revision 247746)
+++ cp/class.c	(working copy)
@@ -9769,18 +9769,19 @@ build_vtbl_initializer (tree binfo,
 	  /* Likewise for deleted virtuals.  */
 	  else if (DECL_DELETED_FN (fn_original))
 	{
-	  static tree fn;
+	  static tree dvirt_fn;
 
-	  if (!fn)
+	  if (!dvirt_fn)
 		{
 		  tree name = get_identifier ("__cxa_deleted_virtual");
-		  fn = IDENTIFIER_GLOBAL_VALUE (name);
-		  if (!fn)
-		fn = push_library_fn
+		  dvirt_fn = IDENTIFIER_GLOBAL_VALUE (name);
+		  if (!dvirt_fn)
+		dvirt_fn = push_library_fn
 		  (name,
 		   build_function_type_list (void_type_node, NULL_TREE),
 		   NULL_TREE, ECF_NORETURN);
 		}
+	  fn = dvirt_fn;
 	  if (!TARGET_VTABLE_USES_DESCRIPTORS)
 		init = fold_convert (vfunc_ptr_type_node,
  build_fold_addr_expr (fn));
@@ -9789,7 +9790,8 @@ build_vtbl_initializer (tree binfo,
 	{
 	  if (!integer_zerop (delta) || vcall_index)
 		{
-		  fn = make_thunk (fn, /*this_adjusting=*/1, delta, vcall_index);
+		  fn = make_thunk (fn, /*this_adjusting=*/1,
+   delta, vcall_index);
 		  if (!DECL_NAME (fn))
 		finish_thunk (fn);
 		}

Re: Test cases to check OpenACC offloaded function's attributes and classification

2017-05-08 Thread Thomas Schwinge

Hi!

Ping.

On Thu, 4 Aug 2016 16:06:10 +0200, I wrote:
> Ping.
> 
> On Wed, 27 Jul 2016 10:59:02 +0200, I wrote:
> > OK for trunk?

(In the mean time, I also added some more testing.)

commit b7d61270dfc581a6ea130f7a4fa7506a0a5762d8
Author: Thomas Schwinge 
Date:   Mon May 8 18:22:50 2017 +0200

Test cases to check OpenACC offloaded function's attributes and 
classification

gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: New file.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
---
 .../goacc/classify-kernels-unparallelized.c| 39 
 .../c-c++-common/goacc/classify-kernels.c  | 35 ++
 .../c-c++-common/goacc/classify-parallel.c | 28 +++
 .../c-c++-common/goacc/classify-routine.c  | 30 
 .../goacc/classify-kernels-unparallelized.f95  | 41 ++
 .../gfortran.dg/goacc/classify-kernels.f95 | 37 +++
 .../gfortran.dg/goacc/classify-parallel.f95| 30 
 .../gfortran.dg/goacc/classify-routine.f95 | 29 +++
 8 files changed, 269 insertions(+)

diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c 
gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
new file mode 100644
index 000..a76351c
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
@@ -0,0 +1,39 @@
+/* Check offloaded function's attributes and classification for unparallelized
+   OpenACC kernels.  */
+
+/* { dg-additional-options "-O2" }
+   { dg-additional-options "-fdump-tree-ompexp" }
+   { dg-additional-options "-fdump-tree-parloops1-all" }
+   { dg-additional-options "-fdump-tree-oaccdevlow" } */
+
+#define N 1024
+
+extern unsigned int *__restrict a;
+extern unsigned int *__restrict b;
+extern unsigned int *__restrict c;
+
+/* An "extern"al mapping of loop iterations/array indices makes the loop
+   unparallelizable.  */
+extern unsigned int f (unsigned int);
+
+void KERNELS ()
+{
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  for (unsigned int i = 0; i < N; i++)
+c[i] = a[f (i)] + b[f (i)];
+}
+
+/* Check the offloaded function's attributes.
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target 
entrypoint\\)\\)" 1 "ompexp" } } */
+
+/* Check that exactly one OpenACC kernels construct is analyzed, and that it
+   can't be parallelized.
+   { dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } }
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function 
\\(, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+   { dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" 
} } */
+
+/* Check the offloaded function's classification and compute dimensions (will
+   always be 1 x 1 x 1 for non-offloading compilation).
+   { dg-final { scan-tree-dump-times "(?n)Function is kernels offload" 1 
"oaccdevlow" } }
+   { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 
1 "oaccdevlow" } }
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function 
\\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
diff --git gcc/testsuite/c-c++-common/goacc/classify-kernels.c 
gcc/testsuite/c-c++-common/goacc/classify-kernels.c
new file mode 100644
index 000..199a73e
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/classify-kernels.c
@@ -0,0 +1,35 @@
+/* Check offloaded function's attributes and classification for OpenACC
+   kernels.  */
+
+/* { dg-additional-options "-O2" }
+   { dg-additional-options "-fdump-tree-ompexp" }
+   { dg-additional-options "-fdump-tree-parloops1-all" }
+   { dg-additional-options "-fdump-tree-oaccdevlow" } */
+
+#define N 1024
+
+extern unsigned int *__restrict a;
+extern unsigned int *__restrict b;
+extern unsigned int *__restrict c;
+
+void KERNELS ()
+{
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  for (unsigned int i = 0; i < N; i++)
+c[i] = a[i] + b[i];
+}
+
+/* Check the offloaded function's attributes.
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp target 
entrypoint\\)\\)" 1 "ompexp" } } */
+
+/* Check that exactly one OpenACC kernels construct is analyzed, and that it
+   can be parallelized.
+   { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 
"parloops1" } }
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function 
\\(0, , \\), omp target entrypoint\\)\\)" 1 "parloops1" } }
+   { dg-final { scan-tree-dum

[PATCH v2,rs6000] Add built-in function support for compare bytes instruction

2017-05-08 Thread Kelvin Nilsen


This patch adds support for the compare bytes instruction, which has
been available in the rs6000 architecture since Power6.  Thank you to
Segher Boessenkool for feedback on the original submission of this
patch.  The following refinements have been incorporated:

1. Changed the implementation and documentation to present a single
overloaded function that handles either 32-bit or 64-bit arguments.

2. Corrected the spelling of compare in the comment describing the
RS6000_BTM_CMPB macro.  In response to reviewer question of whether
this line is too long: it is not.  It only appears that way due to
alignment of tabs in the diff output.

The patch has been bootstrapped and tested on powerpc64le-unknown-linux
and powerpc-unknown-linux (big-endian, with both -m32 and -m64 target
options) with no regressions.

Is this ok for the trunk?

gcc/testsuite/ChangeLog:

2017-05-08  Kelvin Nilsen  

* gcc.target/powerpc/cmpb-1.c: New test.
* gcc.target/powerpc/cmpb-2.c: New test.
* gcc.target/powerpc/cmpb-3.c: New test.
* gcc.target/powerpc/cmpb32-1.c: New test.
* gcc.target/powerpc/cmpb32-2.c: New test.

gcc/ChangeLog:

2017-05-08  Kelvin Nilsen  

* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
array entries to represent two legal parameterizations of the
overloaded __builtin_cmpb function, as represented by the
P6_OV_BUILTIN_CMPB constant.
(altivec_resolve_overloaded_builtin): Add special case handling
for the __builtin_cmpb function, as represented by the
P6_OV_BUILTIN_CMPB constant.
* config/rs6000/rs6000-builtin.def (BU_P6_2): New macro.
(BU_P6_64BIT_2): New macro.
(BU_P6_OVERLOAD_2): New macro
(CMPB_32): Add 32-bit compare-bytes support for 32-bit only targets.
(CMPB): Add 64-bit compare-bytes support for 32-bit and 64-bit targets.
(CMPB): Add overload support to represent both 32-bit and 64-bit
compare-bytes function.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
support for TARGET_CMPB.
* config/rs6000/rs6000.h: Add support for RS6000_BTM_CMPB.
* doc/extend.texi (PowerPC AltiVec Built-in Functions): Add
documentation of the __builtin_cmpb overloaded built-in function.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 247069)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3788,6 +3788,7 @@ HOST_WIDE_INT
 rs6000_builtin_mask_calculate (void)
 {
   return (((TARGET_ALTIVEC)? RS6000_BTM_ALTIVEC   : 0)
+ | ((TARGET_CMPB)  ? RS6000_BTM_CMPB  : 0)
  | ((TARGET_VSX)   ? RS6000_BTM_VSX   : 0)
  | ((TARGET_SPE)   ? RS6000_BTM_SPE   : 0)
  | ((TARGET_PAIRED_FLOAT)  ? RS6000_BTM_PAIRED: 0)
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 247069)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -2717,6 +2717,7 @@ extern int frame_pointer_needed;
aren't in target_flags.  */
 #define RS6000_BTM_ALWAYS  0   /* Always enabled.  */
 #define RS6000_BTM_ALTIVEC MASK_ALTIVEC/* VMX/altivec vectors.  */
+#define RS6000_BTM_CMPBMASK_CMPB   /* ISA 2.05: compare 
bytes.  */
 #define RS6000_BTM_VSX MASK_VSX/* VSX (vector/scalar).  */
 #define RS6000_BTM_P8_VECTOR   MASK_P8_VECTOR  /* ISA 2.07 vector.  */
 #define RS6000_BTM_P9_VECTOR   MASK_P9_VECTOR  /* ISA 3.0 vector.  */
Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 247069)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -5348,6 +5348,11 @@ const struct altivec_builtin_types altivec_overloa
 RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
 RS6000_BTI_unsigned_V1TI, 0 },
 
+  { P6_OV_BUILTIN_CMPB, P6_BUILTIN_CMPB_32,
+RS6000_BTI_UINTSI, RS6000_BTI_UINTSI, RS6000_BTI_UINTSI, 0 },
+  { P6_OV_BUILTIN_CMPB, P6_BUILTIN_CMPB,
+RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, 0 },
+
   { P8V_BUILTIN_VEC_VUPKHSW, P8V_BUILTIN_VUPKHSW,
 RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
   { P8V_BUILTIN_VEC_VUPKHSW, P8V_BUILTIN_VUPKHSW,
@@ -6409,25 +6414,76 @@ altivec_resolve_overloaded_builtin (location_t loc
 for (desc = altivec_overloaded_builtins;
 desc->code && desc->code != fcode; desc++)
   continue;
-
-/* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
-   the opX fields.  */
-for (; desc->code == fcode; desc++)
+
+/* Need to special case __builtin_cmp because the overloaded forms
+   of this function take (unsigned int, unsigned int) or (unsigned
+   long long int, unsigned long long int).  Since C co

[PATCH] Tweak static assertions in std::optional

2017-05-08 Thread Jonathan Wakely


This simplifies the conditions of some static assertions, using one
assertion per condition instead of asserting the conjunction. This
means we don't need any string literal, because the conditions are
quite readable now.

* include/std/optional: Use a separate static_assert per condition.
* testsuite/20_util/optional/cons/value_neg.cc: Update dg-error line
numbers.

Tested powerpc64le-linux, committed to trunk.


commit 90daca5d8383567deaff9817c4cca92309573311
Author: Jonathan Wakely 
Date:   Mon May 8 16:23:44 2017 +0100

Tweak static assertions in std::optional

* include/std/optional: Use a separate static_assert per condition.
* testsuite/20_util/optional/cons/value_neg.cc: Update dg-error line
numbers.

diff --git a/libstdc++-v3/include/std/optional 
b/libstdc++-v3/include/std/optional
index 1724120..5aa926f 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -462,10 +462,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // Unique tag type.
 optional<_Tp>>
 {
-  static_assert(__and_<__not_, nullopt_t>>,
-  __not_, in_place_t>>,
-  __not_>>(),
-"Invalid instantiation of optional");
+  static_assert(!is_same_v, nullopt_t>);
+  static_assert(!is_same_v, in_place_t>);
+  static_assert(!is_reference_v<_Tp>);
 
 private:
   using _Base = _Optional_base<_Tp>;
@@ -756,9 +755,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
constexpr _Tp
value_or(_Up&& __u) const&
{
- static_assert(__and_,
-  is_convertible<_Up&&, _Tp>>(),
-   "Cannot return value");
+ static_assert(is_copy_constructible_v<_Tp>);
+ static_assert(is_convertible_v<_Up&&, _Tp>);
 
  return this->_M_is_engaged()
? this->_M_get()
@@ -769,9 +767,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Tp
value_or(_Up&& __u) &&
{
- static_assert(__and_,
-  is_convertible<_Up&&, _Tp>>(),
-   "Cannot return value" );
+ static_assert(is_move_constructible_v<_Tp>);
+ static_assert(is_convertible_v<_Up&&, _Tp>);
 
  return this->_M_is_engaged()
? std::move(this->_M_get())
diff --git a/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc 
b/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc
index 87907f9..5abe26e 100644
--- a/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/optional/cons/value_neg.cc
@@ -37,8 +37,8 @@ int main()
 std::optional> oup2 = new int;  // { dg-error 
"conversion" }
 struct U { explicit U(std::in_place_t); };
 std::optional ou(std::in_place); // { dg-error "no matching" }
-// { dg-error "no type" "" { target { *-*-* } } 488 }
-// { dg-error "no type" "" { target { *-*-* } } 498 }
-// { dg-error "no type" "" { target { *-*-* } } 555 }
+// { dg-error "no type" "" { target { *-*-* } } 487 }
+// { dg-error "no type" "" { target { *-*-* } } 497 }
+// { dg-error "no type" "" { target { *-*-* } } 554 }
   }
 }

Re: [PATCH] Don't assume __secure_getenv is available

2017-05-08 Thread Jerry DeLisle

On 05/07/2017 11:37 PM, Janne Blomqvist via fortran wrote:
> PING
> 
> On Thu, Apr 27, 2017 at 9:55 PM, Janne Blomqvist
>  wrote:
>> On Thu, Apr 27, 2017 at 9:50 PM, Janne Blomqvist
>>  wrote:
>> [snip]
>>
>> And on top of that patch this simple typo fix:
>>
>> diff --git a/libgfortran/runtime/environ.c b/libgfortran/runtime/environ.c
>> index 969dcdf..f488e87 100644
>> --- a/libgfortran/runtime/environ.c
>> +++ b/libgfortran/runtime/environ.c
>> @@ -46,7 +46,7 @@ static char* weak_secure_getenv (const char*)
>>  char *
>>  secure_getenv (const char *name)
>>  {
>> -#if SUPPORTS_WEAKREF && defined(HAVE__SECURE_GETENV)
>> +#if SUPPORTS_WEAKREF && defined(HAVE___SECURE_GETENV)
>>if (weak_secure_getenv)
>>  return weak_secure_getenv (name);
>>  #endif
>>
>>
>>
>> --
>> Janne Blomqvist
> 
> 
> 

Looks Ok to me. I thought you might be waiting for Jakub to reply.

Thanks,

Jerry

[PATCH] correct length stpncpy handling in -Wformat-overflow (PR 80669)

2017-05-08 Thread Martin Sebor


The -Wformat-overflow warning newly enhanced in GCC 8.0 to detect
reading past the end of the source sequence misinterprets the size
argument to stpncpy as a request to read that many bytes from the
source sequence, rather than the number of bytes to write.  Like
strncpy, the function never reads past the end of the source string.

This constraint is already handled correctly for strncpy by
the check_sizes function so the fix is to simply remove the wrong
length handling from expand_builtin_stpncpy and let check_sizes
do the right thing.

The attached patch passes bootstrap and regression test on x86_64.

Martin
PR middle-end/80669 - Bad -Wstringop-overflow warnings for stpncpy

gcc/ChangeLog:

	PR middle-end/80669
	* builtins.c (expand_builtin_stpncpy): Simplify.

gcc/testsuite/ChangeLog:

	PR middle-end/80669
	* gcc.dg/builtin-stpncpy.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 936fcee..4f6c9c4 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3788,31 +3788,18 @@ expand_builtin_stpncpy (tree exp, rtx)
   || !warn_stringop_overflow)
 return NULL_RTX;
 
+  /* The source and destination of the call.  */
   tree dest = CALL_EXPR_ARG (exp, 0);
   tree src = CALL_EXPR_ARG (exp, 1);
 
-  /* The number of bytes to write (not the maximum).  */
+  /* The exact number of bytes to write (not the maximum).  */
   tree len = CALL_EXPR_ARG (exp, 2);
-  /* The length of the source sequence.  */
-  tree slen = c_strlen (src, 1);
-
-  /* Try to determine the range of lengths that the source expression
- refers to.  */
-  tree lenrange[2];
-  if (slen)
-lenrange[0] = lenrange[1] = slen;
-  else
-{
-  get_range_strlen (src, lenrange);
-  slen = lenrange[0];
-}
 
+  /* The size of the destination object.  */
   tree destsize = compute_objsize (dest, warn_stringop_overflow - 1);
 
-  /* The number of bytes to write is LEN but check_sizes will also
- check SLEN if LEN's value isn't known.  */
   check_sizes (OPT_Wstringop_overflow_,
-	   exp, len, /*maxlen=*/NULL_TREE, slen, destsize);
+	   exp, len, /*maxlen=*/NULL_TREE, src, destsize);
 
   return NULL_RTX;
 }
diff --git a/gcc/testsuite/gcc.dg/builtin-stpncpy.c b/gcc/testsuite/gcc.dg/builtin-stpncpy.c
new file mode 100644
index 000..e4290d5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-stpncpy.c
@@ -0,0 +1,74 @@
+/* PR tree-optimization/80669 - Bad -Wstringop-overflow warnings for stpncpy
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+#define SIZE_MAX __SIZE_MAX__
+
+typedef __SIZE_TYPE__ size_t;
+
+void sink (char*);
+
+#define stpncpy (d, s, n)  sink (__builtin_stpncpy (d, s, n))
+
+size_t value (void);
+
+size_t range (size_t min, size_t max)
+{
+  size_t val = value ();
+  return val < min || max < val ? min : val;
+}
+
+/* Verify that no warning is issued for stpncpy with constant size.  */
+void test_cst (char *d)
+{
+  __builtin_stpncpy (d, "123", 0);
+  __builtin_stpncpy (d, "123", 1);
+  __builtin_stpncpy (d, "123", 2);
+  __builtin_stpncpy (d, "123", 3);
+  __builtin_stpncpy (d, "123", 4);
+  __builtin_stpncpy (d, "123", 5);
+  __builtin_stpncpy (d, "123", 999);
+
+  size_t n = SIZE_MAX / 2;
+
+  __builtin_stpncpy (d, "123", n);
+
+  __builtin_stpncpy (d, "123", n + 1);/* { dg-warning "specified size \[0-9\]+ exceeds maximum object size \[0-9\]+" } */
+}
+
+
+/* Verify that no warning is issued for stpncpy with size in some range.  */
+void test_rng (char *d)
+{
+#define R(min, max) range (min, max)
+
+  __builtin_stpncpy (d, "123", R (0, 1));
+  __builtin_stpncpy (d, "123", R (0, 2));
+  __builtin_stpncpy (d, "123", R (0, 3));
+  __builtin_stpncpy (d, "123", R (0, 4));
+  __builtin_stpncpy (d, "123", R (0, 5));
+
+  __builtin_stpncpy (d, "123", R (1, 2));
+  __builtin_stpncpy (d, "123", R (1, 3));
+  __builtin_stpncpy (d, "123", R (1, 4));
+  __builtin_stpncpy (d, "123", R (1, 5));
+
+  __builtin_stpncpy (d, "123", R (2, 3));
+  __builtin_stpncpy (d, "123", R (2, 4));
+  __builtin_stpncpy (d, "123", R (2, 5));
+
+  __builtin_stpncpy (d, "123", R (3, 4));
+  __builtin_stpncpy (d, "123", R (3, 5));
+
+  __builtin_stpncpy (d, "123", R (4, 5));
+
+  __builtin_stpncpy (d, "123", R (5, 6));
+
+  __builtin_stpncpy (d, "123", R (12345, 23456));
+
+  size_t n = SIZE_MAX / 2;
+
+  __builtin_stpncpy (d, "123", R (n - 1, n + 1));
+
+  __builtin_stpncpy (d, "123", R (n + 1, n + 2));   /* { dg-warning "specified size between \[0-9\]+ and \[0-9\]+ exceeds maximum object size \[0-9\]+" } */
+}

Re: [PATCH 1/2] C++ template type diff printing

2017-05-08 Thread Nathan Sidwell


On 05/08/2017 12:51 PM, David Malcolm wrote:

On Mon, 2017-05-08 at 07:36 -0400, Nathan Sidwell wrote:

On 05/04/2017 07:44 PM, David Malcolm wrote:



Is '-felide-type' a good name?  Wouldn't something like
'-fdiagnostic-elide-template-args' be better?


Here I'm merely copying clang's option name.
   http://clang.llvm.org/docs/UsersManual.html#cmdoption-fno-elide-type


Whelp, that's ... unfortunate.

nathan
--
Nathan Sidwell

Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch

2017-05-08 Thread Richard Biener

On May 8, 2017 6:41:01 PM GMT+02:00, Peter Bergner  wrote:
>On 05/03/2017 08:32 AM, Richard Biener wrote:
> > As for Bernhards concern I share this -- please intead make the
> > interface take either a gimple_seq or a gimple_stmt_iterator
> > instead of a basic-block.  That makes it more obvious you
> > can't use things like gsi_after_labels.  Also I think it's more
> > natural to work backwards as the last stmt in the sequence
> > _has_ to be __builtin_unreachable () and thus checking that first
> > is the cheapest thing to do given that in most cases it will
> > not be __builtin_unreachable () (but a noreturn call or an
> > inifinite loop).
> >
> > Thus, name it gimple_seq_unreachable_p.
>
>So you mean something like the following?

Yes.

>
>/* Returns true if the sequence of statements STMTS only contains
>a call to __builtin_unreachable ().  */
>
>bool
>gimple_seq_unreachable_p (gimple_seq stmts)
>{
>   gimple_stmt_iterator gsi = gsi_last (stmts);
>
>   if (!gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_UNREACHABLE))
> return false;
>
>   for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
> {
>   gimple *stmt = gsi_stmt (gsi);
>   if (gimple_code (stmt) != GIMPLE_LABEL
>   && !is_gimple_debug (stmt)
>   && !gimple_clobber_p (stmt))
>   return false;
> }
>   return true;
>}
>
>
>
> > On Wed, 26 Apr 2017, Peter Bergner wrote:
> >> One difference from the last patch is that I am no longer setting
> >> default_label to NULL when we emit a decision tree.  I noticed that
> >> the decision tree code seemed to generate slightly better code for
> >> some of my unit tests if I left it alone.  This simplified the
> >> patch somewhat by removing the changes to emit_case_nodes().
>[snip]
> >
> > Can you do the gimple_unreachable_bb_p check earlier in
> > expand_case so it covers the emit_case_decision_tree path as well
> > (and verify that works, of course)?  So basically right at
> >
> >   /* Find the default case target label.  */
> >   default_label = jump_target_rtx
> >   (CASE_LABEL (gimple_switch_default_label (stmt)));
> >   edge default_edge = EDGE_SUCC (bb, 0);
> >   int default_prob = default_edge->probability;
> >
> > handle this case.
>
>That is what the previous patch did, but as I mention above,
>we generate slightly better code for some test cases (other
>tests seemed to generate the same code) if we don't attempt
>to handle the decision tree case.  I'll note that the current
>unpatched compiler already knows how to remove unreachable
>case statement blocks when we expand to a decision tree.
>
>I can add that code back if you think that it will have a
>positive benefit for some test case I haven't tried yet.
>
>Peter

[C++ PATCH] pushdecl

2017-05-08 Thread Nathan Sidwell

This small patch replaces pushdecl_with_scope with a more-specific 
pushdecl_outermost_localscope.  It's used in 2 places to inject an 
artifical decl into the outermost block scope of a function 
(__FUNCTION__ var and lambda capture proxies).


This moves some binding-level handling into name-lookup, where it belongs.

Applied to trunk.

nathan
--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

	* name-lookup.h (pushdecl_with_scope): Replace with ...
	(pushdecl_outermost_localscope): ... this.
	* name-lookup.c (pushdecl_with_scope): Replace with ...
	(pushdecl_outermost_localscope): ... this.
	(pushdecl_namespace_level): Adjust.
	* decl.c (cp_make_fname_decl): Use pushdecl_outermost_localscope.
	* lambda.c (insert_capture_proxy): Likewise.

Index: decl.c
===
--- decl.c	(revision 247751)
+++ decl.c	(working copy)
@@ -4335,9 +4335,6 @@ cp_make_fname_decl (location_t loc, tree
   if (name)
 free (CONST_CAST (char *, name));
 
-  /* As we're using pushdecl_with_scope, we must set the context.  */
-  DECL_CONTEXT (decl) = current_function_decl;
-
   TREE_STATIC (decl) = 1;
   TREE_READONLY (decl) = 1;
   DECL_ARTIFICIAL (decl) = 1;
@@ -4346,12 +4343,8 @@ cp_make_fname_decl (location_t loc, tree
 
   if (current_function_decl)
 {
-  cp_binding_level *b = current_binding_level;
-  if (b->kind == sk_function_parms)
-	return error_mark_node;
-  while (b->level_chain->kind != sk_function_parms)
-	b = b->level_chain;
-  pushdecl_with_scope (decl, b, /*is_friend=*/false);
+  DECL_CONTEXT (decl) = current_function_decl;
+  decl = pushdecl_outermost_localscope (decl);
   cp_finish_decl (decl, init, /*init_const_expr_p=*/false, NULL_TREE,
 		  LOOKUP_ONLYCONVERTING);
 }
Index: lambda.c
===
--- lambda.c	(revision 247751)
+++ lambda.c	(working copy)
@@ -295,24 +295,13 @@ is_normal_capture_proxy (tree decl)
 void
 insert_capture_proxy (tree var)
 {
-  cp_binding_level *b;
-  tree stmt_list;
-
   /* Put the capture proxy in the extra body block so that it won't clash
  with a later local variable.  */
-  b = current_binding_level;
-  for (;;)
-{
-  cp_binding_level *n = b->level_chain;
-  if (n->kind == sk_function_parms)
-	break;
-  b = n;
-}
-  pushdecl_with_scope (var, b, false);
+  pushdecl_outermost_localscope (var);
 
   /* And put a DECL_EXPR in the STATEMENT_LIST for the same block.  */
   var = build_stmt (DECL_SOURCE_LOCATION (var), DECL_EXPR, var);
-  stmt_list = (*stmt_list_stack)[1];
+  tree stmt_list = (*stmt_list_stack)[1];
   gcc_assert (stmt_list);
   append_to_statement_list_force (var, &stmt_list);
 }
Index: name-lookup.c
===
--- name-lookup.c	(revision 247751)
+++ name-lookup.c	(working copy)
@@ -2870,14 +2870,23 @@ pushdecl_with_scope_1 (tree x, cp_bindin
   return x;
 }
  
-/* Wrapper for pushdecl_with_scope_1.  */
+/* Inject X into the local scope just before the function parms.  */
 
 tree
-pushdecl_with_scope (tree x, cp_binding_level *level, bool is_friend)
+pushdecl_outermost_localscope (tree x)
 {
-  tree ret;
   bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
-  ret = pushdecl_with_scope_1 (x, level, is_friend);
+  cp_binding_level *b  = NULL, *n = current_binding_level;
+
+  if (n->kind == sk_function_parms)
+return error_mark_node;
+  do
+{
+  b = n;
+  n = b->level_chain;
+}
+  while (n->kind != sk_function_parms);
+  tree ret = pushdecl_with_scope_1 (x, b, false);
   timevar_cond_stop (TV_NAME_LOOKUP, subtime);
   return ret;
 }
@@ -4350,7 +4359,8 @@ pushdecl_namespace_level (tree x, bool i
   tree t;
 
   bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
-  t = pushdecl_with_scope (x, NAMESPACE_LEVEL (current_namespace), is_friend);
+  t = pushdecl_with_scope_1
+(x, NAMESPACE_LEVEL (current_namespace), is_friend);
 
   /* Now, the type_shadowed stack may screw us.  Munge it so it does
  what we want.  */
Index: name-lookup.h
===
--- name-lookup.h	(revision 247751)
+++ name-lookup.h	(working copy)
@@ -300,6 +300,7 @@ extern tree push_inner_scope (tree);
 extern void pop_inner_scope (tree, tree);
 extern void push_binding_level (cp_binding_level *);
 
+extern tree pushdecl_outermost_localscope (tree);
 extern bool push_namespace (tree);
 extern void pop_namespace (void);
 extern void push_nested_namespace (tree);
@@ -307,7 +308,6 @@ extern void pop_nested_namespace (tree);
 extern bool handle_namespace_attrs (tree, tree);
 extern void pushlevel_class (void);
 extern void poplevel_class (void);
-extern tree pushdecl_with_scope (tree, cp_binding_level *, bool);
 extern tree lookup_name_prefer_type (tree, int);
 extern tree lookup_name_real (tree, int, int, bool, int, int);
 extern tree lookup_type_scope (tree, tag_scope);

[C++ PATCH] function decl pushing

2017-05-08 Thread Nathan Sidwell


This patch implements a couple of changes.

1) set DECL_ANTICIPATED before pushing an anticipated builtin.  My 
overload management changes will keep the overload list ordered, so 
needs to know this up front.


2) pushdecl has some funky code to detect when we're trying to push a 
function decl into its own scope.  And push into its containing scope in 
that case.  But we only need that subterfuge because 
start_preparsed_function sets current_function_decl, before pushing the 
decl itself.  Reordering the order of events there will make pushdecl's 
weirdness unnecessary.


nathan
--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

	* decl.c (builtin_function_1): Set DCL_ANTICIPATED before pushing.
	(start_preparsed_function): Do decl pushing before setting
	current_funciton_decl and announcing it.

Index: decl.c
===
--- decl.c	(revision 247752)
+++ decl.c	(working copy)
@@ -4375,11 +4375,6 @@ builtin_function_1 (tree decl, tree cont
 
   DECL_CONTEXT (decl) = context;
 
-  if (is_global)
-pushdecl_top_level (decl);
-  else
-pushdecl (decl);
-
   /* A function in the user's namespace should have an explicit
  declaration before it is used.  Mark the built-in function as
  anticipated but not actually declared.  */
@@ -4397,6 +4392,11 @@ builtin_function_1 (tree decl, tree cont
 	DECL_ANTICIPATED (decl) = 1;
 }
 
+  if (is_global)
+pushdecl_top_level (decl);
+  else
+pushdecl (decl);
+
   return decl;
 }
 
@@ -14821,17 +14821,10 @@ start_preparsed_function (tree decl1, tr
   decl1 = newdecl1;
 }
 
-  /* We are now in the scope of the function being defined.  */
-  current_function_decl = decl1;
-
-  /* Save the parm names or decls from this function's declarator
- where store_parm_decls will find them.  */
-  current_function_parms = DECL_ARGUMENTS (decl1);
-
   /* Make sure the parameter and return types are reasonable.  When
  you declare a function, these types can be incomplete, but they
  must be complete when you define the function.  */
-  check_function_type (decl1, current_function_parms);
+  check_function_type (decl1, DECL_ARGUMENTS (decl1));
 
   /* Build the return declaration for the function.  */
   restype = TREE_TYPE (fntype);
@@ -14848,9 +14841,6 @@ start_preparsed_function (tree decl1, tr
   cp_apply_type_quals_to_decl (cp_type_quals (restype), resdecl);
 }
 
-  /* Let the user know we're compiling this function.  */
-  announce_function (decl1);
-
   /* Record the decl so that the function name is defined.
  If we already have a decl for this name, and it is a FUNCTION_DECL,
  use the old decl.  */
@@ -14922,9 +14912,16 @@ start_preparsed_function (tree decl1, tr
 	maybe_apply_pragma_weak (decl1);
 }
 
-  /* Reset this in case the call to pushdecl changed it.  */
+  /* We are now in the scope of the function being defined.  */
   current_function_decl = decl1;
 
+  /* Save the parm names or decls from this function's declarator
+ where store_parm_decls will find them.  */
+  current_function_parms = DECL_ARGUMENTS (decl1);
+
+  /* Let the user know we're compiling this function.  */
+  announce_function (decl1);
+
   gcc_assert (DECL_INITIAL (decl1));
 
   /* This function may already have been parsed, in which case just

Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch

2017-05-08 Thread Peter Bergner


On 05/08/2017 12:44 PM, Richard Biener wrote:


On Wed, 26 Apr 2017, Peter Bergner wrote:

One difference from the last patch is that I am no longer setting
default_label to NULL when we emit a decision tree.  I noticed that
the decision tree code seemed to generate slightly better code for
some of my unit tests if I left it alone.  This simplified the
patch somewhat by removing the changes to emit_case_nodes().

[snip]

Can you do the gimple_unreachable_bb_p check earlier in
expand_case so it covers the emit_case_decision_tree path as well
(and verify that works, of course)?  So basically right at

   /* Find the default case target label.  */
   default_label = jump_target_rtx
   (CASE_LABEL (gimple_switch_default_label (stmt)));
   edge default_edge = EDGE_SUCC (bb, 0);
   int default_prob = default_edge->probability;

handle this case.

That is what the previous patch did, but as I mention above,
we generate slightly better code for some test cases (other
tests seemed to generate the same code) if we don't attempt
to handle the decision tree case.  I'll note that the current
unpatched compiler already knows how to remove unreachable
case statement blocks when we expand to a decision tree.

I can add that code back if you think that it will have a
positive benefit for some test case I haven't tried yet.


Any comment on the above?

Peter

C++ PATCH for c++/80178 (ABI of non-copyable class)

2017-05-08 Thread Jason Merrill

Bug 80178 points out that we are doing bitwise copy when passing a
class with only deleted copy/move constructors; we should use
invisible reference instead.

The second patch updates the trunk to use -fabi-version=12 by default for GCC 8.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit f34d5751de4b94419ead04c5ab9be0dde4d21953
Author: Jason Merrill 
Date:   Mon May 8 14:24:03 2017 -0400

PR c++/80178 - parameter passing for uncopyable classes

* tree.c (type_has_nontrivial_copy_init): True for classes with only
deleted copy/move ctors.
(remember_deleted_copy, maybe_warn_parm_abi): New.
* decl.c (require_complete_types_for_parms, check_function_type):
Call maybe_warn_parm_abi.
* call.c (convert_for_arg_passing, build_cxx_call): Likewise.

diff --git a/gcc/common.opt b/gcc/common.opt
index a5c3aea..1330555 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -910,6 +910,10 @@ Driver Undocumented
 ; and introduces new inheriting constructor handling.
 ; Default in G++ 7.
 ;
+; 12: Corrects the calling convention for classes with only deleted copy/move
+; constructors.
+; Default in G++ 8.
+;
 ; Additional positive integers will be assigned as new versions of
 ; the ABI become the default version of the ABI.
 fabi-version=
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index f1e431c..d9accd1 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7352,17 +7352,21 @@ convert_for_arg_passing (tree type, tree val, 
tsubst_flags_t complain)
   && COMPLETE_TYPE_P (type)
   && tree_int_cst_lt (TYPE_SIZE (type), TYPE_SIZE (integer_type_node)))
 val = cp_perform_integral_promotions (val, complain);
-  if ((complain & tf_warning)
-  && warn_suggest_attribute_format)
+  if (complain & tf_warning)
 {
-  tree rhstype = TREE_TYPE (val);
-  const enum tree_code coder = TREE_CODE (rhstype);
-  const enum tree_code codel = TREE_CODE (type);
-  if ((codel == POINTER_TYPE || codel == REFERENCE_TYPE)
- && coder == codel
- && check_missing_format_attribute (type, rhstype))
-   warning (OPT_Wsuggest_attribute_format,
-"argument of function call might be a candidate for a format 
attribute");
+  if (warn_suggest_attribute_format)
+   {
+ tree rhstype = TREE_TYPE (val);
+ const enum tree_code coder = TREE_CODE (rhstype);
+ const enum tree_code codel = TREE_CODE (type);
+ if ((codel == POINTER_TYPE || codel == REFERENCE_TYPE)
+ && coder == codel
+ && check_missing_format_attribute (type, rhstype))
+   warning (OPT_Wsuggest_attribute_format,
+"argument of function call might be a candidate "
+"for a format attribute");
+   }
+  maybe_warn_parm_abi (type, EXPR_LOC_OR_LOC (val, input_location));
 }
   return val;
 }
@@ -8234,7 +8238,10 @@ build_cxx_call (tree fn, int nargs, tree *argarray,
return error_mark_node;
 
   if (MAYBE_CLASS_TYPE_P (TREE_TYPE (fn)))
-   fn = build_cplus_new (TREE_TYPE (fn), fn, complain);
+   {
+ fn = build_cplus_new (TREE_TYPE (fn), fn, complain);
+ maybe_warn_parm_abi (TREE_TYPE (fn), loc);
+   }
 }
   return convert_from_reference (fn);
 }
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index d2d48e7..b64fa6d 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6614,6 +6614,7 @@ extern bool type_has_unique_obj_representations 
(const_tree);
 extern bool scalarish_type_p   (const_tree);
 extern bool type_has_nontrivial_default_init   (const_tree);
 extern bool type_has_nontrivial_copy_init  (const_tree);
+extern void maybe_warn_parm_abi(tree, location_t);
 extern bool class_tmpl_impl_spec_p (const_tree);
 extern int zero_init_p (const_tree);
 extern bool check_abi_tag_redeclaration(const_tree, 
const_tree, const_tree);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 2b20bf9..f21058d 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12394,6 +12394,9 @@ require_complete_types_for_parms (tree parms)
{
  relayout_decl (parms);
  DECL_ARG_TYPE (parms) = type_passed_as (TREE_TYPE (parms));
+
+ maybe_warn_parm_abi (TREE_TYPE (parms),
+  DECL_SOURCE_LOCATION (parms));
}
   else
/* grokparms or complete_type_or_else will have already issued
@@ -14646,7 +14649,11 @@ check_function_type (tree decl, tree 
current_function_parms)
   TREE_TYPE (decl) = fntype;
 }
   else
-abstract_virtuals_error (decl, TREE_TYPE (fntype));
+{
+  abstract_virtuals_error (decl, TREE_TYPE (fntype));
+  maybe_warn_parm_abi (TREE_TYPE (fntype),
+  DECL_SOURCE_LOCATION (decl));
+}
 }
 
 /* True iff FN is an implicitly-defined default constructor.  */
diff --git a/gcc/cp/tr

[PATCH] decl lang hooks

2017-05-08 Thread Nathan Sidwell

cp/name-lookup has a twisty maze of forwarding functions.  One reason is 
that a couple of the names are used directly by c-common.c.  However, we 
now have lang hooks for these things.


This patch changes the C++ FE to override the pushdecl and getdecl lang 
hooks.  In addition to simply overriding them there, I had to fixup a 
couple of places in c-family/c-common.c and 
objc/objc-gnu-runtime-abi-01.c to use the pushdecl hook.


This patch doesn't reduce the twisty maze, but will enable me to do so 
shortly.


ok?

nathan
--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

	gcc/c/
	* c-tree.h (pushdecl): Declare.
	gcc/cp/
	* cp-lang.c (get_global_decls, cxx_pushdecl): New.
	(LANG_HOOKS_GETDECLS, LANG_HOOKS_PUSHDECL): Override.
	* name-lookup.h (pushdecl_top_level): Declare.
	gcc/c-family/
	* c-common.c (c_register_builtin_type): Use pushdecl lang_hook.
	* c-common.h (pushdecl_top_level, pushdecl): Don't declare here.
	gcc/objc/
	* objc-gnu-runtime-abi-01.c (objc_add_static_instance): Use
	pushdecl lang_hook.

Index: c/c-tree.h
===
--- c/c-tree.h	(revision 247751)
+++ c/c-tree.h	(working copy)
@@ -506,6 +506,7 @@ extern tree c_break_label;
 extern tree c_cont_label;
 
 extern bool global_bindings_p (void);
+extern tree pushdecl (tree);
 extern void push_scope (void);
 extern tree pop_scope (void);
 extern void c_bindings_start_stmt_expr (struct c_spot_bindings *);
Index: c-family/c-common.c
===
--- c-family/c-common.c	(revision 247751)
+++ c-family/c-common.c	(working copy)
@@ -2589,7 +2589,7 @@ c_register_builtin_type (tree type, cons
   DECL_ARTIFICIAL (decl) = 1;
   if (!TYPE_NAME (type))
 TYPE_NAME (type) = decl;
-  pushdecl (decl);
+  lang_hooks.decls.pushdecl (decl);
 
   registered_builtin_types = tree_cons (0, type, registered_builtin_types);
 }
Index: c-family/c-common.h
===
--- c-family/c-common.h	(revision 247751)
+++ c-family/c-common.h	(working copy)
@@ -588,8 +588,7 @@ extern tree push_stmt_list (void);
 extern tree pop_stmt_list (tree);
 extern tree add_stmt (tree);
 extern void push_cleanup (tree, tree, bool);
-extern tree pushdecl_top_level (tree);
-extern tree pushdecl (tree);
+
 extern tree build_modify_expr (location_t, tree, tree, enum tree_code,
 			   location_t, tree, tree);
 extern tree build_array_notation_expr (location_t, tree, tree, enum tree_code,
Index: cp/cp-lang.c
===
--- cp/cp-lang.c	(revision 247751)
+++ cp/cp-lang.c	(working copy)
@@ -35,6 +35,8 @@ static tree cp_eh_personality (void);
 static tree get_template_innermost_arguments_folded (const_tree);
 static tree get_template_argument_pack_elems_folded (const_tree);
 static tree cxx_enum_underlying_base_type (const_tree);
+static tree get_global_decls ();
+static tree cxx_pushdecl (tree);
 
 /* Lang hooks common to C++ and ObjC++ are declared in cp/cp-objcp-common.h;
consequently, there should be very few hooks below.  */
@@ -78,6 +80,10 @@ static tree cxx_enum_underlying_base_typ
 #define LANG_HOOKS_EH_RUNTIME_TYPE build_eh_type_type
 #undef LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE
 #define LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE cxx_enum_underlying_base_type
+#undef LANG_HOOKS_GETDECLS
+#define LANG_HOOKS_GETDECLS get_global_decls
+#undef LANG_HOOKS_PUSHDECL
+#define LANG_HOOKS_PUSHDECL cxx_pushdecl
 
 /* Each front end provides its own lang hook initializer.  */
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
@@ -229,5 +235,21 @@ tree cxx_enum_underlying_base_type (cons
   return underlying_type;
 }
 
+/* Return the list of decls in the global namespace.  */
+
+static
+tree get_global_decls ()
+{
+  return NAMESPACE_LEVEL (global_namespace)->names;
+}
+
+/* Push DECL into the current scope.  */
+
+static
+tree cxx_pushdecl (tree decl)
+{
+  return pushdecl (decl);
+}
+
 #include "gt-cp-cp-lang.h"
 #include "gtype-cp.h"
Index: cp/name-lookup.h
===
--- cp/name-lookup.h	(revision 247752)
+++ cp/name-lookup.h	(working copy)
@@ -342,4 +342,6 @@ extern tree innermost_non_namespace_valu
 extern cxx_binding *outer_binding (tree, cxx_binding *, bool);
 extern void cp_emit_debug_info_for_using (tree, tree);
 
+extern tree pushdecl_top_level			(tree);
+
 #endif /* GCC_CP_NAME_LOOKUP_H */
Index: objc/objc-gnu-runtime-abi-01.c
===
--- objc/objc-gnu-runtime-abi-01.c	(revision 247751)
+++ objc/objc-gnu-runtime-abi-01.c	(working copy)
@@ -888,7 +888,7 @@ objc_add_static_instance (tree construct
   /* We may be writing something else just now.
  Postpone till end of input. */
   DECL_DEFER_OUTPUT (decl) = 1;
-  pushdecl_top_level (decl);
+  lang_hooks.decls.pushdecl (decl);
   rest_of_decl_compilation (decl, 1, 0);
 
   /* Add the DECL to the

Re: Use "oacc kernels" attribute for OpenACC kernels

2017-05-08 Thread Thomas Schwinge

Hi!

On Thu, 4 Aug 2016 16:07:00 +0200, I wrote:
> Ping.
> 
> On Wed, 27 Jul 2016 12:06:59 +0200, I wrote:
> > On Mon, 25 Jan 2016 16:09:14 +0100, Jakub Jelinek  wrote:
> > > On Mon, Jan 25, 2016 at 10:06:50AM -0500, Nathan Sidwell wrote:
> > > > On 01/04/16 10:39, Nathan Sidwell wrote:
> > > > >There's currently no robust predicate to determine whether an oacc 
> > > > >offload
> > > > >function is for a kernels region (as opposed to a parallel region).
> > > > >[...]
> > > > >
> > > > >This patch marks TREE_PUBLIC on the offload attribute values, to note 
> > > > >kernels
> > > > >regions,  and adds a predicate to check that.  [...]
> > > > >
> > > > >Using these predicates improves the dump output of the openacc device 
> > > > >lowering
> > > > >pass too.
> > 
> > I just submitted a patch adding "Test cases to check OpenACC offloaded
> > function's attributes and classification",

(Pinged in
<877f1r1duw.fsf@hertz.schwinge.homeip.net">http://mid.mail-archive.com/877f1r1duw.fsf@hertz.schwinge.homeip.net>.)

> > to actually check the dump output of "oaccdevlow" -- it works.  ;-)
> > 
> > > > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00092.html
> > > > ping?
> > > 
> > > Ok, thanks.
> > 
> > It's conceptually and code-wise simpler to just use a "oacc kernels"
> > attribute for that.  (And, that will make another patch I'm working on
> > less convoluted.)
> > 
> > I'm open to suggestions if there is a better place to set the "oacc
> > kernels" attribute -- I put it into expand_omp_target, where another
> > special thing for GF_OMP_TARGET_KIND_OACC_KERNELS is already being done,
> > and before "rewriting" GF_OMP_TARGET_KIND_OACC_KERNELS (and
> > GF_OMP_TARGET_KIND_OACC_PARALLEL) into BUILT_IN_GOACC_PARALLEL.  My
> > reasoning for not setting the attribute earlier (like, in the front
> > ends), is that at that point in/before expand_omp_target, we still have
> > the distrinction between OACC_PARALLEL/OACC_KERNELS (tree codes), and
> > later GF_OMP_TARGET_KIND_OACC_PARALLEL/GF_OMP_TARGET_KIND_OACC_KERNELS
> > (GIMPLE_OMP_TARGET subcodes).  Another question/possibly cleanup of
> > course might be to actually do set the "oacc kernels" attribute in the
> > front end and merge OACC_KERNELS into OACC_PARALLEL, and
> > GF_OMP_TARGET_KIND_OACC_KERNELS into GF_OMP_TARGET_KIND_OACC_PARALLEL?
> > 
> > But anyway, as a first step: OK for trunk?

commit fac5c3214f58812881635d3fb1e1751446d4b660
Author: Thomas Schwinge 
Date:   Mon May 8 21:24:46 2017 +0200

Use "oacc kernels" attribute for OpenACC kernels

gcc/
* omp-expand.c (expand_omp_target)
: Set "oacc kernels" attribute.
* omp-general.c (oacc_set_fn_attrib): Remove is_kernel formal
parameter.  Adjust all users.
(oacc_fn_attrib_kernels_p): Remove function.
(execute_oacc_device_lower): Look for "oacc kernels" attribute
instead of calling oacc_fn_attrib_kernels_p.
* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
* tree-parloops.c (create_parallel_loop): If oacc_kernels_p,
assert "oacc kernels" attribute is set.
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
---
 gcc/omp-expand.c   | 16 +-
 gcc/omp-general.c  | 18 ++--
 gcc/omp-general.h  |  4 +---
 gcc/omp-offload.c  | 25 +++---
 .../goacc/classify-kernels-unparallelized.c|  8 +++
 .../c-c++-common/goacc/classify-kernels.c  |  8 +++
 .../c-c++-common/goacc/classify-parallel.c |  2 +-
 .../c-c++-common/goacc/classify-routine.c  |  2 +-
 .../goacc/classify-kernels-unparallelized.f95  |  8 +++
 .../gfortran.dg/goacc/classify-kernels.f95 |  8 +++
 .../gfortran.dg/goacc/classify-parallel.f95|  2 +-
 .../gfortran.dg/goacc/classify-routine.f95 |  2 +-
 gcc/tree-parloops.c|  5 -
 gcc/tree-ssa-loop.c|  5 +
 14 files changed, 52 insertions(+), 61 deletions(-)

diff --git gcc/omp-expand.c gcc/omp-expand.c
index 5c48b78..405c60e 100644
--- gcc/omp-expand.c
+++ gcc/omp-expand.c
@@ -7083,7 +7083,16 @@ expand_omp_target (struct omp_region *region)
   exit_bb = region->exit;
 
   if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
-mark_loops_in_oac

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Thomas Koenig


Am 08.05.2017 um 18:58 schrieb Jerry DeLisle:

he attached patch reduces the stack usage by the blocked

version of matmul for cases where we don't need the full buffer.
This should improve stack usage.

OK for trunk?



OK, thanks.


Is this something we should consider for backporting to gcc-7?
I think the large block size caused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79876 .

That one was fixed by adjusting the ridiculously low
stack size for multi-threaded applications on OSX, but
the underlying problem could still bite somewhere or
somebody else.

Opinions?

Regards

Thomas

Re: [PATCH 1/7] enhance -Wformat to detect quoting problems (PR 80280 et al.)

2017-05-08 Thread Martin Sebor


On 05/05/2017 11:49 AM, Joseph Myers wrote:

On Thu, 4 May 2017, Martin Sebor wrote:


I like the flags2 idea.  I split up the initialization array to also
detect quoted %K, and unquoted %R and %r.  With that I ran into test
failures that took me a bit to debug.  It turns out that there's code
(a nasty hack, really) that makes assumptions about some of
the conversion specifiers.  I dealt with the failures by simplifying
the initialization code and removing the hack.


This patch is OK.


Better testing exposed a couple of issues with the patch:

1) The init_dynamic_diag_info function depends on the order in which
   the identifiers it looks (e.g. tree_node) for are declared in
   the translation unit being compiled and in some instances wound
   up initializing the local_tree_type_node with void_type_node even
   though tree would subsequently be found to be declared.

2) The warning on %R/%r when used outside a quoted sequence was
   overfly restrictive.  There is code in cp/error.c that relies
   on that when printing the inlining context of a function.

I fixed both of these problems in the attached revision.  I've
also been running a script to build the targets in config-list.mk.
So far it's built just over 40 of the 199 targets and found one
quoting problem in a build of sparcv9-solaris2.11.  The build
fails due to an unrelated error (I opened bug 80673 for it), so
I can't say I've found anything that should block the patch from
checking in.

I'll keep running the script but unless you or someone else would
prefer I hold off until they all finish I'll go ahead and commit
the attached patch, without waiting for the build of the remaining
targets to complete, probably sometime in the second half of
the week.

Martin
PR translation/80280 - Missing closing quote (%>) c/semantics.c and c/c-typeck.c

gcc/c-family/ChangeLog:

	PR translation/80280
	* c-format.h (struct format_flag_spec): Add new member.
	(T89_T): New macro.
	* c-format.c (local_tree_type_node): New global.
	(printf_flag_specs, asm_fprintf_flag_spec): Initialize new data.
	(gcc_diag_flag_specs, scanf_flag_specs, strftime_flag_specs): Ditto.
	(strfmon_flag_specs): Likewise.
	(gcc_diag_char_table, gcc_cdiag_char_table): Split up specifiers
	with distinct quoting properties.
	(gcc_tdiag_char_table, gcc_cxxdiag_char_table): Same.
	(flag_chars_t::validate): Add argument and handle bad quoting.
	(check_format_info_main): Handle quoting problems.
	(init_dynamic_diag_info): Simplify.

gcc/testsuite/ChangeLog:

	PR translation/80280
	* gcc.dg/format/gcc_diag-10.c: New test.

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 400eb66..2dba062 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -53,6 +53,9 @@ struct function_format_info
   unsigned HOST_WIDE_INT first_arg_num;	/* number of first arg (zero for varargs) */
 };
 
+/* Initialized in init_dynamic_diag_info.  */
+static tree local_tree_type_node;
+
 static bool decode_format_attr (tree, function_format_info *, int);
 static int decode_format_type (const char *);
 
@@ -492,17 +495,17 @@ static const format_length_info gcc_gfc_length_specs[] =
 
 static const format_flag_spec printf_flag_specs[] =
 {
-  { ' ',  0, 0, N_("' ' flag"),N_("the ' ' printf flag"),  STD_C89 },
-  { '+',  0, 0, N_("'+' flag"),N_("the '+' printf flag"),  STD_C89 },
-  { '#',  0, 0, N_("'#' flag"),N_("the '#' printf flag"),  STD_C89 },
-  { '0',  0, 0, N_("'0' flag"),N_("the '0' printf flag"),  STD_C89 },
-  { '-',  0, 0, N_("'-' flag"),N_("the '-' printf flag"),  STD_C89 },
-  { '\'', 0, 0, N_("''' flag"),N_("the ''' printf flag"),  STD_EXT },
-  { 'I',  0, 0, N_("'I' flag"),N_("the 'I' printf flag"),  STD_EXT },
-  { 'w',  0, 0, N_("field width"), N_("field width in printf format"), STD_C89 },
-  { 'p',  0, 0, N_("precision"),   N_("precision in printf format"),   STD_C89 },
-  { 'L',  0, 0, N_("length modifier"), N_("length modifier in printf format"), STD_C89 },
-  { 0, 0, 0, NULL, NULL, STD_C89 }
+  { ' ',  0, 0, 0, N_("' ' flag"),N_("the ' ' printf flag"),  STD_C89 },
+  { '+',  0, 0, 0, N_("'+' flag"),N_("the '+' printf flag"),  STD_C89 },
+  { '#',  0, 0, 0, N_("'#' flag"),N_("the '#' printf flag"),  STD_C89 },
+  { '0',  0, 0, 0, N_("'0' flag"),N_("the '0' printf flag"),  STD_C89 },
+  { '-',  0, 0, 0, N_("'-' flag"),N_("the '-' printf flag"),  STD_C89 },
+  { '\'', 0, 0, 0, N_("''' flag"),N_("the ''' printf flag"),  STD_EXT },
+  { 'I',  0, 0, 0, N_("'I' flag"),N_("the 'I' printf flag"),  STD_EXT },
+  { 'w',  0, 0, 0, N_("field width"), N_("field width in printf format"), STD_C89 },
+  { 'p',  0, 0, 0, N_("precision"),   N_("precision in printf format"),   STD_C89 },

Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-08 Thread Daniel Santos


On 05/06/2017 03:22 PM, Daniel Santos wrote:


gcc-5.4.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=native -O2 -g": 74
gcc-7.1.0 CFLAGS="-march=nocona -mtune=generic -O2 -g": 79
gcc-7.1.0 CFLAGS="-march=native -O2 -g -mcall-ms2sysv-xlogues" 
(patched): 31


I'm building out a clean test environment on another machine to try to 
rule out clutter issues (and video driver issues) on my workstation.


Daniel



I've re-run Wine's tests with a new clean VM environment and some 
changes to include more tests and similar results:


Compiler Failures
gcc-4.9.4:   39
gcc-7.1.0:   78
gcc-7.1.0-patched (with -mcall-ms2sysv-xlogues): 40


The first error not present in the gcc-4.9.4 tests that I examined 
looked like a run-of-the-mill race condition in Wine that just happened 
to not crash when built with 4.9.4.  So I'm going to guess that the 
disappearance of these failures with -mcall-ms2sysv-xlogues is just 
incidental.  I think we're in good condition with this patch set.


Daniel

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-08 Thread Jeff Law


On 05/08/2017 09:54 AM, Jeff Law wrote:
So I'm not sure if changing VRP with your patches is a good thing when 

you
could have used the new API in the first place ...
I don't see that the changes to date around 78496 change things 
significantly in regards to the immediate plans to remove ASSERT_EXPR. 
The work around 78496 raises the bar in terms of what information we 
need to be able to extract, but that IMHO, is a fine thing to do ;-)
As expected, it's pretty easy to change to the newer way of doing 
things.  ie, extracting from the GIMPLE_COND rather than the 
ASSERT_EXPR.   It really isn't a big deal.






However, the existence of register_edge_assert_for does change how I'm 
looking at the next issue for 78496 as well as how to tackle a host of 
related issues.  It may end up being the case that we stop 78496 work 
after patch #2, work on ASSERT_EXPR removal, then re-eval 78496.
And as expected the unwindable VRs do help significantly in the 3rd hunk 
of the work for 78496.  Probably the biggest issue here is how to 
compose the bits to avoid code duplication between EVRP and the VRP jump 
threading.


Jeff

Re: [PATCH, v3] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch

2017-05-08 Thread Peter Bergner

On 05/08/2017 01:20 PM, Peter Bergner wrote:
> That is what the previous patch did, but as I mention above,
> we generate slightly better code for some test cases (other
> tests seemed to generate the same code) if we don't attempt
> to handle the decision tree case.  I'll note that the current
> unpatched compiler already knows how to remove unreachable
> case statement blocks when we expand to a decision tree.

I should be more careful with my description here.  The patch does
affect both unreachable case statements for both decision trees as
well as jump tables, and that leads to improved code for both
decision trees as well as jump tables.

What I meant to say above, is that the current handling of unreachable
default case statements in the unpatched compiler seems to lead to
slightly better code for some test cases than attempting to handle
default_label == NULL in the decision tree code.  It was for that
reason, I placed the code in expand_case() only in the jump table
path.  The fact it made the patch smaller was a bonus, since I didn't
need to protect emit_case_nodes() from a NULL default_label.

As I said, if you think it will help some test case I haven't tried yet,
I can add that support back.

Peter

[committed] correct %D quoting in gcc/config/sol2-c.c

2017-05-08 Thread Martin Sebor


In preparation for committing the meat of the patch for bug 80280
I've committed the following in r247758 as an obvious fix to prevent
the new -Wformat warning from breaking Solaris bootstrap.  Bootstrap
is currently broken due to 80673 (that looks like a separate issue
from the warning).

Martin

Index: gcc/config/sol2-c.c
===
--- gcc/config/sol2-c.c (revision 247757)
+++ gcc/config/sol2-c.c (working copy)
@@ -113,7 +113,7 @@ solaris_pragma_align (cpp_reader *pfile ATTRIBUTE_
   tree decl = identifier_global_value (t);
   if (decl && DECL_P (decl))
warning (0, "%<#pragma align%> must appear before the declaration of "
-"%D, ignoring", decl);
+"%qD, ignoring", decl);
   else
solaris_pending_aligns = tree_cons (t, build_tree_list (NULL, x),
solaris_pending_aligns);

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Jerry DeLisle

On 05/08/2017 12:29 PM, Thomas Koenig wrote:
> Am 08.05.2017 um 18:58 schrieb Jerry DeLisle:
> 
> he attached patch reduces the stack usage by the blocked
>>> version of matmul for cases where we don't need the full buffer.
>>> This should improve stack usage.
>>>
>>> OK for trunk?
>>>
>>
>> OK, thanks.
> 
> Is this something we should consider for backporting to gcc-7?
> I think the large block size caused
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79876 .
> 
> That one was fixed by adjusting the ridiculously low
> stack size for multi-threaded applications on OSX, but
> the underlying problem could still bite somewhere or
> somebody else.
> 
> Opinions?
> 

I think it should be back ported since we really changed the matmul and this was
a fallout and using a to much stack might end up being a regression.

OK to back port.

Jerry

[Patch, Fortran, OOP] PR 79311: ICE in generate_finalization_wrapper, at fortran/class.c:1992

2017-05-08 Thread Janus Weil

Hi all,

the attached patch fixes an ICE-on-valid problem with finalization by
making sure that the finalization procedures are properly resolved.

In the test case, the finalizer of the component type was not being
resolved if the superordinate type had a finalizer itself.

The patch also fixes a small error that had no actual impact on the
test case ('has_final2' could never become true).

Regtesting went well, except for these three failure which also seems
to occur on a clean trunk currently:

FAIL: gfortran.dg/coarray_lock_7.f90   -O   scan-tree-dump-times original
FAIL: gfortran.dg/coarray_lock_7.f90   -O   scan-tree-dump-times original
FAIL: gfortran.dg/mvbits_7.f90   -O0   (test for warnings, line 28)

Ok for trunk?

Cheers,
Janus



2017-05-08  Janus Weil  

PR fortran/79311
* resolve.c (gfc_resolve_finalizers): Ensure that derived-type
components have a their finalizers resolved, also if the superordinate
type itself has a finalizer.

2017-05-08  Janus Weil  

PR fortran/79311
* gfortran.dg/finalize_32.f90: New test.
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c   (revision 247757)
+++ gcc/fortran/resolve.c   (working copy)
@@ -12385,26 +12385,23 @@ gfc_resolve_finalizers (gfc_symbol* derived, bool
   if (parent)
 gfc_resolve_finalizers (parent, finalizable);
 
-  /* Return early when not finalizable. Additionally, ensure that derived-type
- components have a their finalizables resolved.  */
-  if (!derived->f2k_derived || !derived->f2k_derived->finalizers)
+  /* Ensure that derived-type components have a their finalizers resolved.  */
+  bool has_final = derived->f2k_derived && derived->f2k_derived->finalizers;
+  for (c = derived->components; c; c = c->next)
+if (c->ts.type == BT_DERIVED
+   && !c->attr.pointer && !c->attr.proc_pointer && !c->attr.allocatable)
+  {
+   bool has_final2 = false;
+   if (!gfc_resolve_finalizers (c->ts.u.derived, &has_final2))
+ return false;  /* Error.  */
+   has_final = has_final || has_final2;
+  }
+  /* Return early if not finalizable.  */
+  if (!has_final)
 {
-  bool has_final = false;
-  for (c = derived->components; c; c = c->next)
-   if (c->ts.type == BT_DERIVED
-   && !c->attr.pointer && !c->attr.proc_pointer && 
!c->attr.allocatable)
- {
-   bool has_final2 = false;
-   if (!gfc_resolve_finalizers (c->ts.u.derived, &has_final))
- return false;  /* Error.  */
-   has_final = has_final || has_final2;
- }
-  if (!has_final)
-   {
- if (finalizable)
-   *finalizable = false;
- return true;
-   }
+  if (finalizable)
+   *finalizable = false;
+  return true;
 }
 
   /* Walk over the list of finalizer-procedures, check them, and if any one
! { dg-do compile }
!
! PR 79311: [OOP] ICE in generate_finalization_wrapper, at fortran/class.c:1992
!
! Contributed by DIL 

module tensor_recursive
  implicit none

  type :: tens_signature_t
  contains
final :: tens_signature_dtor
  end type

  type :: tens_header_t
type(tens_signature_t) :: signature
  contains
final :: tens_header_dtor
  end type

contains

  subroutine tens_signature_dtor(this)
type(tens_signature_t) :: this
  end subroutine

  subroutine tens_header_dtor(this)
type(tens_header_t) :: this
  end subroutine

end

Re: [PATCH] decl lang hooks

2017-05-08 Thread Joseph Myers

On Mon, 8 May 2017, Nathan Sidwell wrote:

> cp/name-lookup has a twisty maze of forwarding functions.  One reason is that
> a couple of the names are used directly by c-common.c.  However, we now have
> lang hooks for these things.
> 
> This patch changes the C++ FE to override the pushdecl and getdecl lang hooks.
> In addition to simply overriding them there, I had to fixup a couple of places
> in c-family/c-common.c and objc/objc-gnu-runtime-abi-01.c to use the pushdecl
> hook.

The c/ and c-family/ changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] -fdump option docs

2017-05-08 Thread Nathan Sidwell

I've committed this to sort the -fdump options.  I left the generic 
-fdump-noaddr, -fdump-unnumbered and -fdump-unnumbered-links at the start.


nathan
--
Nathan Sidwell
2017-05-08  Nathan Sidwell  

* doc/invoke.texi: Alphabetize -fdump options.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 247758)
+++ doc/invoke.texi (working copy)
@@ -540,13 +540,13 @@ Objective-C and Objective-C++ Dialects}.
 -fdisable-tree-@var{pass_name} @gol
 -fdisable-tree-@var{pass-name}=@var{range-list} @gol
 -fdump-noaddr  -fdump-unnumbered  -fdump-unnumbered-links @gol
--fdump-translation-unit@r{[}-@var{n}@r{]} @gol
 -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol
+-fdump-final-insns@r{[}=@var{file}@r{]}
 -fdump-ipa-all  -fdump-ipa-cgraph  -fdump-ipa-inline @gol
 -fdump-passes @gol
 -fdump-rtl-@var{pass}  -fdump-rtl-@var{pass}=@var{filename} @gol
 -fdump-statistics @gol
--fdump-final-insns@r{[}=@var{file}@r{]}
+-fdump-translation-unit@r{[}-@var{n}@r{]} @gol
 -fdump-tree-all @gol
 -fdump-tree-@var{switch} @gol
 -fdump-tree-@var{switch}-@var{options} @gol
@@ -12938,16 +12938,6 @@ When doing debugging dumps (see @option{
 instruction numbers for the links to the previous and next instructions
 in a sequence.
 
-@item -fdump-translation-unit @r{(C++ only)}
-@itemx -fdump-translation-unit-@var{options} @r{(C++ only)}
-@opindex fdump-translation-unit
-Dump a representation of the tree structure for the entire translation
-unit to a file.  The file name is made by appending @file{.tu} to the
-source file name, and the file is created in the same directory as the
-output file.  If the @samp{-@var{options}} form is used, @var{options}
-controls the details of the dump as described for the
-@option{-fdump-tree} options.
-
 @item -fdump-class-hierarchy @r{(C++ only)}
 @itemx -fdump-class-hierarchy-@var{options} @r{(C++ only)}
 @opindex fdump-class-hierarchy
@@ -12995,6 +12985,16 @@ whole compilation unit while @samp{-deta
 the passes generate them.  The default with no option is to sum
 counters for each function compiled.
 
+@item -fdump-translation-unit @r{(C++ only)}
+@itemx -fdump-translation-unit-@var{options} @r{(C++ only)}
+@opindex fdump-translation-unit
+Dump a representation of the tree structure for the entire translation
+unit to a file.  The file name is made by appending @file{.tu} to the
+source file name, and the file is created in the same directory as the
+output file.  If the @samp{-@var{options}} form is used, @var{options}
+controls the details of the dump as described for the
+@option{-fdump-tree} options.
+
 @item -fdump-tree-all
 @itemx -fdump-tree-@var{switch}
 @itemx -fdump-tree-@var{switch}-@var{options}

[PATCH] xtensa: add support for SSP

2017-05-08 Thread Max Filippov

gcc/
2017-03-27  Max Filippov  

* config/xtensa/xtensa-protos.h
(xtensa_initial_elimination_offset): New declaration.
* config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
New function. Move its body from the INITIAL_ELIMINATION_OFFSET
macro definition, add case for FRAME_POINTER_REGNUM when
FRAME_GROWS_DOWNWARD.
* config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro
definition.
(INITIAL_ELIMINATION_OFFSET): Replace body with call to
xtensa_initial_elimination_offset.
---
 gcc/config/xtensa/xtensa-protos.h |  1 +
 gcc/config/xtensa/xtensa.c| 24 
 gcc/config/xtensa/xtensa.h| 19 ---
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/gcc/config/xtensa/xtensa-protos.h 
b/gcc/config/xtensa/xtensa-protos.h
index 873557f..ef9417c 100644
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -73,5 +73,6 @@ extern void xtensa_expand_prologue (void);
 extern void xtensa_expand_epilogue (void);
 extern void order_regs_for_local_alloc (void);
 extern enum reg_class xtensa_regno_to_class (int regno);
+extern HOST_WIDE_INT xtensa_initial_elimination_offset (int from, int to);
 
 #endif /* !__XTENSA_PROTOS_H__ */
diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index 25ed7db..2578716 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -2676,6 +2676,30 @@ xtensa_frame_pointer_required (void)
   return false;
 }
 
+HOST_WIDE_INT
+xtensa_initial_elimination_offset (int from, int to)
+{
+  long frame_size = compute_frame_size (get_frame_size ());
+  HOST_WIDE_INT offset;
+
+  switch (from)
+{
+case FRAME_POINTER_REGNUM:
+  if (FRAME_GROWS_DOWNWARD)
+   offset = frame_size - (WINDOW_SIZE * UNITS_PER_WORD)
+ - cfun->machine->callee_save_size;
+  else
+   offset = 0;
+  break;
+case ARG_POINTER_REGNUM:
+  offset = frame_size;
+  break;
+default:
+  gcc_unreachable ();
+}
+
+  return offset;
+}
 
 /* minimum frame = reg save area (4 words) plus static chain (1 word)
and the total number of words must be a multiple of 128 bits.  */
diff --git a/gcc/config/xtensa/xtensa.h b/gcc/config/xtensa/xtensa.h
index 58eb1b2..68ee96a 100644
--- a/gcc/config/xtensa/xtensa.h
+++ b/gcc/config/xtensa/xtensa.h
@@ -460,9 +460,11 @@ enum reg_class
 
 #define STACK_GROWS_DOWNWARD 1
 
+#define FRAME_GROWS_DOWNWARD flag_stack_protect
+
 /* Offset within stack frame to start allocating local variables at.  */
 #define STARTING_FRAME_OFFSET  \
-  crtl->outgoing_args_size
+  (FRAME_GROWS_DOWNWARD ? 0 : crtl->outgoing_args_size)
 
 /* The ARG_POINTER and FRAME_POINTER are not real Xtensa registers, so
they are eliminated to either the stack pointer or hard frame pointer.  */
@@ -474,20 +476,7 @@ enum reg_class
 
 /* Specify the initial difference between the specified pair of registers.  */
 #define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET)   \
-  do { \
-long frame_size = compute_frame_size (get_frame_size ());  \
-switch (FROM)  \
-  {
\
-  case FRAME_POINTER_REGNUM:   \
-(OFFSET) = 0;  \
-   break;  \
-  case ARG_POINTER_REGNUM: \
-(OFFSET) = frame_size; \
-   break;  \
-  default: \
-   gcc_unreachable (); \
-  }
\
-  } while (0)
+  (OFFSET) = xtensa_initial_elimination_offset ((FROM), (TO))
 
 /* If defined, the maximum amount of space required for outgoing
arguments will be computed and placed into the variable
-- 
2.1.4

Re: [PATCH] xtensa: add support for SSP

2017-05-08 Thread augustine.sterl...@gmail.com

On Mon, May 8, 2017 at 3:55 PM, Max Filippov  wrote:
> gcc/
> 2017-03-27  Max Filippov  
>
> * config/xtensa/xtensa-protos.h
> (xtensa_initial_elimination_offset): New declaration.
> * config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
> New function. Move its body from the INITIAL_ELIMINATION_OFFSET
> macro definition, add case for FRAME_POINTER_REGNUM when
> FRAME_GROWS_DOWNWARD.
> * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro
> definition.
> (INITIAL_ELIMINATION_OFFSET): Replace body with call to
> xtensa_initial_elimination_offset.

Approved.

Re: [PATCH] xtensa: add support for SSP

2017-05-08 Thread Max Filippov

On Mon, May 8, 2017 at 4:24 PM, augustine.sterl...@gmail.com
 wrote:
> On Mon, May 8, 2017 at 3:55 PM, Max Filippov  wrote:
>> gcc/
>> 2017-03-27  Max Filippov  
>>
>> * config/xtensa/xtensa-protos.h
>> (xtensa_initial_elimination_offset): New declaration.
>> * config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
>> New function. Move its body from the INITIAL_ELIMINATION_OFFSET
>> macro definition, add case for FRAME_POINTER_REGNUM when
>> FRAME_GROWS_DOWNWARD.
>> * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): New macro
>> definition.
>> (INITIAL_ELIMINATION_OFFSET): Replace body with call to
>> xtensa_initial_elimination_offset.
>
> Approved.

Applied to trunk. Thank you!

-- Max

Re: [PATCH] handle enumerated types in -Wformat-overflow (PR 80397)

2017-05-08 Thread Martin Sebor


On 04/28/2017 12:35 PM, Jeff Law wrote:

On 04/26/2017 11:05 AM, Martin Sebor wrote:

On 04/24/2017 03:35 PM, Jeff Law wrote:

On 04/11/2017 12:57 PM, Martin Sebor wrote:

In a review of my fix for bug 80364 Jakub pointed out that to
determine whether an argument to an integer directive is of
an integer type the gimple-ssa-sprintf pass tests the type code
for equality to INTEGER_TYPE when it should instead be using
INTEGRAL_TYPE_P().  This has the effect of the pass being unable
to use the available range of arguments of enumerated types,
resulting in both false positives and false negatives, and in
some cases, in emitting suboptimal code.

The attached patch replaces those tests with INTEGRAL_TYPE_P().

Since this is not a regression I submit it for GCC 8.

You might consider using POINTER_TYPE_P as well.


You mean rather than (TREE_CODE (type) == POINTER_TYPE)?  Those
I believe are vestiges of the %p handling that was removed sometime
last year, and (unless you are recommending I remove them as part
of this patch) should probably be removed during the next cleanup.Yes,
that can be a follow-up cleanup.


For the future, if you find yourself writing something like
TREE_CODE (TREE_TYPE (x)) == POINTER_TYPE, you're usually going to be
better off using POINTER_TYPE_P (TREE_TYPE (x)).  That allows the code
to work with C++ references as well as C pointers.


I'll keep it in mind, thanks.  Should I take this as approval
of the patch as is or are there some changes you'd like me to
make?

Martin

[PATCH][x86] Add missing intrinsics for vrcp14sd/ss instructions.

2017-05-08 Thread Koval, Julia

Hi,

This patch adds missing intrinsics for VRCP14SD and VRCP14SS instructions:
_mm_mask_rcp14_sd
_mm_maskz_rcp14_sd
_mm_mask_rcp14_ss
_mm_maskz_rcp14_ss

These instructions and intrinsics are described in SDM Vol. 2C 5-487.

gcc/
* config/i386/avx512fintrin.h (_mm_mask_rcp14_sd,
_mm_maskz_rcp14_sd, _mm_mask_rcp14_ss,
_mm_maskz_rcp14_ss): New intrinsics.
* config/i386/i386-builtin.def (__builtin_ia32_rcp14sd_mask,
__builtin_ia32_rcp14ss_mask): New builtins.
* config/i386/sse.md (srcp14_mask): New pattern.

gcc/testsuite/
* gcc.target/i386/avx512f-vrcp14sd-1.c: Test new intrinsics.
* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
* gcc.target/i386/avx512f-vrcp14ss-1.c: Ditto.
* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

Thanks,
Julia


0001-vrcp14sd.patch
Description: 0001-vrcp14sd.patch

87 matches

Mail list logo