[PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))

2016-09-16 Thread Thomas Schwinge
Hi!

(CCing Bernd and Jakub -- for your information, or: "amusement" -- as
you've discussed offloading preload_common_nodes issues before...)

Got to look into this some more, yesterday:

On Thu, 08 Sep 2016 13:43:30 +0200, I wrote:
> On Wed, 7 Sep 2016 14:23:18 +0200, Richard Biener 
>  wrote:
> > On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge  
> > wrote:
> > > On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers 
> > >  wrote:
> > >> On Fri, 19 Aug 2016, Richard Biener wrote:
> > >> > Can you quickly verify if LTO works with the new types?  I don't see 
> > >> > anything
> > >> > that would prevent it but having new global trees and backends 
> > >> > initializing them
> > >> > might come up with surprises (see tree-streamer.c:preload_common_nodes)
> > >>
> > >> Well, the execution tests are in gcc.dg/torture, which is run with 
> > >> various
> > >> options including -flto (and I've checked the testsuite logs to confirm
> > >> these tests are indeed run with such options).  Is there something else
> > >> you think should be tested?
> > >
> > > As I noted in :
> > >
> > > As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx 
> > > types", nvptx
> > > offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
> > > data-type
> > > mismatch that is not visible with Intel MIC offloading (using the 
> > > same data
> > > types) but explodes with nvptx.  I'm having a look.

> > [...] preload_common_nodes.  This is carefully crafted to _not_ diverge by
> > frontend (!) it wasn't even designed to cope with global trees being present
> > or not dependent on target (well, because the target is always the
> > same! mind you!)
> 
> Scary.  ;-/
> 
> > Now -- in theory it should deal with NULLs just fine (resulting in
> > error_mark_node), but it can diverge when there are additional
> > compount types (like vectors, complex
> > or array or record types) whose element types are not in the set of
> > global trees.
> > The complex _FloatN types would be such a case given they appear before 
> > their
> > components.  That mixes up the ordering at least.
> 
> ACK, but that's also an issue for "regular" float/complex float, which
> also is in "reverse" order.  But that's "fixed" by the recursion in
> gcc/tree-streamer.c:record_common_node for "TREE_CODE (node) ==
> COMPLEX_TYPE".  This likewise seems to work for the _FloatN types.

So, that mechanism does work, but what's going wrong is the following:
with differing target vs. offload target, we potentially (and actually
do, in the case of x86_64 vs. nvptx) have different sets of _FloatN and
_FloatNx types.  That is, for nvptx, a few of these don't exist (so,
NULL_TREE), and so it follows their complex variants also don't exist,
and thus the recursion that I just mentioned for complex types is no
longer done in lockstep in the x86_64 cc1 vs. the nvptx lto1, hence we
get an offset (of two, in this specific case), and consequently streaming
explodes, for example, as soon as it hits a forward-reference (due to
looking for tree 185 (x86_64 cc1 view; as encoded in the stream) when it
should be looking for tree 183 (nvptx lto1 view).

> (I've
> put "fixed" in quotes -- doesn't that recursion mean that we're thus
> putting "complex float", "float", [...], "float" (again) into the cache?
> Anyway that's for later...)

Maybe it would make sense to do this tree streaming in two passes: first
build a set of what we actually need, and then stream that, without
duplicates.  (Or, is also for these "pickled" trees their order relevant,
so that one tree may only refer to other earlier but not later ones?
Anyway, we could still remember the set of trees already streamed, and
avoid the double streaming I described?)

So I now understand that due to the stream format, the integer tree IDs
(cache->next_id) have to match in all cc1/lto1s (etc.), so we'll have to
make that work for x86_64 target vs. nvptx offload target being
different.  (I'm pondering some ideas about how to rework that integer
tree ID generation.)

(I have not digested completely yet what the implications are for the
skipping we're doing for some trees in preload_common_nodes), but here is
a first patch that at least gets us rid of the immediate problem.  (I'll
fix the TODO by adding a "#define TI_COMPLEX_FLOATN_NX_TYPE_LAST" to
gcc/tree-core.h, OK?)  Will such a patch be OK for trunk, at least for
now?

commit d2045bd028104be2ede5ae1d5d7b9395e67a8180
Author: Thomas Schwinge 
Date:   Thu Sep 15 21:56:16 2016 +0200

[PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx 
types

gcc/
PR lto/77458
* tree-streamer.c (preload_common_nodes): Skip _FloatN and
_FloatNx types if offloading.
---
 gcc/tree-streamer.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git gcc/tree-streamer.c gcc/tree-streamer.c
index 7ea7096..061d831 100644
--- gcc/tree-streamer.c
+++ gcc/tree-streamer.c
@@ -333,7 +3

Re: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))

2016-09-16 Thread Richard Biener
On Fri, Sep 16, 2016 at 9:05 AM, Thomas Schwinge
 wrote:
> Hi!
>
> (CCing Bernd and Jakub -- for your information, or: "amusement" -- as
> you've discussed offloading preload_common_nodes issues before...)
>
> Got to look into this some more, yesterday:
>
> On Thu, 08 Sep 2016 13:43:30 +0200, I wrote:
>> On Wed, 7 Sep 2016 14:23:18 +0200, Richard Biener 
>>  wrote:
>> > On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge  
>> > wrote:
>> > > On Fri, 19 Aug 2016 11:05:59 +, Joseph Myers 
>> > >  wrote:
>> > >> On Fri, 19 Aug 2016, Richard Biener wrote:
>> > >> > Can you quickly verify if LTO works with the new types?  I don't see 
>> > >> > anything
>> > >> > that would prevent it but having new global trees and backends 
>> > >> > initializing them
>> > >> > might come up with surprises (see 
>> > >> > tree-streamer.c:preload_common_nodes)
>> > >>
>> > >> Well, the execution tests are in gcc.dg/torture, which is run with 
>> > >> various
>> > >> options including -flto (and I've checked the testsuite logs to confirm
>> > >> these tests are indeed run with such options).  Is there something else
>> > >> you think should be tested?
>> > >
>> > > As I noted in :
>> > >
>> > > As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx 
>> > > types", nvptx
>> > > offloading is broken, ICEs in LTO stream-in.  Probably some kind of 
>> > > data-type
>> > > mismatch that is not visible with Intel MIC offloading (using the 
>> > > same data
>> > > types) but explodes with nvptx.  I'm having a look.
>
>> > [...] preload_common_nodes.  This is carefully crafted to _not_ diverge by
>> > frontend (!) it wasn't even designed to cope with global trees being 
>> > present
>> > or not dependent on target (well, because the target is always the
>> > same! mind you!)
>>
>> Scary.  ;-/
>>
>> > Now -- in theory it should deal with NULLs just fine (resulting in
>> > error_mark_node), but it can diverge when there are additional
>> > compount types (like vectors, complex
>> > or array or record types) whose element types are not in the set of
>> > global trees.
>> > The complex _FloatN types would be such a case given they appear before 
>> > their
>> > components.  That mixes up the ordering at least.
>>
>> ACK, but that's also an issue for "regular" float/complex float, which
>> also is in "reverse" order.  But that's "fixed" by the recursion in
>> gcc/tree-streamer.c:record_common_node for "TREE_CODE (node) ==
>> COMPLEX_TYPE".  This likewise seems to work for the _FloatN types.
>
> So, that mechanism does work, but what's going wrong is the following:
> with differing target vs. offload target, we potentially (and actually
> do, in the case of x86_64 vs. nvptx) have different sets of _FloatN and
> _FloatNx types.  That is, for nvptx, a few of these don't exist (so,
> NULL_TREE), and so it follows their complex variants also don't exist,
> and thus the recursion that I just mentioned for complex types is no
> longer done in lockstep in the x86_64 cc1 vs. the nvptx lto1, hence we
> get an offset (of two, in this specific case), and consequently streaming
> explodes, for example, as soon as it hits a forward-reference (due to
> looking for tree 185 (x86_64 cc1 view; as encoded in the stream) when it
> should be looking for tree 183 (nvptx lto1 view).
>
>> (I've
>> put "fixed" in quotes -- doesn't that recursion mean that we're thus
>> putting "complex float", "float", [...], "float" (again) into the cache?
>> Anyway that's for later...)
>
> Maybe it would make sense to do this tree streaming in two passes: first
> build a set of what we actually need, and then stream that, without
> duplicates.  (Or, is also for these "pickled" trees their order relevant,
> so that one tree may only refer to other earlier but not later ones?
> Anyway, we could still remember the set of trees already streamed, and
> avoid the double streaming I described?)
>
> So I now understand that due to the stream format, the integer tree IDs
> (cache->next_id) have to match in all cc1/lto1s (etc.), so we'll have to
> make that work for x86_64 target vs. nvptx offload target being
> different.  (I'm pondering some ideas about how to rework that integer
> tree ID generation.)
>
> (I have not digested completely yet what the implications are for the
> skipping we're doing for some trees in preload_common_nodes), but here is
> a first patch that at least gets us rid of the immediate problem.  (I'll
> fix the TODO by adding a "#define TI_COMPLEX_FLOATN_NX_TYPE_LAST" to
> gcc/tree-core.h, OK?)  Will such a patch be OK for trunk, at least for
> now?

Humm ... do we anywhere compare to those global trees by pointer equivalence?
If so then it breaks LTO support for those types.

I think forcing proper ordering so that we can assert that at the
point we'd like
to recurse the nodes we recurse for are already in the cache would be a better
fix.  This might need some additional global trees in case components do not
ex

New GCC Mirror

2016-09-16 Thread mirrors

Hi guys,

We're supporters of your project and have created a mirror.

Location: Tokyo, Japan

Mirror: http://gcc.letterboxdelivery.org (http) | 
rsync://gcc.letterboxdelivery.org/gcc (rsync)


Mirror contact email: mirr...@letterboxdelivery.org

Glad to help out!

Kind regards,

Mitch.




Re: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))

2016-09-16 Thread Thomas Schwinge
Hi!

On Fri, 16 Sep 2016 10:59:16 +0200, Richard Biener  
wrote:
> On Fri, Sep 16, 2016 at 9:05 AM, Thomas Schwinge
>  wrote:
> > On Thu, 08 Sep 2016 13:43:30 +0200, I wrote:
> >> On Wed, 7 Sep 2016 14:23:18 +0200, Richard Biener 
> >>  wrote:
> >> > On Wed, Sep 7, 2016 at 1:52 PM, Thomas Schwinge 
> >> >  wrote:
> >> > > As I noted in :
> >> > >
> >> > > As of the PR32187 commit r239625 "Implement C _FloatN, _FloatNx 
> >> > > types", nvptx
> >> > > offloading is broken, ICEs in LTO stream-in.  Probably some kind 
> >> > > of data-type
> >> > > mismatch that is not visible with Intel MIC offloading (using the 
> >> > > same data
> >> > > types) but explodes with nvptx.  I'm having a look.
> >
> >> > [...] preload_common_nodes.  This is carefully crafted to _not_ diverge 
> >> > by
> >> > frontend (!) it wasn't even designed to cope with global trees being 
> >> > present
> >> > or not dependent on target (well, because the target is always the
> >> > same! mind you!)
> >>
> >> Scary.  ;-/
> >>
> >> > Now -- in theory it should deal with NULLs just fine (resulting in
> >> > error_mark_node), but it can diverge when there are additional
> >> > compount types (like vectors, complex
> >> > or array or record types) whose element types are not in the set of
> >> > global trees.
> >> > The complex _FloatN types would be such a case given they appear before 
> >> > their
> >> > components.  That mixes up the ordering at least.
> >>
> >> ACK, but that's also an issue for "regular" float/complex float, which
> >> also is in "reverse" order.  But that's "fixed" by the recursion in
> >> gcc/tree-streamer.c:record_common_node for "TREE_CODE (node) ==
> >> COMPLEX_TYPE".  This likewise seems to work for the _FloatN types.
> >
> > So, that mechanism does work, but what's going wrong is the following:
> > with differing target vs. offload target, we potentially (and actually
> > do, in the case of x86_64 vs. nvptx) have different sets of _FloatN and
> > _FloatNx types.  That is, for nvptx, a few of these don't exist (so,
> > NULL_TREE), and so it follows their complex variants also don't exist,
> > and thus the recursion that I just mentioned for complex types is no
> > longer done in lockstep in the x86_64 cc1 vs. the nvptx lto1, hence we
> > get an offset (of two, in this specific case), and consequently streaming
> > explodes, for example, as soon as it hits a forward-reference (due to
> > looking for tree 185 (x86_64 cc1 view; as encoded in the stream) when it
> > should be looking for tree 183 (nvptx lto1 view).
> >
> >> (I've
> >> put "fixed" in quotes -- doesn't that recursion mean that we're thus
> >> putting "complex float", "float", [...], "float" (again) into the cache?
> >> Anyway that's for later...)
> >
> > Maybe it would make sense to do this tree streaming in two passes: first
> > build a set of what we actually need, and then stream that, without
> > duplicates.  (Or, is also for these "pickled" trees their order relevant,
> > so that one tree may only refer to other earlier but not later ones?
> > Anyway, we could still remember the set of trees already streamed, and
> > avoid the double streaming I described?)
> >
> > So I now understand that due to the stream format, the integer tree IDs
> > (cache->next_id) have to match in all cc1/lto1s (etc.), so we'll have to
> > make that work for x86_64 target vs. nvptx offload target being
> > different.  (I'm pondering some ideas about how to rework that integer
> > tree ID generation.)
> >
> > (I have not digested completely yet what the implications are for the
> > skipping we're doing for some trees in preload_common_nodes), but here is
> > a first patch that at least gets us rid of the immediate problem.  (I'll
> > fix the TODO by adding a "#define TI_COMPLEX_FLOATN_NX_TYPE_LAST" to
> > gcc/tree-core.h, OK?)  Will such a patch be OK for trunk, at least for
> > now?
> 
> Humm ... do we anywhere compare to those global trees by pointer equivalence?

I have not verified that.  Does GCC permit/forbid that on a case-by-case
basis -- which seems very fragile to me?

> If so then it breaks LTO support for those types.

OK, I think I understand that -- but I do have a "lto_stream_offload_p"
conditional in my code changes, so these changes should only affect the
offloading stream, in my understanding?

> I think forcing proper ordering so that we can assert that at the
> point we'd like
> to recurse the nodes we recurse for are already in the cache would be a better
> fix.

ACK.  That's what I'd planned to look into as a next step.

> This might need some additional global trees in case components do not
> explicitely exist

That's what I was afraid of: for example, I can't tell if it holds for
all GCC configurations (back ends), that complex types' component types
will always match one of the already existing global trees?  (I can
certainly imagine some "strange" ISAs/data types breaking this assertion
-- no idea wheth

Re: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))

2016-09-16 Thread Joseph Myers
On Fri, 16 Sep 2016, Thomas Schwinge wrote:

> That's what I was afraid of: for example, I can't tell if it holds for
> all GCC configurations (back ends), that complex types' component types
> will always match one of the already existing global trees?  (I can

Well, a component type could certainly match a target-specific type 
instead (e.g. __ibm128 on powerpc, which if it's not long double won't be 
any of the other types either).  That's a type registered with 
lang_hooks.types.register_builtin_type, not one of the global trees.  
(You can't write _Complex __ibm128, but can get such a type with _Complex 
float __attribute__ ((__mode__ (__IC__))).  Or similarly, with ARM __fp16, 
the complex type _Complex float __attribute__ ((__mode__ (__HC__))).)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types (was: Advice sought for debugging a lto1 ICE (was: Implement C _FloatN, _FloatNx types [version 6]))

2016-09-16 Thread Joseph Myers
On Fri, 16 Sep 2016, Richard Biener wrote:

> Humm ... do we anywhere compare to those global trees by pointer equivalence?
> If so then it breaks LTO support for those types.

The C front end compares main variants to those types for handling usual 
arithmetic conversions (and more generally for type compatibility), but 
that's not relevant to LTO.  I don't think there should be such 
comparisons outside the front ends.

-- 
Joseph S. Myers
jos...@codesourcery.com


Converting to LRA (calling all maintainers)

2016-09-16 Thread Segher Boessenkool
Hi!

Since a few days TARGET_LRA_P defaults to returning "true".  I changed
all in-tree ports to still behave the same as before, which for most
ports means they use old reload always.  All the primary platforms (see
the release criteria, ) now
default to LRA though.

Since one day (hopefully not too very far in the future) we want to
deprecate and eventually remove old reload, all ports that do not want
to risk being removed should be adapted to work with LRA.  New ports
should use LRA always.

I started a wiki page at 
that gives hints on how to go about moving to LRA.  Please add any and
all details and experiences you think can help others, there!

Thanks for listening to this public service announcement,


Segher


Re: Converting to LRA (calling all maintainers)

2016-09-16 Thread Jeff Law

On 09/16/2016 02:37 PM, Segher Boessenkool wrote:

Hi!

Since a few days TARGET_LRA_P defaults to returning "true".  I changed
all in-tree ports to still behave the same as before, which for most
ports means they use old reload always.  All the primary platforms (see
the release criteria, ) now
default to LRA though.

Since one day (hopefully not too very far in the future) we want to
deprecate and eventually remove old reload, all ports that do not want
to risk being removed should be adapted to work with LRA.  New ports
should use LRA always.

I started a wiki page at 
that gives hints on how to go about moving to LRA.  Please add any and
all details and experiences you think can help others, there!

Thanks for listening to this public service announcement,
Under traps for the unwary -- LRA requires the target to not use the old 
cc0 condition code handling...


ANd yes, I see this as a way to deprecating those cc0 targets like the 
m68k :-)


Jeff


Re: Converting to LRA (calling all maintainers)

2016-09-16 Thread Eric Botcazou
> Since a few days TARGET_LRA_P defaults to returning "true".  I changed
> all in-tree ports to still behave the same as before, which for most
> ports means they use old reload always.  All the primary platforms (see
> the release criteria, ) now
> default to LRA though.

So SPARC was changed and neither DaveM nor I was told about it? ;-)

-- 
Eric Botcazou


Re: Converting to LRA (calling all maintainers)

2016-09-16 Thread Segher Boessenkool
On Fri, Sep 16, 2016 at 02:53:16PM -0600, Jeff Law wrote:
> Under traps for the unwary -- LRA requires the target to not use the old 
> cc0 condition code handling...

I added this now, thanks Jeff.

> ANd yes, I see this as a way to deprecating those cc0 targets like the 
> m68k :-)

Would be a shame to see m68k go.  There still is time...


Segher


Re: Converting to LRA (calling all maintainers)

2016-09-16 Thread Segher Boessenkool
On Fri, Sep 16, 2016 at 11:22:04PM +0200, Eric Botcazou wrote:
> > Since a few days TARGET_LRA_P defaults to returning "true".  I changed
> > all in-tree ports to still behave the same as before, which for most
> > ports means they use old reload always.  All the primary platforms (see
> > the release criteria, ) now
> > default to LRA though.
> 
> So SPARC was changed and neither DaveM nor I was told about it? ;-)

Huh!  For over a year I have been working under the (as it turns out now)
false impression that Power was the last port to change over.

Well that is a nice way to start the weekend.  For me, anyway!


Segher

p.s.  Are there plans for converting the SPARC port?


Re: Converting to LRA (calling all maintainers)

2016-09-16 Thread Eric Botcazou
> p.s.  Are there plans for converting the SPARC port?

There are more than plans - actual patches by DaveM that were installed at 
some point and then reverted quickly because of unexpected fallout.

-- 
Eric Botcazou