Re: TImode for BITS_PER_WORD=32 targets

2020-07-27 Thread Richard Biener via Gcc
On Fri, Jul 24, 2020 at 5:38 PM Andrew Stubbs  wrote:
>
> Hi all,
>
> I want amdgcn to be able to support int128 types, partly because they
> might come up in code offloaded from x86_64 code, and partly because
> libgomp now requires at least some support (amdgcn builds have been
> failing since yesterday).
>
> But, amdgcn has 32-bit registers, and therefore defines BITS_PER_WORD to
> 32, which means that TImode doesn't Just Work, at least not for all
> operators. It already has TImode moves, for internal uses, so I can
> enable TImode and fix the libgomp build, but now libgfortran tries to
> use operators that don't exist, so I'm no better off.
>
> The expand pass won't emit libgcc calls, like it does for DImode, and
> libgcc doesn't have the routines for it anyway. Neither does it support
> synthesized shifts or rotates for more than double-word types.
> (Multiple-word add and subtract appear to work fine, however.)
>
> What would be the best (least effort) way to implement this?
>
> I think I need shift, rotate, multiply, divide, and modulus, but there's
> probably more.

You've figured out that TImode support for SImode word_mode targets
is not implemented in generic code.  So what you need to do is either
provide patterns for all of the operations you need or implement
generic support for libgcc fallbacks (which isn't there).  Joseph might
have an idea what's missing and how difficult it would be (I suppose
we do not want divti3 to end up calling divdi3, thus "stage" TImode
support ontop of DImode ops eventually provided by libgcc only).
libgcc2.c uses LIBGCC2_UNITS_PER_WORD so it _might_ be
possible to somehow do this "staging" by providing two different
values here.  I guess you'd have to try.

Richard.

> Thanks, any advise will be appreciated.
>
> Andrew


Re: Problems with changing the type of an ssa name

2020-07-27 Thread Richard Biener via Gcc
On Sun, Jul 26, 2020 at 10:31 PM Gary Oblock  wrote:
>
> Richard,
>
> As you know I'm working on a structure reorganization optimization.
> The particular one I'm working on is called instance interleaving.
> For the particular case I'm working on now, there is a single array
> of structures being transformed, a pointer to an element of the
> array is transformed into an index into what is now a structure
> of arrays. Note, I did share my HL design document with you so
> there are more details in there if you need them. So what all this
> means is for this example
>
> typedef struct fu fu_t;
> struct fu {
>   char x;
>   inty;
>   double z;
> };
>   :
>   :
>   fu_t *fubar = (fu_t*)malloc(...);
>   fu_t *baz;
>
> That fubar and baz no longer are pointer types and need to be
> transformed into some integer type (say _index_fu_t.) Thus if
> I encounter an ssa_name of type "fu_t *", I'll need to modify its
> type be _index_fu_t. This is of course equivalent to replacing
> that ssa name with a new one of type _index_fu_t.
>
> Now, how do I actually do either of these? My attempts at
> former all failed and the  later seems equally difficult for
> the default defs. Note, prefer modifying them to replacing
> them because it seems more reasonable and it also seems
> to work except for the default defs.
>
> I really need some help with this Richard.

OK, so modifying the SSA name in-place is really bad here
since you _have_ to adjust all uses and defs anyway.  Thus
please create a new SSA name here.

The default-def case you run into is either an uninitialized
value which can easily appear with conditionally initialized
pointers or the SSA name associated with the value of
a function argument.

Once you have to deal with a default def you have to create
a new underlying VAR_DECL (or PARM_DECL if it was a
parameter) with the new type and for the SSA replacement
create its default def (get_or_create_ssa_default_def).

Now for parameters this of course means you have to
adjust function signatures and calls.  For the function
boundary case you'll likely need to pass a pointer to
the structure as well which means you'll have to add
parameters.

As for "replacing" uses you can use immediate uses
to walk them:

 FOR_EACH_IMM_USE_STMT (...)
FOR_EACH_IMM_USE_ON_STMT (..)
...

also the SSA definition statement after your transform
cannot be the same so you have to create another
stmt anyway, no?

Richard.

> Thanks,
>
> Gary
> 
> From: Richard Biener 
> Sent: Saturday, July 25, 2020 10:48 PM
> To: Gary Oblock ; gcc@gcc.gnu.org 
> Subject: Re: Problems with changing the type of an ssa name
>
> [EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please 
> be mindful of safe email handling and proprietary information protection 
> practices.]
>
>
> On July 25, 2020 10:47:59 PM GMT+02:00, Gary Oblock 
>  wrote:
> >Richard,
> >
> >I suppose that might be doable but aren't there any ramifications
> >from the fact that the problematic ssa_names are the default defs?
> >I can imagine easily replacing all the ssa names except those that
> >are default defs.
>
> Well, just changing the SSA names doesn't make it less ramifications. You 
> have to know what you are doing.
>
> So - what's the reason you need to change those SSA name types?
>
> Richard.
>
> >Gary
> >
> >From: Richard Biener 
> >Sent: Friday, July 24, 2020 11:16 PM
> >To: Gary Oblock ; Gary Oblock via Gcc
> >; gcc@gcc.gnu.org 
> >Subject: Re: Problems with changing the type of an ssa name
> >
> >[EXTERNAL EMAIL NOTICE: This email originated from an external sender.
> >Please be mindful of safe email handling and proprietary information
> >protection practices.]
> >
> >
> >On July 25, 2020 7:30:48 AM GMT+02:00, Gary Oblock via Gcc
> > wrote:
> >>If you've followed what I've been up to via my questions
> >>on the mailing list, I finally traced my latest big problem
> >>back to to my own code. In a nut shell here is what
> >>I'm doing.
> >>
> >>I'm creating a new type exaactly like this:
> >>
> >>tree pointer_rep =
> >>  make_signed_type ( TYPE_PRECISION ( pointer_sized_int_node));
> >>TYPE_MAIN_VARIANT ( pointer_rep) =
> >>  TYPE_MAIN_VARIANT ( pointer_sized_int_node);
> >>const char *gcc_name =
> >>identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME (
> >>ri->gcc_type)));
> >>size_t len =
> >>  strlen ( REORG_SP_PTR_PREFIX) + strlen ( gcc_name);
> >>char *name = ( char *)alloca(len + 1);
> >>strcpy ( name, REORG_SP_PTR_PREFIX);
> >>strcat ( name, gcc_name);
> >>TYPE_NAME ( pointer_rep) = get_identifier ( name);
> >>
> >>I detect an ssa_name that I want to change to have this type
> >>and change it thusly. Note, this particular ssa_name is a
> >>default def which I seems to be very pertinent (since it's
> >>the only case that fails.)
> >>
> >>modify_ssa_name_type ( an_ssa_name, pointer_rep);
> >>
> >>void
> >>modify_ssa

Re: Non-inlined functions and mixed architectures

2020-07-27 Thread Florian Weimer via Gcc
* Allan Sandfeld Jensen:

> A problem that I keep running into is functions defined headers, but used in 
> sources files that are compiled with different CPU feature flags (for runtime 
> CPU feature selection).
>
> We know to make sure the functions are inlinable and their address never 
> taken, but of course in debug builds they are still not inlined. Every so 
> often the functions get compiled using some of the optional CPU instructions, 
> and if the linker selects the optimized versions those instructions can then 
> leak through to instances compiled with different CPU flags where the 
> instructions aren't supposed to be used. This happens even in unoptimized 
> debug builds as the extended instruction selections doesn't count as an 
> optimization.

You need to provide source code examples.  This isn't supposed to happen
if you declare the functions as static inline.  If a function is emitted
for any reason, it will be local this particular object file.

Plain inline (for C++) works differently and will attempt to share
implementations.

Thanks,
Florian



Re: Non-inlined functions and mixed architectures

2020-07-27 Thread Allan Sandfeld Jensen
On Montag, 27. Juli 2020 10:33:35 CEST Florian Weimer wrote:
> * Allan Sandfeld Jensen:
> > A problem that I keep running into is functions defined headers, but used
> > in sources files that are compiled with different CPU feature flags (for
> > runtime CPU feature selection).
> > 
> > We know to make sure the functions are inlinable and their address never
> > taken, but of course in debug builds they are still not inlined. Every so
> > often the functions get compiled using some of the optional CPU
> > instructions, and if the linker selects the optimized versions those
> > instructions can then leak through to instances compiled with different
> > CPU flags where the instructions aren't supposed to be used. This happens
> > even in unoptimized debug builds as the extended instruction selections
> > doesn't count as an optimization.
> 
> You need to provide source code examples.  This isn't supposed to happen
> if you declare the functions as static inline.  If a function is emitted
> for any reason, it will be local this particular object file.
> 
> Plain inline (for C++) works differently and will attempt to share
> implementations.
> 
static inline? Hadn't thought of that in a shared header file.

Is harder to do with inline methods in C++ classes though.

A recent example I hit into was methods using a qfloat16 class that 
specializes for F16C when available, see https://codereview.qt-project.org/c/
qt/qtbase/+/307772. Which I guess ought to be split into different classes 
with different constructors, so they don't violate ODR rules to be really safe 
across compilers.

But I guess a case like https://codereview.qt-project.org/c/qt/qtbase/+/308163 
could be solved with static inline instead.

Best regards
Allan




Re: Non-inlined functions and mixed architectures

2020-07-27 Thread Florian Weimer via Gcc
* Allan Sandfeld Jensen:

> On Montag, 27. Juli 2020 10:33:35 CEST Florian Weimer wrote:
>> * Allan Sandfeld Jensen:
>> > A problem that I keep running into is functions defined headers, but used
>> > in sources files that are compiled with different CPU feature flags (for
>> > runtime CPU feature selection).
>> > 
>> > We know to make sure the functions are inlinable and their address never
>> > taken, but of course in debug builds they are still not inlined. Every so
>> > often the functions get compiled using some of the optional CPU
>> > instructions, and if the linker selects the optimized versions those
>> > instructions can then leak through to instances compiled with different
>> > CPU flags where the instructions aren't supposed to be used. This happens
>> > even in unoptimized debug builds as the extended instruction selections
>> > doesn't count as an optimization.
>> 
>> You need to provide source code examples.  This isn't supposed to happen
>> if you declare the functions as static inline.  If a function is emitted
>> for any reason, it will be local this particular object file.
>> 
>> Plain inline (for C++) works differently and will attempt to share
>> implementations.
>> 

> static inline? Hadn't thought of that in a shared header file.
>
> Is harder to do with inline methods in C++ classes though.

Ahh, and anonymous namespaces (the equivalent for that for member
functions) do not work in such cases because the representation of the
class still needs to be shared across API boundaries.  With an anonymous
namspace, that would be undefined.

Thanks,
Florian



Tar version being used

2020-07-27 Thread CHIGOT, CLEMENT via Gcc
Hi everyone,

I'm wondering if someone knows which tar version / configuration was being used 
when creating gcc-10.2.0 tarballs ?

I'm getting some directory checksum errors while trying to unpack it with the 
AIX tar (which can be a bit old). But they are disappearing when I'm building 
these tarballs on Ubuntu-18.04, even with the last tar version 1.32.

Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
changed since.

Sincerely


Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France


Re: Tar version being used

2020-07-27 Thread Jonathan Wakely via Gcc
On Mon, 27 Jul 2020 at 11:59, CHIGOT, CLEMENT via Gcc  wrote:
>
> Hi everyone,
>
> I'm wondering if someone knows which tar version / configuration was being 
> used when creating gcc-10.2.0 tarballs ?

GNU tar 1.30, I think.

>
> I'm getting some directory checksum errors while trying to unpack it with the 
> AIX tar (which can be a bit old). But they are disappearing when I'm building 
> these tarballs on Ubuntu-18.04, even with the last tar version 1.32.

https://gcc.gnu.org/install/prerequisites.html says:

"GNU tar version 1.14 (or later)
Necessary (only on some platforms) to untar the source code. Many
systems’ tar programs will also work, only try GNU tar if you have
problems."


> Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
> changed since.

I think it's the same version of GNU tar on the same system.


Re: Tar version being used

2020-07-27 Thread Jonathan Wakely via Gcc
On Mon, 27 Jul 2020 at 12:34, Jonathan Wakely  wrote:
>
> On Mon, 27 Jul 2020 at 11:59, CHIGOT, CLEMENT via Gcc  wrote:
> >
> > Hi everyone,
> >
> > I'm wondering if someone knows which tar version / configuration was being 
> > used when creating gcc-10.2.0 tarballs ?
>
> GNU tar 1.30, I think.

I've been informed that the tarball isn't necessarily created on the
gcc.gnu.org server, and so it could be using a different version of
tar. The 1.30 version was indeed used for GCC 10.1.0 though/

The advice about using GNU tar still holds though ...

>
> >
> > I'm getting some directory checksum errors while trying to unpack it with 
> > the AIX tar (which can be a bit old). But they are disappearing when I'm 
> > building these tarballs on Ubuntu-18.04, even with the last tar version 
> > 1.32.
>
> https://gcc.gnu.org/install/prerequisites.html says:
>
> "GNU tar version 1.14 (or later)
> Necessary (only on some platforms) to untar the source code. Many
> systems’ tar programs will also work, only try GNU tar if you have
> problems."
>
>
> > Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
> > changed since.
>
> I think it's the same version of GNU tar on the same system.


Re: Tar version being used

2020-07-27 Thread Richard Biener via Gcc
On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc
 wrote:
>
> Hi everyone,
>
> I'm wondering if someone knows which tar version / configuration was being 
> used when creating gcc-10.2.0 tarballs ?
>
> I'm getting some directory checksum errors while trying to unpack it with the 
> AIX tar (which can be a bit old). But they are disappearing when I'm building 
> these tarballs on Ubuntu-18.04, even with the last tar version 1.32.
>
> Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
> changed since.

I have used tar 1.30 as shipped by openSUSE Leap 15.1
(tar-1.30-lp151.2.1.x86_64)

Richard.

> Sincerely
>
>
> Clément Chigot
> ATOS Bull SAS
> 1 rue de Provence - 38432 Échirolles - France


Re: Tar version being used

2020-07-27 Thread CHIGOT, CLEMENT via Gcc
> On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc
>  wrote:
> >
> > Hi everyone,
> >
> > I'm wondering if someone knows which tar version / configuration was being 
> > used when creating gcc-10.2.0 tarballs ?
> >
> > I'm getting some directory checksum errors while trying to unpack it with 
> > the AIX tar (which can be a bit old). But they are disappearing when I'm 
> > building these tarballs on Ubuntu-18.04, even with the last tar version 
> > 1.32.
> >
> > Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
> > changed since.
> 
> I have used tar 1.30 as shipped by openSUSE Leap 15.1
> (tar-1.30-lp151.2.1.x86_64)

Ok thanks. I don't know what have changed between the two versions. But AIX tar 
doesn't like it... 
Using AIX tar instead of GNU tar was a mistake anyway, so that's not a big 
deal. 

Thanks all for your answers ! 

Clément 

Re: Tar version being used

2020-07-27 Thread David Edelsohn via Gcc
On Mon, Jul 27, 2020 at 7:58 AM Richard Biener via Gcc  wrote:
>
> On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc
>  wrote:
> >
> > Hi everyone,
> >
> > I'm wondering if someone knows which tar version / configuration was being 
> > used when creating gcc-10.2.0 tarballs ?
> >
> > I'm getting some directory checksum errors while trying to unpack it with 
> > the AIX tar (which can be a bit old). But they are disappearing when I'm 
> > building these tarballs on Ubuntu-18.04, even with the last tar version 
> > 1.32.
> >
> > Note that gcc-10.1.0 doesn't have these problems, so maybe something have 
> > changed since.
>
> I have used tar 1.30 as shipped by openSUSE Leap 15.1
> (tar-1.30-lp151.2.1.x86_64)

GNU tar is listed as a requirement to install GCC from source:

https://gcc.gnu.org/install/prerequisites.html

Thanks, David


Re: LTO Dead Field Elimination

2020-07-27 Thread Richard Biener via Gcc
On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
 wrote:
>
> This patchset brings back struct reorg to GCC.
>
> We’ve been working on improving cache utilization recently and would
> like to share our current implementation to receive some feedback on it.
>
> Essentially, we’ve implemented the following components:
>
>  Type-based escape analysis to determine if we can reorganize a type
> at link-time
>
>  Dead-field elimination to remove unused fields of a struct at
> link-time
>
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external functions).
>
> The dead-field elimination pass analyses non-escaping structs for fields
> that are not used in the linking unit and thus can be removed. The
> resulting struct has a smaller memory footprint, which allows for a
> higher cache utilization.
>
> As a side-effect a couple of new infrastructure code has been written
> (e.g. a type walker, which we were really missing in GCC), which can be
> of course reused for other passes as well.
>
> We’ve prepared a patchset in the following branch:
>
>refs/vendors/ARM/heads/arm-struct-reorg-wip

Just had some time to peek into this.  Ugh.  The code doesn't look like
GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
The code itself is undocumented - it's hard to understand what the purpose
of all the Walker stuff is.

You also didn't seem to know walk_tree () nor walk_gimple* ().

Take as example - I figured to look for IPA pass entries, then I see

+
+static void
+collect_types ()
+{
+  GimpleTypeCollector collector;
+  collector.walk ();
+  collector.print_collected ();
+  ptrset_t types = collector.get_pointer_set ();
+  GimpleCaster caster (types);
+  caster.walk ();
+  if (flag_print_cast_analysis)
+caster.print_reasons ();
+  ptrset_t casting = caster.get_sets ();
+  fix_escaping_types_in_set (casting);
+  GimpleAccesser accesser;
+  accesser.walk ();
+  if (flag_print_access_analysis)
+accesser.print_accesses ();
+  record_field_map_t record_field_map = accesser.get_map ();
+  TypeIncompleteEquality equality;
+  bool has_fields_that_can_be_deleted = false;
+  typedef std::set field_offsets_t;

there's no comments (not even file-level) that explains how type escape
is computed.

Sorry, but this isn't even close to be coarsely reviewable.

> We’ve also added a subsection in the GCC internals document to allow
> other compiler devs to better understand our design and implementation.
> A generated PDF can be found here:
>
> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
>
> page: 719
>
> We’ve been testing the pass against a range of in-tree tests and
> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
> testing, please see testing subsection in the gcc internals we prepared.
>
> Currently we see the following limitations:
>
> * It is not a "true" ipa pass yes. That is, we can only succeed with
> -flto-partition=none.
> * Currently it is not safe to use -fipa-sra.
> * Brace constructors not supported now. We handle this gracefully.
> * Only C as of now.
> * Results of sizeof() and offsetof() are generated in the compiler
> frontend and thus can’t be changed later at link time. There are a
> couple of ideas to resolve this, but that’s currently unimplemented.
> * At this point we’d like to thank the GCC community for their patient
> help so far on the mailing list and in other channels. And we ask for
> your support in terms of feedback, comments and testing.
>
> Thanks!


Re: LTO Dead Field Elimination

2020-07-27 Thread Erick Ochoa




On 27/07/2020 14:36, Richard Biener wrote:

On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
 wrote:


This patchset brings back struct reorg to GCC.

We’ve been working on improving cache utilization recently and would
like to share our current implementation to receive some feedback on it.

Essentially, we’ve implemented the following components:

  Type-based escape analysis to determine if we can reorganize a type
at link-time

  Dead-field elimination to remove unused fields of a struct at
link-time

The type-based escape analysis provides a list of types, that are not
visible outside of the current linking unit (e.g. parameter types of
external functions).

The dead-field elimination pass analyses non-escaping structs for fields
that are not used in the linking unit and thus can be removed. The
resulting struct has a smaller memory footprint, which allows for a
higher cache utilization.

As a side-effect a couple of new infrastructure code has been written
(e.g. a type walker, which we were really missing in GCC), which can be
of course reused for other passes as well.

We’ve prepared a patchset in the following branch:

refs/vendors/ARM/heads/arm-struct-reorg-wip


Just had some time to peek into this.  Ugh.  The code doesn't look like
GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
The code itself is undocumented - it's hard to understand what the purpose
of all the Walker stuff is.

You also didn't seem to know walk_tree () nor walk_gimple* ().

Take as example - I figured to look for IPA pass entries, then I see

+
+static void
+collect_types ()
+{
+  GimpleTypeCollector collector;
+  collector.walk ();
+  collector.print_collected ();
+  ptrset_t types = collector.get_pointer_set ();
+  GimpleCaster caster (types);
+  caster.walk ();
+  if (flag_print_cast_analysis)
+caster.print_reasons ();
+  ptrset_t casting = caster.get_sets ();
+  fix_escaping_types_in_set (casting);
+  GimpleAccesser accesser;
+  accesser.walk ();
+  if (flag_print_access_analysis)
+accesser.print_accesses ();
+  record_field_map_t record_field_map = accesser.get_map ();
+  TypeIncompleteEquality equality;
+  bool has_fields_that_can_be_deleted = false;
+  typedef std::set field_offsets_t;

there's no comments (not even file-level) that explains how type escape
is computed.

Sorry, but this isn't even close to be coarsely reviewable.


Thanks Richard!

I will work on making this more readable. I understand your comments and 
you are right. I am currently in the process of writing a human-readable 
PDF with an overview of this. There is already a (somewhat hurried) 
version of this PDF in the GCC internals link we shared. This should 
answer most of the high level questions about how the type escape 
analysis is computed. However, yes, there is still work to be done. I 
will start looking into walk_tree and walk_gimple* and see if the 
gimple_walker I wrote is redundant and substitute it.


Thanks for the input!




We’ve also added a subsection in the GCC internals document to allow
other compiler devs to better understand our design and implementation.
A generated PDF can be found here:

 https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
 https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download

 page: 719

We’ve been testing the pass against a range of in-tree tests and
real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
testing, please see testing subsection in the gcc internals we prepared.

Currently we see the following limitations:

* It is not a "true" ipa pass yes. That is, we can only succeed with
-flto-partition=none.
* Currently it is not safe to use -fipa-sra.
* Brace constructors not supported now. We handle this gracefully.
* Only C as of now.
* Results of sizeof() and offsetof() are generated in the compiler
frontend and thus can’t be changed later at link time. There are a
couple of ideas to resolve this, but that’s currently unimplemented.
* At this point we’d like to thank the GCC community for their patient
help so far on the mailing list and in other channels. And we ask for
your support in terms of feedback, comments and testing.

Thanks!


Re: LTO Dead Field Elimination

2020-07-27 Thread Christoph Müllner
Hi Richard,

On 7/27/20 2:36 PM, Richard Biener wrote:
> On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
>  wrote:
>>
>> This patchset brings back struct reorg to GCC.
>>
>> We’ve been working on improving cache utilization recently and would
>> like to share our current implementation to receive some feedback on it.
>>
>> Essentially, we’ve implemented the following components:
>>
>>  Type-based escape analysis to determine if we can reorganize a type
>> at link-time
>>
>>  Dead-field elimination to remove unused fields of a struct at
>> link-time
>>
>> The type-based escape analysis provides a list of types, that are not
>> visible outside of the current linking unit (e.g. parameter types of
>> external functions).
>>
>> The dead-field elimination pass analyses non-escaping structs for fields
>> that are not used in the linking unit and thus can be removed. The
>> resulting struct has a smaller memory footprint, which allows for a
>> higher cache utilization.
>>
>> As a side-effect a couple of new infrastructure code has been written
>> (e.g. a type walker, which we were really missing in GCC), which can be
>> of course reused for other passes as well.
>>
>> We’ve prepared a patchset in the following branch:
>>
>>refs/vendors/ARM/heads/arm-struct-reorg-wip
> 
> Just had some time to peek into this.  Ugh.  The code doesn't look like
> GCC code looks :/  It doesn't help to have one set of files per C++ class 
> (25!).

Any suggestions how to best structure these?
Are there some coding guidelines in the GCC project,
which can help us to match the expectation?

> The code itself is undocumented - it's hard to understand what the purpose
> of all the Walker stuff is.
> 
> You also didn't seem to know walk_tree () nor walk_gimple* ().

True, we were not aware of that code.
Thanks for pointing to that code.
We will have a look.

> Take as example - I figured to look for IPA pass entries, then I see
> 
> +
> +static void
> +collect_types ()
> +{
> +  GimpleTypeCollector collector;
> +  collector.walk ();
> +  collector.print_collected ();
> +  ptrset_t types = collector.get_pointer_set ();
> +  GimpleCaster caster (types);
> +  caster.walk ();
> +  if (flag_print_cast_analysis)
> +caster.print_reasons ();
> +  ptrset_t casting = caster.get_sets ();
> +  fix_escaping_types_in_set (casting);
> +  GimpleAccesser accesser;
> +  accesser.walk ();
> +  if (flag_print_access_analysis)
> +accesser.print_accesses ();
> +  record_field_map_t record_field_map = accesser.get_map ();
> +  TypeIncompleteEquality equality;
> +  bool has_fields_that_can_be_deleted = false;
> +  typedef std::set field_offsets_t;
> 
> there's no comments (not even file-level) that explains how type escape
> is computed.
> 
> Sorry, but this isn't even close to be coarsely reviewable.

Sad to hear.
We'll work on the input that you provided and provide a new version.

Thanks,
Christoph

> 
>> We’ve also added a subsection in the GCC internals document to allow
>> other compiler devs to better understand our design and implementation.
>> A generated PDF can be found here:
>>
>> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
>> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
>>
>> page: 719
>>
>> We’ve been testing the pass against a range of in-tree tests and
>> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
>> testing, please see testing subsection in the gcc internals we prepared.
>>
>> Currently we see the following limitations:
>>
>> * It is not a "true" ipa pass yes. That is, we can only succeed with
>> -flto-partition=none.
>> * Currently it is not safe to use -fipa-sra.
>> * Brace constructors not supported now. We handle this gracefully.
>> * Only C as of now.
>> * Results of sizeof() and offsetof() are generated in the compiler
>> frontend and thus can’t be changed later at link time. There are a
>> couple of ideas to resolve this, but that’s currently unimplemented.
>> * At this point we’d like to thank the GCC community for their patient
>> help so far on the mailing list and in other channels. And we ask for
>> your support in terms of feedback, comments and testing.
>>
>> Thanks!


Re: LTO Dead Field Elimination

2020-07-27 Thread Jakub Jelinek via Gcc
On Mon, Jul 27, 2020 at 02:52:21PM +0200, Erick Ochoa wrote:
> I will work on making this more readable. I understand your comments and you
> are right. I am currently in the process of writing a human-readable PDF
> with an overview of this. There is already a (somewhat hurried) version of
> this PDF in the GCC internals link we shared. This should answer most of the
> high level questions about how the type escape analysis is computed.
> However, yes, there is still work to be done. I will start looking into
> walk_tree and walk_gimple* and see if the gimple_walker I wrote is redundant
> and substitute it.

The overview doesn't belong in some on the side document, but should be in
the source (usually at the start of the source file that contains the pass).
And then each function should be documented.

I'd also like to stress what has been seen several times in the past, but is
being not taken into account.  One of the reasons why old ipa struct reorg
has been removed is that it worked on the types granularity, if it is safe
to adjust all references to all objects of a certain type, then do it,
otherwise give up.  That is fundamentally wrong, it can't be useful except
by pure luck on some benchmark.
Instead, it should work on the object (or sets of those) granularity.
If the analysis phase determines all accesses to certain object (automatic
or global variable, or e.g. memory allocation) can be adjusted by using a
different type for it (with either fields reordered, or some removed etc.),
then it should be optimized regardless of whether other objects can be
optimized similarly or not.  In other cases, it will not be possible to
optimize a single object on its own, e.g. some function is called with
an address of one of 10 objects, so in that case put those 10 objects into a
set and either optimize those together if possible, or punt on those
together.  But only very seldom all objects of certain type can be optimized
the same way, and even if they do, perhaps it is better to optimize some of
them even further (e.g. you find none of the objects of type XYZ use field
abc, and decide to adjust the type and remove the field.  But perhaps 50% of
objects of that type also don't use def field and if those were optimized
separately, you could optimize that too).

Jakub



Re: LTO Dead Field Elimination

2020-07-27 Thread Richard Biener via Gcc
On Mon, Jul 27, 2020 at 2:59 PM Christoph Müllner
 wrote:
>
> Hi Richard,
>
> On 7/27/20 2:36 PM, Richard Biener wrote:
> > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> >  wrote:
> >>
> >> This patchset brings back struct reorg to GCC.
> >>
> >> We’ve been working on improving cache utilization recently and would
> >> like to share our current implementation to receive some feedback on it.
> >>
> >> Essentially, we’ve implemented the following components:
> >>
> >>  Type-based escape analysis to determine if we can reorganize a type
> >> at link-time
> >>
> >>  Dead-field elimination to remove unused fields of a struct at
> >> link-time
> >>
> >> The type-based escape analysis provides a list of types, that are not
> >> visible outside of the current linking unit (e.g. parameter types of
> >> external functions).
> >>
> >> The dead-field elimination pass analyses non-escaping structs for fields
> >> that are not used in the linking unit and thus can be removed. The
> >> resulting struct has a smaller memory footprint, which allows for a
> >> higher cache utilization.
> >>
> >> As a side-effect a couple of new infrastructure code has been written
> >> (e.g. a type walker, which we were really missing in GCC), which can be
> >> of course reused for other passes as well.
> >>
> >> We’ve prepared a patchset in the following branch:
> >>
> >>refs/vendors/ARM/heads/arm-struct-reorg-wip
> >
> > Just had some time to peek into this.  Ugh.  The code doesn't look like
> > GCC code looks :/  It doesn't help to have one set of files per C++ class 
> > (25!).
>
> Any suggestions how to best structure these?

As "bad" as it sounds, put everything into one file (maybe separate out
type escape analysis from the actual transform).  Add a toplevel comment
per file explaining things.

> Are there some coding guidelines in the GCC project,
> which can help us to match the expectation?

Look at existing passes, otherwise there's mostly conventions on
formatting.

> > The code itself is undocumented - it's hard to understand what the purpose
> > of all the Walker stuff is.
> >
> > You also didn't seem to know walk_tree () nor walk_gimple* ().
>
> True, we were not aware of that code.
> Thanks for pointing to that code.
> We will have a look.
>
> > Take as example - I figured to look for IPA pass entries, then I see
> >
> > +
> > +static void
> > +collect_types ()
> > +{
> > +  GimpleTypeCollector collector;
> > +  collector.walk ();
> > +  collector.print_collected ();
> > +  ptrset_t types = collector.get_pointer_set ();
> > +  GimpleCaster caster (types);
> > +  caster.walk ();
> > +  if (flag_print_cast_analysis)
> > +caster.print_reasons ();
> > +  ptrset_t casting = caster.get_sets ();
> > +  fix_escaping_types_in_set (casting);
> > +  GimpleAccesser accesser;
> > +  accesser.walk ();
> > +  if (flag_print_access_analysis)
> > +accesser.print_accesses ();
> > +  record_field_map_t record_field_map = accesser.get_map ();
> > +  TypeIncompleteEquality equality;
> > +  bool has_fields_that_can_be_deleted = false;
> > +  typedef std::set field_offsets_t;
> >
> > there's no comments (not even file-level) that explains how type escape
> > is computed.
> >
> > Sorry, but this isn't even close to be coarsely reviewable.
>
> Sad to hear.
> We'll work on the input that you provided and provide a new version.
>
> Thanks,
> Christoph
>
> >
> >> We’ve also added a subsection in the GCC internals document to allow
> >> other compiler devs to better understand our design and implementation.
> >> A generated PDF can be found here:
> >>
> >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
> >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
> >>
> >> page: 719
> >>
> >> We’ve been testing the pass against a range of in-tree tests and
> >> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
> >> testing, please see testing subsection in the gcc internals we prepared.
> >>
> >> Currently we see the following limitations:
> >>
> >> * It is not a "true" ipa pass yes. That is, we can only succeed with
> >> -flto-partition=none.
> >> * Currently it is not safe to use -fipa-sra.
> >> * Brace constructors not supported now. We handle this gracefully.
> >> * Only C as of now.
> >> * Results of sizeof() and offsetof() are generated in the compiler
> >> frontend and thus can’t be changed later at link time. There are a
> >> couple of ideas to resolve this, but that’s currently unimplemented.
> >> * At this point we’d like to thank the GCC community for their patient
> >> help so far on the mailing list and in other channels. And we ask for
> >> your support in terms of feedback, comments and testing.
> >>
> >> Thanks!


Re: LTO Dead Field Elimination

2020-07-27 Thread David Edelsohn via Gcc
On Mon, Jul 27, 2020 at 9:03 AM Christoph Müllner
 wrote:
>
> Hi Richard,
>
> On 7/27/20 2:36 PM, Richard Biener wrote:
> > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> >  wrote:
> >>
> >> This patchset brings back struct reorg to GCC.
> >>
> >> We’ve been working on improving cache utilization recently and would
> >> like to share our current implementation to receive some feedback on it.
> >>
> >> Essentially, we’ve implemented the following components:
> >>
> >>  Type-based escape analysis to determine if we can reorganize a type
> >> at link-time
> >>
> >>  Dead-field elimination to remove unused fields of a struct at
> >> link-time
> >>
> >> The type-based escape analysis provides a list of types, that are not
> >> visible outside of the current linking unit (e.g. parameter types of
> >> external functions).
> >>
> >> The dead-field elimination pass analyses non-escaping structs for fields
> >> that are not used in the linking unit and thus can be removed. The
> >> resulting struct has a smaller memory footprint, which allows for a
> >> higher cache utilization.
> >>
> >> As a side-effect a couple of new infrastructure code has been written
> >> (e.g. a type walker, which we were really missing in GCC), which can be
> >> of course reused for other passes as well.
> >>
> >> We’ve prepared a patchset in the following branch:
> >>
> >>refs/vendors/ARM/heads/arm-struct-reorg-wip
> >
> > Just had some time to peek into this.  Ugh.  The code doesn't look like
> > GCC code looks :/  It doesn't help to have one set of files per C++ class 
> > (25!).
>
> Any suggestions how to best structure these?
> Are there some coding guidelines in the GCC project,
> which can help us to match the expectation?

Thanks for working on this feature in GCC.  The coding standards for
the GNU Project and for GCC are:

https://www.gnu.org/prep/standards/
https://gcc.gnu.org/codingconventions.html

As others have mentioned, the documentation should be in the source
code itself.  An external design document is nice, but the code should
seemlessly integrate with the rest of the GCC source code, not a
separate, self-contained set of files bolted onto the side.

Thanks, David


Re: Problems with changing the type of an ssa name

2020-07-27 Thread Michael Matz
Hello,

On Sat, 25 Jul 2020, Gary Oblock via Gcc wrote:

>   if ( TYPE_P ( type) )
> {
>TREE_TYPE ( ssa_name) = TYPE_MAIN_VARIANT ( type);
>if ( ssa_defined_default_def_p ( ssa_name) )
>   {
>  // I guessing which I know is a terrible thing to do...
>  SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, TYPE_MAIN_VARIANT ( 
> type));

As the macro name indicates this takes a VAR_DECL, or an IDENTIFIER_NODE.  
You put in a type, that won't work.

You also simply override the type of the SSA name, without also caring for 
the type of the underlying variable that is (potentially!) associated with 
the SSA name; if those two disagree then issues will arise, you have to 
replace either the variables type (not advisable!), or the associated 
variable, with either nothing or a new variable (of the appropriate type), 
or an mere identifier.  Generally you can't modify SSA names in place like 
this, i.e. as Richi says, create new SSA names, replace all occurences of 
one with the other.


Ciao,
Michael.


Re: LTO Dead Field Elimination

2020-07-27 Thread David Brown
On 24/07/2020 17:43, Erick Ochoa wrote:
> This patchset brings back struct reorg to GCC.
> 
> We’ve been working on improving cache utilization recently and would
> like to share our current implementation to receive some feedback on it.
> 
> Essentially, we’ve implemented the following components:
> 
>     Type-based escape analysis to determine if we can reorganize a type
> at link-time
> 
>     Dead-field elimination to remove unused fields of a struct at link-time
> 
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external functions).
> 
> The dead-field elimination pass analyses non-escaping structs for fields
> that are not used in the linking unit and thus can be removed. The
> resulting struct has a smaller memory footprint, which allows for a
> higher cache utilization.
> 

I am very much a lurker on this list, and not a gcc developer - I am
just an enthusiastic user.  So my opinion here might not even be worth
the apocryphal two cents, and I won't feel insulted if someone says I
don't know what I am talking about here.

With that disclaimer out of the way, my immediate reaction to this idea
is "Why?".

What is the aim of this feature?  I can see it making many things a lot
more complicated, but I can't see it being of correspondingly
significant benefit.

Do you have reason to suppose that there really are many structs in use
in real code, for which there are fields that aren't used, and for which
removing those fields makes a /significant/ improvement to the
efficiency of the code?

If I have the history correct, gcc used to have an optimisation that
would let it re-order fields in a struct to reduce padding.  This got
removed in later versions because it made things so complicated in other
parts of compilation and linking, especially if some parts of the code
are compiled with different options.  Why would these new features not
suffer from exactly the same kinds of issues?

I worry that this kind of global optimisation has so many knock-on
effects that it simply is not worth the cost.  Remember that it is not
just in writing these patches that work must be done - people are going
to have to maintain the code for ever after, and it could mean changes
or limitations in other parts of gcc.

As I see it, there is a simpler path with much lower cost.  When I am
writing code, if I have poor struct layout or unnecessary fields, I
don't want the compiler to re-organise things behind my back.  I want to
be /told/.  Either I want the field there (perhaps I haven't written the
code that uses it yet), or I've made a mistake in my code.

My recommendation here would be to keep the analysis part - that could
be useful for other features too.  But drop the optimisation part -
replace it with a warning.  Tell the programmer, and let /them/ decide
whether or not the field is unnecessary.  Let the programmer make the
decision and the change - then you get all the benefits of the
optimisation with none of the risks or complications.

David



Re: Gcc Digest, Vol 5, Issue 52

2020-07-27 Thread Gary Oblock via Gcc
Almost all of the makes sense to.

I'm not sure what a conditionally initialized pointer is.

You mention VAR_DECL but I assume this is for
completeness and not something I'll run across
associated with a default def (but then again I don't
understand notion of a conditionally initialized
pointer.)

I'm at the moment only dealing with a single malloced
array of structures of the given type (though multiple types could have this 
property.) I intend to extend this to cover multiple array and static 
allocations but I need to get the easiest case working first. This means no 
side pointers are needed and if and when I need them pointer will get 
transformed into a base and index pair.

I intend to do the creation of new ssa names as a separate pass from the gimple 
transformations. So I will technically be creating for the duration of the pass 
possibly two defs associated with a single gimple statement. Do I need to 
delete the old ssa names
via some mechanism?

By the way this is really helpful information. The only
other person on the project, Erick, is a continent away
and has about as much experience with gimple as
me but a whole heck lot less compiler experience.

Thanks,

Gary


From: Gcc  on behalf of gcc-requ...@gcc.gnu.org 

Sent: Monday, July 27, 2020 1:33 AM
To: gcc@gcc.gnu.org 
Subject: Gcc Digest, Vol 5, Issue 52

[EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please 
be mindful of safe email handling and proprietary information protection 
practices.]


Send Gcc mailing list submissions to
gcc@gcc.gnu.org

To subscribe or unsubscribe via the World Wide Web, visit
http://gcc.gnu.org/mailman/listinfo/gcc
or, via email, send a message with subject or body 'help' to
gcc-requ...@gcc.gnu.org

You can reach the person managing the list at
gcc-ow...@gcc.gnu.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Gcc digest..."