Re: TImode for BITS_PER_WORD=32 targets
On Fri, Jul 24, 2020 at 5:38 PM Andrew Stubbs wrote: > > Hi all, > > I want amdgcn to be able to support int128 types, partly because they > might come up in code offloaded from x86_64 code, and partly because > libgomp now requires at least some support (amdgcn builds have been > failing since yesterday). > > But, amdgcn has 32-bit registers, and therefore defines BITS_PER_WORD to > 32, which means that TImode doesn't Just Work, at least not for all > operators. It already has TImode moves, for internal uses, so I can > enable TImode and fix the libgomp build, but now libgfortran tries to > use operators that don't exist, so I'm no better off. > > The expand pass won't emit libgcc calls, like it does for DImode, and > libgcc doesn't have the routines for it anyway. Neither does it support > synthesized shifts or rotates for more than double-word types. > (Multiple-word add and subtract appear to work fine, however.) > > What would be the best (least effort) way to implement this? > > I think I need shift, rotate, multiply, divide, and modulus, but there's > probably more. You've figured out that TImode support for SImode word_mode targets is not implemented in generic code. So what you need to do is either provide patterns for all of the operations you need or implement generic support for libgcc fallbacks (which isn't there). Joseph might have an idea what's missing and how difficult it would be (I suppose we do not want divti3 to end up calling divdi3, thus "stage" TImode support ontop of DImode ops eventually provided by libgcc only). libgcc2.c uses LIBGCC2_UNITS_PER_WORD so it _might_ be possible to somehow do this "staging" by providing two different values here. I guess you'd have to try. Richard. > Thanks, any advise will be appreciated. > > Andrew
Re: Problems with changing the type of an ssa name
On Sun, Jul 26, 2020 at 10:31 PM Gary Oblock wrote: > > Richard, > > As you know I'm working on a structure reorganization optimization. > The particular one I'm working on is called instance interleaving. > For the particular case I'm working on now, there is a single array > of structures being transformed, a pointer to an element of the > array is transformed into an index into what is now a structure > of arrays. Note, I did share my HL design document with you so > there are more details in there if you need them. So what all this > means is for this example > > typedef struct fu fu_t; > struct fu { > char x; > inty; > double z; > }; > : > : > fu_t *fubar = (fu_t*)malloc(...); > fu_t *baz; > > That fubar and baz no longer are pointer types and need to be > transformed into some integer type (say _index_fu_t.) Thus if > I encounter an ssa_name of type "fu_t *", I'll need to modify its > type be _index_fu_t. This is of course equivalent to replacing > that ssa name with a new one of type _index_fu_t. > > Now, how do I actually do either of these? My attempts at > former all failed and the later seems equally difficult for > the default defs. Note, prefer modifying them to replacing > them because it seems more reasonable and it also seems > to work except for the default defs. > > I really need some help with this Richard. OK, so modifying the SSA name in-place is really bad here since you _have_ to adjust all uses and defs anyway. Thus please create a new SSA name here. The default-def case you run into is either an uninitialized value which can easily appear with conditionally initialized pointers or the SSA name associated with the value of a function argument. Once you have to deal with a default def you have to create a new underlying VAR_DECL (or PARM_DECL if it was a parameter) with the new type and for the SSA replacement create its default def (get_or_create_ssa_default_def). Now for parameters this of course means you have to adjust function signatures and calls. For the function boundary case you'll likely need to pass a pointer to the structure as well which means you'll have to add parameters. As for "replacing" uses you can use immediate uses to walk them: FOR_EACH_IMM_USE_STMT (...) FOR_EACH_IMM_USE_ON_STMT (..) ... also the SSA definition statement after your transform cannot be the same so you have to create another stmt anyway, no? Richard. > Thanks, > > Gary > > From: Richard Biener > Sent: Saturday, July 25, 2020 10:48 PM > To: Gary Oblock ; gcc@gcc.gnu.org > Subject: Re: Problems with changing the type of an ssa name > > [EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please > be mindful of safe email handling and proprietary information protection > practices.] > > > On July 25, 2020 10:47:59 PM GMT+02:00, Gary Oblock > wrote: > >Richard, > > > >I suppose that might be doable but aren't there any ramifications > >from the fact that the problematic ssa_names are the default defs? > >I can imagine easily replacing all the ssa names except those that > >are default defs. > > Well, just changing the SSA names doesn't make it less ramifications. You > have to know what you are doing. > > So - what's the reason you need to change those SSA name types? > > Richard. > > >Gary > > > >From: Richard Biener > >Sent: Friday, July 24, 2020 11:16 PM > >To: Gary Oblock ; Gary Oblock via Gcc > >; gcc@gcc.gnu.org > >Subject: Re: Problems with changing the type of an ssa name > > > >[EXTERNAL EMAIL NOTICE: This email originated from an external sender. > >Please be mindful of safe email handling and proprietary information > >protection practices.] > > > > > >On July 25, 2020 7:30:48 AM GMT+02:00, Gary Oblock via Gcc > > wrote: > >>If you've followed what I've been up to via my questions > >>on the mailing list, I finally traced my latest big problem > >>back to to my own code. In a nut shell here is what > >>I'm doing. > >> > >>I'm creating a new type exaactly like this: > >> > >>tree pointer_rep = > >> make_signed_type ( TYPE_PRECISION ( pointer_sized_int_node)); > >>TYPE_MAIN_VARIANT ( pointer_rep) = > >> TYPE_MAIN_VARIANT ( pointer_sized_int_node); > >>const char *gcc_name = > >>identifier_to_locale ( IDENTIFIER_POINTER ( TYPE_NAME ( > >>ri->gcc_type))); > >>size_t len = > >> strlen ( REORG_SP_PTR_PREFIX) + strlen ( gcc_name); > >>char *name = ( char *)alloca(len + 1); > >>strcpy ( name, REORG_SP_PTR_PREFIX); > >>strcat ( name, gcc_name); > >>TYPE_NAME ( pointer_rep) = get_identifier ( name); > >> > >>I detect an ssa_name that I want to change to have this type > >>and change it thusly. Note, this particular ssa_name is a > >>default def which I seems to be very pertinent (since it's > >>the only case that fails.) > >> > >>modify_ssa_name_type ( an_ssa_name, pointer_rep); > >> > >>void > >>modify_ssa
Re: Non-inlined functions and mixed architectures
* Allan Sandfeld Jensen: > A problem that I keep running into is functions defined headers, but used in > sources files that are compiled with different CPU feature flags (for runtime > CPU feature selection). > > We know to make sure the functions are inlinable and their address never > taken, but of course in debug builds they are still not inlined. Every so > often the functions get compiled using some of the optional CPU instructions, > and if the linker selects the optimized versions those instructions can then > leak through to instances compiled with different CPU flags where the > instructions aren't supposed to be used. This happens even in unoptimized > debug builds as the extended instruction selections doesn't count as an > optimization. You need to provide source code examples. This isn't supposed to happen if you declare the functions as static inline. If a function is emitted for any reason, it will be local this particular object file. Plain inline (for C++) works differently and will attempt to share implementations. Thanks, Florian
Re: Non-inlined functions and mixed architectures
On Montag, 27. Juli 2020 10:33:35 CEST Florian Weimer wrote: > * Allan Sandfeld Jensen: > > A problem that I keep running into is functions defined headers, but used > > in sources files that are compiled with different CPU feature flags (for > > runtime CPU feature selection). > > > > We know to make sure the functions are inlinable and their address never > > taken, but of course in debug builds they are still not inlined. Every so > > often the functions get compiled using some of the optional CPU > > instructions, and if the linker selects the optimized versions those > > instructions can then leak through to instances compiled with different > > CPU flags where the instructions aren't supposed to be used. This happens > > even in unoptimized debug builds as the extended instruction selections > > doesn't count as an optimization. > > You need to provide source code examples. This isn't supposed to happen > if you declare the functions as static inline. If a function is emitted > for any reason, it will be local this particular object file. > > Plain inline (for C++) works differently and will attempt to share > implementations. > static inline? Hadn't thought of that in a shared header file. Is harder to do with inline methods in C++ classes though. A recent example I hit into was methods using a qfloat16 class that specializes for F16C when available, see https://codereview.qt-project.org/c/ qt/qtbase/+/307772. Which I guess ought to be split into different classes with different constructors, so they don't violate ODR rules to be really safe across compilers. But I guess a case like https://codereview.qt-project.org/c/qt/qtbase/+/308163 could be solved with static inline instead. Best regards Allan
Re: Non-inlined functions and mixed architectures
* Allan Sandfeld Jensen: > On Montag, 27. Juli 2020 10:33:35 CEST Florian Weimer wrote: >> * Allan Sandfeld Jensen: >> > A problem that I keep running into is functions defined headers, but used >> > in sources files that are compiled with different CPU feature flags (for >> > runtime CPU feature selection). >> > >> > We know to make sure the functions are inlinable and their address never >> > taken, but of course in debug builds they are still not inlined. Every so >> > often the functions get compiled using some of the optional CPU >> > instructions, and if the linker selects the optimized versions those >> > instructions can then leak through to instances compiled with different >> > CPU flags where the instructions aren't supposed to be used. This happens >> > even in unoptimized debug builds as the extended instruction selections >> > doesn't count as an optimization. >> >> You need to provide source code examples. This isn't supposed to happen >> if you declare the functions as static inline. If a function is emitted >> for any reason, it will be local this particular object file. >> >> Plain inline (for C++) works differently and will attempt to share >> implementations. >> > static inline? Hadn't thought of that in a shared header file. > > Is harder to do with inline methods in C++ classes though. Ahh, and anonymous namespaces (the equivalent for that for member functions) do not work in such cases because the representation of the class still needs to be shared across API boundaries. With an anonymous namspace, that would be undefined. Thanks, Florian
Tar version being used
Hi everyone, I'm wondering if someone knows which tar version / configuration was being used when creating gcc-10.2.0 tarballs ? I'm getting some directory checksum errors while trying to unpack it with the AIX tar (which can be a bit old). But they are disappearing when I'm building these tarballs on Ubuntu-18.04, even with the last tar version 1.32. Note that gcc-10.1.0 doesn't have these problems, so maybe something have changed since. Sincerely Clément Chigot ATOS Bull SAS 1 rue de Provence - 38432 Échirolles - France
Re: Tar version being used
On Mon, 27 Jul 2020 at 11:59, CHIGOT, CLEMENT via Gcc wrote: > > Hi everyone, > > I'm wondering if someone knows which tar version / configuration was being > used when creating gcc-10.2.0 tarballs ? GNU tar 1.30, I think. > > I'm getting some directory checksum errors while trying to unpack it with the > AIX tar (which can be a bit old). But they are disappearing when I'm building > these tarballs on Ubuntu-18.04, even with the last tar version 1.32. https://gcc.gnu.org/install/prerequisites.html says: "GNU tar version 1.14 (or later) Necessary (only on some platforms) to untar the source code. Many systems’ tar programs will also work, only try GNU tar if you have problems." > Note that gcc-10.1.0 doesn't have these problems, so maybe something have > changed since. I think it's the same version of GNU tar on the same system.
Re: Tar version being used
On Mon, 27 Jul 2020 at 12:34, Jonathan Wakely wrote: > > On Mon, 27 Jul 2020 at 11:59, CHIGOT, CLEMENT via Gcc wrote: > > > > Hi everyone, > > > > I'm wondering if someone knows which tar version / configuration was being > > used when creating gcc-10.2.0 tarballs ? > > GNU tar 1.30, I think. I've been informed that the tarball isn't necessarily created on the gcc.gnu.org server, and so it could be using a different version of tar. The 1.30 version was indeed used for GCC 10.1.0 though/ The advice about using GNU tar still holds though ... > > > > > I'm getting some directory checksum errors while trying to unpack it with > > the AIX tar (which can be a bit old). But they are disappearing when I'm > > building these tarballs on Ubuntu-18.04, even with the last tar version > > 1.32. > > https://gcc.gnu.org/install/prerequisites.html says: > > "GNU tar version 1.14 (or later) > Necessary (only on some platforms) to untar the source code. Many > systems’ tar programs will also work, only try GNU tar if you have > problems." > > > > Note that gcc-10.1.0 doesn't have these problems, so maybe something have > > changed since. > > I think it's the same version of GNU tar on the same system.
Re: Tar version being used
On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc wrote: > > Hi everyone, > > I'm wondering if someone knows which tar version / configuration was being > used when creating gcc-10.2.0 tarballs ? > > I'm getting some directory checksum errors while trying to unpack it with the > AIX tar (which can be a bit old). But they are disappearing when I'm building > these tarballs on Ubuntu-18.04, even with the last tar version 1.32. > > Note that gcc-10.1.0 doesn't have these problems, so maybe something have > changed since. I have used tar 1.30 as shipped by openSUSE Leap 15.1 (tar-1.30-lp151.2.1.x86_64) Richard. > Sincerely > > > Clément Chigot > ATOS Bull SAS > 1 rue de Provence - 38432 Échirolles - France
Re: Tar version being used
> On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc > wrote: > > > > Hi everyone, > > > > I'm wondering if someone knows which tar version / configuration was being > > used when creating gcc-10.2.0 tarballs ? > > > > I'm getting some directory checksum errors while trying to unpack it with > > the AIX tar (which can be a bit old). But they are disappearing when I'm > > building these tarballs on Ubuntu-18.04, even with the last tar version > > 1.32. > > > > Note that gcc-10.1.0 doesn't have these problems, so maybe something have > > changed since. > > I have used tar 1.30 as shipped by openSUSE Leap 15.1 > (tar-1.30-lp151.2.1.x86_64) Ok thanks. I don't know what have changed between the two versions. But AIX tar doesn't like it... Using AIX tar instead of GNU tar was a mistake anyway, so that's not a big deal. Thanks all for your answers ! Clément
Re: Tar version being used
On Mon, Jul 27, 2020 at 7:58 AM Richard Biener via Gcc wrote: > > On Mon, Jul 27, 2020 at 12:59 PM CHIGOT, CLEMENT via Gcc > wrote: > > > > Hi everyone, > > > > I'm wondering if someone knows which tar version / configuration was being > > used when creating gcc-10.2.0 tarballs ? > > > > I'm getting some directory checksum errors while trying to unpack it with > > the AIX tar (which can be a bit old). But they are disappearing when I'm > > building these tarballs on Ubuntu-18.04, even with the last tar version > > 1.32. > > > > Note that gcc-10.1.0 doesn't have these problems, so maybe something have > > changed since. > > I have used tar 1.30 as shipped by openSUSE Leap 15.1 > (tar-1.30-lp151.2.1.x86_64) GNU tar is listed as a requirement to install GCC from source: https://gcc.gnu.org/install/prerequisites.html Thanks, David
Re: LTO Dead Field Elimination
On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa wrote: > > This patchset brings back struct reorg to GCC. > > We’ve been working on improving cache utilization recently and would > like to share our current implementation to receive some feedback on it. > > Essentially, we’ve implemented the following components: > > Type-based escape analysis to determine if we can reorganize a type > at link-time > > Dead-field elimination to remove unused fields of a struct at > link-time > > The type-based escape analysis provides a list of types, that are not > visible outside of the current linking unit (e.g. parameter types of > external functions). > > The dead-field elimination pass analyses non-escaping structs for fields > that are not used in the linking unit and thus can be removed. The > resulting struct has a smaller memory footprint, which allows for a > higher cache utilization. > > As a side-effect a couple of new infrastructure code has been written > (e.g. a type walker, which we were really missing in GCC), which can be > of course reused for other passes as well. > > We’ve prepared a patchset in the following branch: > >refs/vendors/ARM/heads/arm-struct-reorg-wip Just had some time to peek into this. Ugh. The code doesn't look like GCC code looks :/ It doesn't help to have one set of files per C++ class (25!). The code itself is undocumented - it's hard to understand what the purpose of all the Walker stuff is. You also didn't seem to know walk_tree () nor walk_gimple* (). Take as example - I figured to look for IPA pass entries, then I see + +static void +collect_types () +{ + GimpleTypeCollector collector; + collector.walk (); + collector.print_collected (); + ptrset_t types = collector.get_pointer_set (); + GimpleCaster caster (types); + caster.walk (); + if (flag_print_cast_analysis) +caster.print_reasons (); + ptrset_t casting = caster.get_sets (); + fix_escaping_types_in_set (casting); + GimpleAccesser accesser; + accesser.walk (); + if (flag_print_access_analysis) +accesser.print_accesses (); + record_field_map_t record_field_map = accesser.get_map (); + TypeIncompleteEquality equality; + bool has_fields_that_can_be_deleted = false; + typedef std::set field_offsets_t; there's no comments (not even file-level) that explains how type escape is computed. Sorry, but this isn't even close to be coarsely reviewable. > We’ve also added a subsection in the GCC internals document to allow > other compiler devs to better understand our design and implementation. > A generated PDF can be found here: > > https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F > https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download > > page: 719 > > We’ve been testing the pass against a range of in-tree tests and > real-life applications (e.g. all SPEC CPU2017 C benchmarks). For > testing, please see testing subsection in the gcc internals we prepared. > > Currently we see the following limitations: > > * It is not a "true" ipa pass yes. That is, we can only succeed with > -flto-partition=none. > * Currently it is not safe to use -fipa-sra. > * Brace constructors not supported now. We handle this gracefully. > * Only C as of now. > * Results of sizeof() and offsetof() are generated in the compiler > frontend and thus can’t be changed later at link time. There are a > couple of ideas to resolve this, but that’s currently unimplemented. > * At this point we’d like to thank the GCC community for their patient > help so far on the mailing list and in other channels. And we ask for > your support in terms of feedback, comments and testing. > > Thanks!
Re: LTO Dead Field Elimination
On 27/07/2020 14:36, Richard Biener wrote: On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa wrote: This patchset brings back struct reorg to GCC. We’ve been working on improving cache utilization recently and would like to share our current implementation to receive some feedback on it. Essentially, we’ve implemented the following components: Type-based escape analysis to determine if we can reorganize a type at link-time Dead-field elimination to remove unused fields of a struct at link-time The type-based escape analysis provides a list of types, that are not visible outside of the current linking unit (e.g. parameter types of external functions). The dead-field elimination pass analyses non-escaping structs for fields that are not used in the linking unit and thus can be removed. The resulting struct has a smaller memory footprint, which allows for a higher cache utilization. As a side-effect a couple of new infrastructure code has been written (e.g. a type walker, which we were really missing in GCC), which can be of course reused for other passes as well. We’ve prepared a patchset in the following branch: refs/vendors/ARM/heads/arm-struct-reorg-wip Just had some time to peek into this. Ugh. The code doesn't look like GCC code looks :/ It doesn't help to have one set of files per C++ class (25!). The code itself is undocumented - it's hard to understand what the purpose of all the Walker stuff is. You also didn't seem to know walk_tree () nor walk_gimple* (). Take as example - I figured to look for IPA pass entries, then I see + +static void +collect_types () +{ + GimpleTypeCollector collector; + collector.walk (); + collector.print_collected (); + ptrset_t types = collector.get_pointer_set (); + GimpleCaster caster (types); + caster.walk (); + if (flag_print_cast_analysis) +caster.print_reasons (); + ptrset_t casting = caster.get_sets (); + fix_escaping_types_in_set (casting); + GimpleAccesser accesser; + accesser.walk (); + if (flag_print_access_analysis) +accesser.print_accesses (); + record_field_map_t record_field_map = accesser.get_map (); + TypeIncompleteEquality equality; + bool has_fields_that_can_be_deleted = false; + typedef std::set field_offsets_t; there's no comments (not even file-level) that explains how type escape is computed. Sorry, but this isn't even close to be coarsely reviewable. Thanks Richard! I will work on making this more readable. I understand your comments and you are right. I am currently in the process of writing a human-readable PDF with an overview of this. There is already a (somewhat hurried) version of this PDF in the GCC internals link we shared. This should answer most of the high level questions about how the type escape analysis is computed. However, yes, there is still work to be done. I will start looking into walk_tree and walk_gimple* and see if the gimple_walker I wrote is redundant and substitute it. Thanks for the input! We’ve also added a subsection in the GCC internals document to allow other compiler devs to better understand our design and implementation. A generated PDF can be found here: https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download page: 719 We’ve been testing the pass against a range of in-tree tests and real-life applications (e.g. all SPEC CPU2017 C benchmarks). For testing, please see testing subsection in the gcc internals we prepared. Currently we see the following limitations: * It is not a "true" ipa pass yes. That is, we can only succeed with -flto-partition=none. * Currently it is not safe to use -fipa-sra. * Brace constructors not supported now. We handle this gracefully. * Only C as of now. * Results of sizeof() and offsetof() are generated in the compiler frontend and thus can’t be changed later at link time. There are a couple of ideas to resolve this, but that’s currently unimplemented. * At this point we’d like to thank the GCC community for their patient help so far on the mailing list and in other channels. And we ask for your support in terms of feedback, comments and testing. Thanks!
Re: LTO Dead Field Elimination
Hi Richard, On 7/27/20 2:36 PM, Richard Biener wrote: > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa > wrote: >> >> This patchset brings back struct reorg to GCC. >> >> We’ve been working on improving cache utilization recently and would >> like to share our current implementation to receive some feedback on it. >> >> Essentially, we’ve implemented the following components: >> >> Type-based escape analysis to determine if we can reorganize a type >> at link-time >> >> Dead-field elimination to remove unused fields of a struct at >> link-time >> >> The type-based escape analysis provides a list of types, that are not >> visible outside of the current linking unit (e.g. parameter types of >> external functions). >> >> The dead-field elimination pass analyses non-escaping structs for fields >> that are not used in the linking unit and thus can be removed. The >> resulting struct has a smaller memory footprint, which allows for a >> higher cache utilization. >> >> As a side-effect a couple of new infrastructure code has been written >> (e.g. a type walker, which we were really missing in GCC), which can be >> of course reused for other passes as well. >> >> We’ve prepared a patchset in the following branch: >> >>refs/vendors/ARM/heads/arm-struct-reorg-wip > > Just had some time to peek into this. Ugh. The code doesn't look like > GCC code looks :/ It doesn't help to have one set of files per C++ class > (25!). Any suggestions how to best structure these? Are there some coding guidelines in the GCC project, which can help us to match the expectation? > The code itself is undocumented - it's hard to understand what the purpose > of all the Walker stuff is. > > You also didn't seem to know walk_tree () nor walk_gimple* (). True, we were not aware of that code. Thanks for pointing to that code. We will have a look. > Take as example - I figured to look for IPA pass entries, then I see > > + > +static void > +collect_types () > +{ > + GimpleTypeCollector collector; > + collector.walk (); > + collector.print_collected (); > + ptrset_t types = collector.get_pointer_set (); > + GimpleCaster caster (types); > + caster.walk (); > + if (flag_print_cast_analysis) > +caster.print_reasons (); > + ptrset_t casting = caster.get_sets (); > + fix_escaping_types_in_set (casting); > + GimpleAccesser accesser; > + accesser.walk (); > + if (flag_print_access_analysis) > +accesser.print_accesses (); > + record_field_map_t record_field_map = accesser.get_map (); > + TypeIncompleteEquality equality; > + bool has_fields_that_can_be_deleted = false; > + typedef std::set field_offsets_t; > > there's no comments (not even file-level) that explains how type escape > is computed. > > Sorry, but this isn't even close to be coarsely reviewable. Sad to hear. We'll work on the input that you provided and provide a new version. Thanks, Christoph > >> We’ve also added a subsection in the GCC internals document to allow >> other compiler devs to better understand our design and implementation. >> A generated PDF can be found here: >> >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download >> >> page: 719 >> >> We’ve been testing the pass against a range of in-tree tests and >> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For >> testing, please see testing subsection in the gcc internals we prepared. >> >> Currently we see the following limitations: >> >> * It is not a "true" ipa pass yes. That is, we can only succeed with >> -flto-partition=none. >> * Currently it is not safe to use -fipa-sra. >> * Brace constructors not supported now. We handle this gracefully. >> * Only C as of now. >> * Results of sizeof() and offsetof() are generated in the compiler >> frontend and thus can’t be changed later at link time. There are a >> couple of ideas to resolve this, but that’s currently unimplemented. >> * At this point we’d like to thank the GCC community for their patient >> help so far on the mailing list and in other channels. And we ask for >> your support in terms of feedback, comments and testing. >> >> Thanks!
Re: LTO Dead Field Elimination
On Mon, Jul 27, 2020 at 02:52:21PM +0200, Erick Ochoa wrote: > I will work on making this more readable. I understand your comments and you > are right. I am currently in the process of writing a human-readable PDF > with an overview of this. There is already a (somewhat hurried) version of > this PDF in the GCC internals link we shared. This should answer most of the > high level questions about how the type escape analysis is computed. > However, yes, there is still work to be done. I will start looking into > walk_tree and walk_gimple* and see if the gimple_walker I wrote is redundant > and substitute it. The overview doesn't belong in some on the side document, but should be in the source (usually at the start of the source file that contains the pass). And then each function should be documented. I'd also like to stress what has been seen several times in the past, but is being not taken into account. One of the reasons why old ipa struct reorg has been removed is that it worked on the types granularity, if it is safe to adjust all references to all objects of a certain type, then do it, otherwise give up. That is fundamentally wrong, it can't be useful except by pure luck on some benchmark. Instead, it should work on the object (or sets of those) granularity. If the analysis phase determines all accesses to certain object (automatic or global variable, or e.g. memory allocation) can be adjusted by using a different type for it (with either fields reordered, or some removed etc.), then it should be optimized regardless of whether other objects can be optimized similarly or not. In other cases, it will not be possible to optimize a single object on its own, e.g. some function is called with an address of one of 10 objects, so in that case put those 10 objects into a set and either optimize those together if possible, or punt on those together. But only very seldom all objects of certain type can be optimized the same way, and even if they do, perhaps it is better to optimize some of them even further (e.g. you find none of the objects of type XYZ use field abc, and decide to adjust the type and remove the field. But perhaps 50% of objects of that type also don't use def field and if those were optimized separately, you could optimize that too). Jakub
Re: LTO Dead Field Elimination
On Mon, Jul 27, 2020 at 2:59 PM Christoph Müllner wrote: > > Hi Richard, > > On 7/27/20 2:36 PM, Richard Biener wrote: > > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa > > wrote: > >> > >> This patchset brings back struct reorg to GCC. > >> > >> We’ve been working on improving cache utilization recently and would > >> like to share our current implementation to receive some feedback on it. > >> > >> Essentially, we’ve implemented the following components: > >> > >> Type-based escape analysis to determine if we can reorganize a type > >> at link-time > >> > >> Dead-field elimination to remove unused fields of a struct at > >> link-time > >> > >> The type-based escape analysis provides a list of types, that are not > >> visible outside of the current linking unit (e.g. parameter types of > >> external functions). > >> > >> The dead-field elimination pass analyses non-escaping structs for fields > >> that are not used in the linking unit and thus can be removed. The > >> resulting struct has a smaller memory footprint, which allows for a > >> higher cache utilization. > >> > >> As a side-effect a couple of new infrastructure code has been written > >> (e.g. a type walker, which we were really missing in GCC), which can be > >> of course reused for other passes as well. > >> > >> We’ve prepared a patchset in the following branch: > >> > >>refs/vendors/ARM/heads/arm-struct-reorg-wip > > > > Just had some time to peek into this. Ugh. The code doesn't look like > > GCC code looks :/ It doesn't help to have one set of files per C++ class > > (25!). > > Any suggestions how to best structure these? As "bad" as it sounds, put everything into one file (maybe separate out type escape analysis from the actual transform). Add a toplevel comment per file explaining things. > Are there some coding guidelines in the GCC project, > which can help us to match the expectation? Look at existing passes, otherwise there's mostly conventions on formatting. > > The code itself is undocumented - it's hard to understand what the purpose > > of all the Walker stuff is. > > > > You also didn't seem to know walk_tree () nor walk_gimple* (). > > True, we were not aware of that code. > Thanks for pointing to that code. > We will have a look. > > > Take as example - I figured to look for IPA pass entries, then I see > > > > + > > +static void > > +collect_types () > > +{ > > + GimpleTypeCollector collector; > > + collector.walk (); > > + collector.print_collected (); > > + ptrset_t types = collector.get_pointer_set (); > > + GimpleCaster caster (types); > > + caster.walk (); > > + if (flag_print_cast_analysis) > > +caster.print_reasons (); > > + ptrset_t casting = caster.get_sets (); > > + fix_escaping_types_in_set (casting); > > + GimpleAccesser accesser; > > + accesser.walk (); > > + if (flag_print_access_analysis) > > +accesser.print_accesses (); > > + record_field_map_t record_field_map = accesser.get_map (); > > + TypeIncompleteEquality equality; > > + bool has_fields_that_can_be_deleted = false; > > + typedef std::set field_offsets_t; > > > > there's no comments (not even file-level) that explains how type escape > > is computed. > > > > Sorry, but this isn't even close to be coarsely reviewable. > > Sad to hear. > We'll work on the input that you provided and provide a new version. > > Thanks, > Christoph > > > > >> We’ve also added a subsection in the GCC internals document to allow > >> other compiler devs to better understand our design and implementation. > >> A generated PDF can be found here: > >> > >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F > >> https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download > >> > >> page: 719 > >> > >> We’ve been testing the pass against a range of in-tree tests and > >> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For > >> testing, please see testing subsection in the gcc internals we prepared. > >> > >> Currently we see the following limitations: > >> > >> * It is not a "true" ipa pass yes. That is, we can only succeed with > >> -flto-partition=none. > >> * Currently it is not safe to use -fipa-sra. > >> * Brace constructors not supported now. We handle this gracefully. > >> * Only C as of now. > >> * Results of sizeof() and offsetof() are generated in the compiler > >> frontend and thus can’t be changed later at link time. There are a > >> couple of ideas to resolve this, but that’s currently unimplemented. > >> * At this point we’d like to thank the GCC community for their patient > >> help so far on the mailing list and in other channels. And we ask for > >> your support in terms of feedback, comments and testing. > >> > >> Thanks!
Re: LTO Dead Field Elimination
On Mon, Jul 27, 2020 at 9:03 AM Christoph Müllner wrote: > > Hi Richard, > > On 7/27/20 2:36 PM, Richard Biener wrote: > > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa > > wrote: > >> > >> This patchset brings back struct reorg to GCC. > >> > >> We’ve been working on improving cache utilization recently and would > >> like to share our current implementation to receive some feedback on it. > >> > >> Essentially, we’ve implemented the following components: > >> > >> Type-based escape analysis to determine if we can reorganize a type > >> at link-time > >> > >> Dead-field elimination to remove unused fields of a struct at > >> link-time > >> > >> The type-based escape analysis provides a list of types, that are not > >> visible outside of the current linking unit (e.g. parameter types of > >> external functions). > >> > >> The dead-field elimination pass analyses non-escaping structs for fields > >> that are not used in the linking unit and thus can be removed. The > >> resulting struct has a smaller memory footprint, which allows for a > >> higher cache utilization. > >> > >> As a side-effect a couple of new infrastructure code has been written > >> (e.g. a type walker, which we were really missing in GCC), which can be > >> of course reused for other passes as well. > >> > >> We’ve prepared a patchset in the following branch: > >> > >>refs/vendors/ARM/heads/arm-struct-reorg-wip > > > > Just had some time to peek into this. Ugh. The code doesn't look like > > GCC code looks :/ It doesn't help to have one set of files per C++ class > > (25!). > > Any suggestions how to best structure these? > Are there some coding guidelines in the GCC project, > which can help us to match the expectation? Thanks for working on this feature in GCC. The coding standards for the GNU Project and for GCC are: https://www.gnu.org/prep/standards/ https://gcc.gnu.org/codingconventions.html As others have mentioned, the documentation should be in the source code itself. An external design document is nice, but the code should seemlessly integrate with the rest of the GCC source code, not a separate, self-contained set of files bolted onto the side. Thanks, David
Re: Problems with changing the type of an ssa name
Hello, On Sat, 25 Jul 2020, Gary Oblock via Gcc wrote: > if ( TYPE_P ( type) ) > { >TREE_TYPE ( ssa_name) = TYPE_MAIN_VARIANT ( type); >if ( ssa_defined_default_def_p ( ssa_name) ) > { > // I guessing which I know is a terrible thing to do... > SET_SSA_NAME_VAR_OR_IDENTIFIER ( ssa_name, TYPE_MAIN_VARIANT ( > type)); As the macro name indicates this takes a VAR_DECL, or an IDENTIFIER_NODE. You put in a type, that won't work. You also simply override the type of the SSA name, without also caring for the type of the underlying variable that is (potentially!) associated with the SSA name; if those two disagree then issues will arise, you have to replace either the variables type (not advisable!), or the associated variable, with either nothing or a new variable (of the appropriate type), or an mere identifier. Generally you can't modify SSA names in place like this, i.e. as Richi says, create new SSA names, replace all occurences of one with the other. Ciao, Michael.
Re: LTO Dead Field Elimination
On 24/07/2020 17:43, Erick Ochoa wrote: > This patchset brings back struct reorg to GCC. > > We’ve been working on improving cache utilization recently and would > like to share our current implementation to receive some feedback on it. > > Essentially, we’ve implemented the following components: > > Type-based escape analysis to determine if we can reorganize a type > at link-time > > Dead-field elimination to remove unused fields of a struct at link-time > > The type-based escape analysis provides a list of types, that are not > visible outside of the current linking unit (e.g. parameter types of > external functions). > > The dead-field elimination pass analyses non-escaping structs for fields > that are not used in the linking unit and thus can be removed. The > resulting struct has a smaller memory footprint, which allows for a > higher cache utilization. > I am very much a lurker on this list, and not a gcc developer - I am just an enthusiastic user. So my opinion here might not even be worth the apocryphal two cents, and I won't feel insulted if someone says I don't know what I am talking about here. With that disclaimer out of the way, my immediate reaction to this idea is "Why?". What is the aim of this feature? I can see it making many things a lot more complicated, but I can't see it being of correspondingly significant benefit. Do you have reason to suppose that there really are many structs in use in real code, for which there are fields that aren't used, and for which removing those fields makes a /significant/ improvement to the efficiency of the code? If I have the history correct, gcc used to have an optimisation that would let it re-order fields in a struct to reduce padding. This got removed in later versions because it made things so complicated in other parts of compilation and linking, especially if some parts of the code are compiled with different options. Why would these new features not suffer from exactly the same kinds of issues? I worry that this kind of global optimisation has so many knock-on effects that it simply is not worth the cost. Remember that it is not just in writing these patches that work must be done - people are going to have to maintain the code for ever after, and it could mean changes or limitations in other parts of gcc. As I see it, there is a simpler path with much lower cost. When I am writing code, if I have poor struct layout or unnecessary fields, I don't want the compiler to re-organise things behind my back. I want to be /told/. Either I want the field there (perhaps I haven't written the code that uses it yet), or I've made a mistake in my code. My recommendation here would be to keep the analysis part - that could be useful for other features too. But drop the optimisation part - replace it with a warning. Tell the programmer, and let /them/ decide whether or not the field is unnecessary. Let the programmer make the decision and the change - then you get all the benefits of the optimisation with none of the risks or complications. David
Re: Gcc Digest, Vol 5, Issue 52
Almost all of the makes sense to. I'm not sure what a conditionally initialized pointer is. You mention VAR_DECL but I assume this is for completeness and not something I'll run across associated with a default def (but then again I don't understand notion of a conditionally initialized pointer.) I'm at the moment only dealing with a single malloced array of structures of the given type (though multiple types could have this property.) I intend to extend this to cover multiple array and static allocations but I need to get the easiest case working first. This means no side pointers are needed and if and when I need them pointer will get transformed into a base and index pair. I intend to do the creation of new ssa names as a separate pass from the gimple transformations. So I will technically be creating for the duration of the pass possibly two defs associated with a single gimple statement. Do I need to delete the old ssa names via some mechanism? By the way this is really helpful information. The only other person on the project, Erick, is a continent away and has about as much experience with gimple as me but a whole heck lot less compiler experience. Thanks, Gary From: Gcc on behalf of gcc-requ...@gcc.gnu.org Sent: Monday, July 27, 2020 1:33 AM To: gcc@gcc.gnu.org Subject: Gcc Digest, Vol 5, Issue 52 [EXTERNAL EMAIL NOTICE: This email originated from an external sender. Please be mindful of safe email handling and proprietary information protection practices.] Send Gcc mailing list submissions to gcc@gcc.gnu.org To subscribe or unsubscribe via the World Wide Web, visit http://gcc.gnu.org/mailman/listinfo/gcc or, via email, send a message with subject or body 'help' to gcc-requ...@gcc.gnu.org You can reach the person managing the list at gcc-ow...@gcc.gnu.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Gcc digest..."