https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119833

            Bug ID: 119833
           Summary: Clarify which semantics offloading compilation does
                    (not) inherit from using the LTO infrastructure
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: openacc, openmp
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    rguenth at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #117010 +++

(In reply to myself from bug 117010, comment #4)
> Jakub, Richi, C++/offloading question.  For the small test case posted here,
> for 'V<0>::V()' I see in the '-O0' x86_64 host code:
>
>             .section .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>             .align 2
>             .weak   _ZN1VILi0EEC2Ev
>             .type   _ZN1VILi0EEC2Ev, @function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .weak   _ZN1VILi0EEC1Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
>
> That is, weak definitions of '_ZN1VILi0EEC2Ev' and its alias
> '_ZN1VILi0EEC1Ev' (which gets called from 'foo').
>
> Likewise, I see weak definitions, if compiling such code for GCN target:
>
>             .section .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>             .align 4
>             .weak   _ZN1VILi0EEC2Ev
>             .type   _ZN1VILi0EEC2Ev,@function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .weak   _ZN1VILi0EEC1Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
>
> ..., so that appears consistent.
>
> For nvptx target (with '-malias'), I see:
>
>     .weak .func _ZN1VILi0EEC1Ev (.param.u64 %in_ar0)
>     {
>     [...]
>     }
>
> That is, it directly emits the (used) '_ZN1VILi0EEC1Ev' constructor, instead
> of emitting '_ZN1VILi0EEC2Ev' and then aliasing the former to the latter.

(See bug 117010, comment #9 for the x86_64, or GCN vs. nvptx target
difference.)

> Now, the observation/question: compiling this code for offloading (as
> originally reported), I see for GCN offloading:
>
>             .text
>     [...]
>             .type   _ZN1VILi0EEC2Ev,@function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
>
> That is, '_ZN1VILi0EEC2Ev' and its alias '_ZN1VILi0EEC1Ev' are now strong
> instead of weak definitions.

Similarly for nvptx offloading:

    .func _ZN1VILi0EEC2Ev (.param.u64 %in_ar0)
    {
    [...]

... is then non-'.weak'.

> Is this expected, or unexpected, and
> potentially problematic?

(In reply to myself from bug 117010, comment #6)
> [Looking for an explanation] why "weak" and "comdat" get lost in the GCN
> offloading path?  GCN
> (ELF) does support all these things (to the best of my knowledge).  (Let's
> ignore nvptx for this moment.)  I'll thus analyze offload stream-out,
> stream-in etc.

(In reply to myself from bug 117010, comment #7)
> First observation: the same (per my understanding) happens with LTO: compile
> this code, still at '-O0' with '-foffload=disable' but with '-flto', and see
> the x86_64 '[...].ltrans0.ltrans.s' file:
>
>             .text
>     [...]
>             .type   _ZN1VILi0EEC2Ev, @function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
>
> Could this be due to whole-program optimization, enabled by LTO?  (But
> '-O0'?)

(In reply to myself from bug 117010, comment #8)
> Well, indeed.  Offloading code generation uses the LTO machinery, including
> the 'lto1' front end, and thus has 'gcc/common.opt:in_lto_p' set to 'true':
>
>     ; True if this is the lto front end.  This is used to disable gimple
>     ; generation and lowering passes that are normally run on the output
>     ; of a front end.  These passes must be bypassed for lto since they
>     ; have already been done before the gimple was written.
>     Variable
>     bool in_lto_p = false
>
> The "weak", "comdat" transformations are described at the high level in
> 'gcc/doc/lto.texi':
>
>     The whole program mode assumptions are slightly more complex in
>     C++, where inline functions in headers are put into @emph{COMDAT}
>     sections.  COMDAT function and variables can be defined by
>     multiple object files and their bodies are unified at link-time
>     and dynamic link-time.  COMDAT functions are changed to local only
>     when their address is not taken and thus un-sharing them with a
>     library is not harmful.  [...]
>
> If I force-disable 'pass_ipa_whole_program_visibility':
>
>     --- gcc/ipa-visibility.cc
>     +++ gcc/ipa-visibility.cc
>     @@ -993,4 +993,7 @@ public:
>        unsigned int execute (function *) final override
>          {
>     +#ifdef ACCEL_COMPILER
>     +      return 0;
>     +#endif
>            return whole_program_function_and_variable_visibility ();
>          }
>
> ..., then we get the expected 'diff' for GCN offloading compilation's
> '[...].xamdgcn-amdhsa.mkoffload.082i.whole-program' (and similar for nvptx
> offloading compilation's '[...].xnvptx-none.mkoffload.082i.whole-program'):
>
>     -Marking local functions: __ct_comp /2 __ct_base /1
>     [...]
>     @@ -49,22 +40,24 @@
>      _ZN1VILi0EEC1Ev/2 (__ct_comp )
>        Type: function definition analyzed alias
>     -  Visibility: semantic_interposition prevailing_def_ironly
>     +  Visibility: externally_visible semantic_interposition public weak comdat comdat_group:_ZN1VILi0EEC5Ev one_only
>     +  Same comdat group as: _ZN1VILi0EEC2Ev/1
>        References: _ZN1VILi0EEC2Ev/1 (alias)
>        Referring:
>        Read from file: pr117010-1_.o
>     -  Availability: local
>     +  Availability: available
>        Unit id: 1
>     -  Function flags: local
>     +  Function flags:
>        Called by: _Z3foov/3
>        Calls:
>      _ZN1VILi0EEC2Ev/1 (__ct_base )
>        Type: function definition analyzed
>     -  Visibility: semantic_interposition no_reorder prevailing_def_ironly
>     +  Visibility: externally_visible semantic_interposition no_reorder public weak comdat comdat_group:_ZN1VILi0EEC5Ev one_only
>     +  Same comdat group as: _ZN1VILi0EEC1Ev/2
>        References:
>        Referring: _ZN1VILi0EEC1Ev/2 (alias)
>        Read from file: pr117010-1_.o
>     -  Availability: local
>     +  Availability: available
>        Unit id: 1
>     -  Function flags: local
>     +  Function flags:
>        Called by:
>        Calls:
>
> ..., and we get the expected 'diff' for the GCN offloading code,
> '[...].xamdgcn-amdhsa.mkoffload.2.s' (and similar for the nvptx offloading
> code, 'pr117010-1_.xnvptx-none.mkoffload.s'):
>
>     +       .section .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>             .align 2
>     +       .weak   _ZN1VILi0EEC2Ev
>             .type   _ZN1VILi0EEC2Ev,@function
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>     +       .weak   _ZN1VILi0EEC1Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
>
> Now, so much for the mechanics.  What this means semantically: whether
> 'in_lto_p' should vs. shouldn't actually be set for offloading compilation,
> I/we have to spend more thought on, whether all these
> transformations/optimizations guarded by 'in_lto_p' are generally applicable
> to offloading compilation or not?

That shall be the topic of this new PR here.
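
For readers without bug 117010 at hand: the symbols discussed above demangle to 'V<0>::V()' ('_ZN1VILi0EEC2Ev' is the base object constructor, '_ZN1VILi0EEC1Ev' the complete object constructor) and 'foo()' ('_Z3foov'). A minimal sketch of a translation unit with that shape might look as follows. This is a hypothetical reconstruction, not the test case attached to bug 117010; the 'value' member and the 'int' return type of 'foo' are assumptions added only so that the result can be checked, and the OpenMP/OpenACC directives needed to actually offload 'foo' are left out.

```cpp
// Hypothetical reduced example; NOT the actual test case from bug 117010.
#include <cassert>

template <int N>
struct V
{
  int value;  // assumption: added only to have something observable

  // Defined in-class, hence implicitly inline.  For V<0>, the Itanium C++
  // ABI names the constructor variants _ZN1VILi0EEC2Ev (base object) and
  // _ZN1VILi0EEC1Ev (complete object).
  V () : value (N) {}
};

// Mangles to _Z3foov; instantiates and calls V<0>::V().
int
foo ()
{
  V<0> v;
  assert (v.value == 0);
  return v.value;
}
```

Compiling such a file with 'g++ -O0 -S' on an ELF host should show the weak, comdat definitions quoted above for the non-LTO case; whether the strong variants then appear depends on the '-flto' or offloading setup described in the quoted comments.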