Hi! Jakub, would you please provide guidance?
Elsewhere, I wrote: || I'm working on implementing (some) C++ standard library support for code || offloading in GCC, and ran into the following issue: per || <https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables>, || "variables declared at block scope with the specifier 'static' [...] have || static [...] storage duration but are initialized the first time control || passes through their declaration". || || To implement "initialized the first time [...]" in a multi-threaded || context, compilers typically use a guard variable and locking call to a || compiler-internal C++ support library function ('__cxa_guard_acquire'). || (..., which in GCC, you may disable with '-fno-threadsafe-statics', for || that matter.) || || In GCC, all this appears to work fine for multi-threaded host-side || (non-offladed) OpenMP 'parallel', for example. However, I'm now curious || about the OpenMP 'target' offloading case; minimal example: || || struct S || { || S() { } || ~S() { } || }; || || static void f() || { || // <https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables> || static S s; || } || || int main() || { || #pragma omp target || { || f(); || } || } || || (Everything other than 'main' is meant to be implicitly OpenMP || 'declare target'ed here.) On 2023-11-20T19:13:23+0100, Jakub Jelinek <ja...@redhat.com> wrote: > On Mon, Nov 20, 2023 at 06:43:47PM +0100, Thomas Schwinge wrote: >> Current GCC fails: >> >> error: variable ‘_ZGVZL1fvE1s’ has been referenced in offloaded code but >> hasn’t been marked to be included in the offloaded code >> >> ... with: >> >> $ c++filt _ZGVZL1fvE1s >> guard variable for f()::s >> >> That may "simply" be a bug to fix in GCC. The conclusion was: yes. >> (Something like implicitly >> creating respective guard variables on the device, I suppose.) > > Yeah, I believe we should in the omp_discover_* sub-pass handle with > a help of a langhook automatically mark the guard variables (possibly > iff the guarded variable is marked?), Looking at 'gcc/omp-offload.cc:omp_discover_implicit_declare_target' left me confused how that would be the code that marks up 'static' variables as implicit 'omp declare target'. Working through a simple POD example (say, 's%static S s%static int i') it turns out, indeed that's not where that is happending, but instead 'gcc/gimplify.cc:gimplify_bind_expr' is the place: [...] for (t = BIND_EXPR_VARS (bind_expr); t ; t = DECL_CHAIN (t)) [...] /* Static locals inside of target construct or offloaded routines need to be "omp declare target". */ if (TREE_STATIC (t)) for (; ctx; ctx = ctx->outer_context) if ((ctx->region_type & ORT_TARGET) != 0) { if (!lookup_attribute ("omp declare target", DECL_ATTRIBUTES (t))) { tree id = get_identifier ("omp declare target"); DECL_ATTRIBUTES (t) = tree_cons (id, NULL_TREE, DECL_ATTRIBUTES (t)); varpool_node *node = varpool_node::get (t); if (node) { node->offloadable = 1; if (ENABLE_OFFLOADING && !DECL_EXTERNAL (t)) { g->have_offload = true; if (!in_lto_p) vec_safe_push (offload_vars, t); } } } break; [...] You (Jakub) added that in commit 211b7533bff68e5dd72e7d75249f470101759d6d (Subversion r272322) "Make static vars inside of target regions or declare target routines implicitly declare target to (PR middle-end/90779)". Now, the problem why that existing code doesn't trigger for C++ guard variables is that those are not in 'BIND_EXPR_VARS', due to C++ front end use of 'pushdecl_top_level_and_finish'. If I change the C++ front end as follows (WIP; not reviewed in detail): --- gcc/cp/decl2.cc +++ gcc/cp/decl2.cc @@ -3576,5 +3576,6 @@ get_guard (tree decl) DECL_IGNORED_P (guard) = 1; TREE_USED (guard) = 1; - pushdecl_top_level_and_finish (guard, NULL_TREE); + pushdecl (guard); + cp_finish_decl (guard, NULL_TREE, false, NULL_TREE, 0); } return guard; ..., then we do get the expected behavior: --- a-r.cc.006t.gimple 2023-12-07 13:27:36.254963406 +0100 +++ a-r.cc.006t.gimple 2023-12-07 14:10:39.352107107 +0100 @@ -5,6 +5,7 @@ bool retval.1; bool D.2966; static struct S s1; + static long long int _ZGVZL2f1vE2s1; gimple_call <__atomic_load_1, _1, &_ZGVZL2f1vE2s1, 2> gimple_assign <eq_expr, retval.0, _1, 0, NULL> ..., and offloading compilation works down to the expected next issue: ld: error: undefined symbol: __cxa_guard_acquire >>> referenced by /tmp/ccAVyZpc.o:(f1()) [...] collect2: error: ld returned 1 exit status gcn mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status [...] However: 'pushdecl_top_level_and_finish' has been used there "forever", and I currently have no clue at all whether changing that into 'pushdecl' would be acceptable, what effects that'd have elsewhere. That said... Couldn't we indeed move this gimplification-level code re 'Static locals [...] need to be "omp declare target"' into 'gcc/omp-offload.cc:omp_discover_implicit_declare_target'? First thought: 'gcc/omp-offload.cc:omp_discover_declare_target_tgt_fn_r' extended so that for each 'VAR_DECL' that is 'TREE_STATIC', we mark it 'omp declare target'. (That'll need some additional conditions, but you get the idea.) This way, we're not restricted to only 'static's in the current bind/block, but would also catch top-level things like C++ guard variables (without requiring any C++ front end changes). I suppose I'd first exclude all 'DECL_ARTIFICIAL' ones, and we then may gradually enable those, as we add test cases and handling as necessary: > or e.g. rtti info (_ZTS*, _ZTI*) > and eventually figure out what we should do about virtual tables (_ZTV*). > The last case is most complicated, as it contains function pointers, and we > need to figure out if we mark all methods, or say replace some pointers in > the virtual table with NULLs or something that errors or terminates if it > isn't marked. All those I plan to defer, for now. > And sure, __cxa_guard_* would need to be implemented in the offloading > libsupc++.a or libstdc++.a. Until proper libstdc++/libsupc++ support emerges (I'm working on it...), my idea was to add a temporary 'libgomp/config/accel/*.c' implementation (based on 'libstdc++-v3/libsupc++/guard.cc'). Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955