https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119835
Thomas Schwinge <tschwinge at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2025-05-20 00:00:00         |2025-5-21
           Keywords|                            |ABI
            Summary|GCN, nvptx offloading:      |GCN, nvptx offloading: ICE
                   |'libgomp.c++/pr106445-1.C'  |'during GIMPLE pass: nrv'
                   |with '-fno-inline': ICE     |in "return value
                   |'during GIMPLE pass: nrv'   |optimizations for functions
                   |                            |which return aggregate
                   |                            |types"

--- Comment #2 from Thomas Schwinge <tschwinge at gcc dot gnu.org> ---
So, per my current understanding, this is another host vs. offload target
ABI-related issue...

For example, also for C (not C++!) code:

    typedef struct {int a;} sint;
    static sint rsint(void)
    {
      sint t;
      t.a = sizeof t;
      return t;
    }

..., on the x86_64 host, 'gcc/gimplify.cc:gimplify_return_expr', we get:

    <result_decl 0x7ffff78dc000 D.2978
        type <record_type 0x7ffff78b92a0 sint sizes-gimplified SI
            size <integer_cst 0x7ffff770d1b0 constant 32>
            unit-size <integer_cst 0x7ffff770d1c8 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff78b9150
            fields <field_decl 0x7ffff771a8c0 a type <integer_type 0x7ffff770b5e8 int>
                SI source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:10:21 size <integer_cst 0x7ffff770d1b0 32> unit-size <integer_cst 0x7ffff770d1c8 4>
                align:32 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 1
                offset <integer_cst 0x7ffff76ebf90 constant 0>
                bit-offset <integer_cst 0x7ffff76ebfd8 constant 0> context <record_type 0x7ffff78b9150>> chain <type_decl 0x7ffff771a820 D.2973>>
        ignored SI source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:45:13 size <integer_cst 0x7ffff770d1b0 32> unit-size <integer_cst 0x7ffff770d1c8 4>
        align:32 warn_if_not_align:0 context <function_decl 0x7ffff78bf200 rsint>>

..., and 'aggregate_value_p (result_decl, TREE_TYPE (current_function_decl))'
then does 'return false;' (at end of function).
In particular:

    2132	  if (targetm.calls.return_in_memory (type, fntype))

... does not 'return true;'.

Therefore, 'gimplify_return_expr' then does:

    1934	    result = create_tmp_reg (TREE_TYPE (result_decl));
    [...]

..., and does not do the "If aggregate_value_p is true, then we can return the
bare RESULT_DECL." thing where it would set: 'result = result_decl;'.

(Similarly, presumably but not verified, for ppc64le host.)

However then in nvptx offloading compilation, we get:

    <result_decl 0x7ffff77df380 D.1771
        type <record_type 0x7ffff77b0a80 sint BLK
            size <integer_cst 0x7ffff76eb6a8 constant 32>
            unit-size <integer_cst 0x7ffff76eb6c0 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff77b07e0
            fields <field_decl 0x7ffff76fed20 a type <integer_type 0x7ffff76f45e8 int>
                SI source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:10:21 size <integer_cst 0x7ffff76eb6a8 32> unit-size <integer_cst 0x7ffff76eb6c0 4>
                align:32 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 1
                offset <integer_cst 0x7ffff76eb498 constant 0>
                bit-offset <integer_cst 0x7ffff76eb4e0 constant 0> context <record_type 0x7ffff77b07e0>>>
        ignored BLK source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:45:13 size <integer_cst 0x7ffff76eb6a8 32> unit-size <integer_cst 0x7ffff76eb6c0 4>
        align:32 warn_if_not_align:0 context <function_decl 0x7ffff77e0000 rsint>>

..., and with that 'aggregate_value_p' runs into:

    2132	  if (targetm.calls.return_in_memory (type, fntype))
    (gdb)
    2133	    return true;

... because:

    pass_in_memory (mode=E_BLKmode, type=type@entry=0x7ffff77b0a80, for_return=for_return@entry=true) at ../../source-gcc/gcc/config/nvptx/nvptx.cc:658
    658	{
    (gdb) n
    659	  if (type)
    (gdb)
    661	      if (AGGREGATE_TYPE_P (type))
    (gdb)
    662		return true;

This is probably not surprising given 'BLK'?

(Similarly, presumably but not verified, for GCN offload target.)
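
The host vs. offload target divergence described above can be reduced to a toy model. This is only a sketch and not GCC code: 'toy_type', 'toy_mode', the two 'toy_*_return_in_memory' policies and 'toy_aggregate_value_p' are hypothetical names invented for illustration. The host policy is an assumed simplification (the real x86_64 hook is considerably more involved), and the offload policy merely mirrors the 'AGGREGATE_TYPE_P' check visible in the gdb trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT GCC's tree or machine_mode types.  */
enum toy_mode { TOY_SImode, TOY_BLKmode };

struct toy_type
{
  bool is_aggregate;   /* Models AGGREGATE_TYPE_P (type).  */
  enum toy_mode mode;  /* Models the mode shown in the dumps (SI vs. BLK).  */
};

typedef bool (*toy_return_in_memory_fn) (const struct toy_type *);

/* Assumed host-like policy: only BLKmode values are returned in memory.  */
static bool
toy_host_return_in_memory (const struct toy_type *type)
{
  return type->mode == TOY_BLKmode;
}

/* Offload-like policy, following the gdb trace: any aggregate type is
   returned in memory.  */
static bool
toy_offload_return_in_memory (const struct toy_type *type)
{
  return type->is_aggregate;
}

/* Reduced model of 'aggregate_value_p': ask the target hook, otherwise
   fall through to 'return false;' at the end of the function.  */
static bool
toy_aggregate_value_p (const struct toy_type *type,
                       toy_return_in_memory_fn return_in_memory)
{
  assert (type != NULL && return_in_memory != NULL);
  if (return_in_memory (type))
    return true;
  return false;
}
```

Feeding the model a 'sint'-like record as seen on the host ('SI') and as seen in the offload compiler ('BLK') gives 'false' for the former and 'true' for the latter, which is the disagreement the dumps show.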
This 'return true;' from 'aggregate_value_p' then enables the
'gcc/tree-nrv.cc:pass_nrv::execute' processing, which then fails in:

    tree result = DECL_RESULT (current_function_decl);
    [...]
          if (greturn *return_stmt = dyn_cast <greturn *> (stmt))
            {
              /* In a function with an aggregate return value, the
                 gimplifier has changed all non-empty RETURN_EXPRs to
                 return the RESULT_DECL.  */
              ret_val = gimple_return_retval (return_stmt);
              if (ret_val)
                gcc_assert (ret_val == result);
            }

..., as the host code has not been set up conforming to that.

Any good suggestion about how to address this mismatch?

I don't understand enough of the GIMPLE/'DECL_RESULT' semantics to tell if we
could just disable this 'assert' here, and the pass would still be doing
correct transformations, or if that'd then potentially result in wrong code?

Per a quick instrumented run on powerpc64le-unknown-linux-gnu, for
'check-gcc-{c,c++,fortran}' and 'check-target' (for host, not offloading), this
pass is doing any code transformations only for exactly 24 (only!?) test cases,
so one option might be to just disable it for offloading compilation... (Yuck!)

But then, there's the more general problem of 'aggregate_value_p' potentially
returning different things for host vs. offload target -- somewhat similar to
what we had seen in PR120308 "'TYPE_EMPTY_P' vs. code offloading"...

Help?
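
For readers following along, the interaction between the gimplifier's choice and the 'nrv' pass check can be sketched as a small state model. Everything here is hypothetical and heavily simplified: 'toy_gimplify_return', 'toy_nrv_check' and the two enums are invented names, and the real 'gimplify_return_expr' and 'pass_nrv::execute' consider more conditions than the single boolean used here.

```c
#include <stdbool.h>

/* What the return statement carries after gimplification (toy model).  */
enum toy_retval { TOY_RET_TEMPORARY, TOY_RET_RESULT_DECL };

/* What happens when the modeled 'nrv' check inspects that statement.  */
enum toy_nrv_outcome { TOY_NRV_SKIPPED, TOY_NRV_OK, TOY_NRV_ASSERT_FAILS };

/* Gimplification: return the bare RESULT_DECL only if aggregate_value_p
   was true at that time, otherwise return a temporary.  */
static enum toy_retval
toy_gimplify_return (bool aggregate_value_p_at_gimplify)
{
  return aggregate_value_p_at_gimplify ? TOY_RET_RESULT_DECL
                                       : TOY_RET_TEMPORARY;
}

/* 'nrv' check: only runs if aggregate_value_p is true when the pass
   executes, and then expects the return value to be the RESULT_DECL.  */
static enum toy_nrv_outcome
toy_nrv_check (bool aggregate_value_p_at_nrv, enum toy_retval ret_val)
{
  if (!aggregate_value_p_at_nrv)
    return TOY_NRV_SKIPPED;
  return ret_val == TOY_RET_RESULT_DECL ? TOY_NRV_OK : TOY_NRV_ASSERT_FAILS;
}
```

In this model, the failing configuration is the mixed one: gimplified where the predicate is false (host), then checked where it is true (offload target), that is 'toy_nrv_check (true, toy_gimplify_return (false))'. The two consistent configurations either skip the check or pass it.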