https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90811
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #3)
> Thanks. The change in the attached patch looks good to me, but I must admit
> I don't see how the testcase triggers the problem. Basically, it's not
> obvious how the controlling if condition becomes true:
>
> >   if (!CONST_INT_P (size) || UINTVAL (align) > GET_MODE_SIZE (DImode))
>
> (I don't expect that loop body to have variable-sized or over-aligned
> objects)

From the dumps it looks like overaligned objects.
Debugging this in the system compiler, as I don't have nvptx offloading enabled
build of current trunk.
lower_rec_simd_input_clauses is called for d with:
 <var_decl 0x7ffff7ffbd80 d
    type <array_type 0x7fffeaab5f18
        type <integer_type 0x7fffeaab5c78 long long int readonly sizes-gimplified DI
            size <integer_cst 0x7fffea976cf0 constant 64>
            unit-size <integer_cst 0x7fffea976d08 constant 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffeaab5c78 precision:64 min <integer_cst 0x7fffea976fa8 -9223372036854775808> max <integer_cst 0x7fffea976fd8 9223372036854775807>>
        sizes-gimplified BLK
        size <integer_cst 0x7fffeaab8288 constant 320>
        unit-size <integer_cst 0x7fffeaab85a0 constant 40>
        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffeaab5f18
        domain <integer_type 0x7fffeaab5e70 type <integer_type 0x7fffea98e000 sizetype>
            sizes-gimplified DI size <integer_cst 0x7fffea976cf0 64> unit-size <integer_cst 0x7fffea976d08 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7fffeaab5e70 precision:64 min <integer_cst 0x7fffea976d20 0> max <integer_cst 0x7fffea976f48 4>>>
    readonly addressable used read BLK pr90811.c:15:23 size <integer_cst 0x7fffeaab8288 320> unit-size <integer_cst 0x7fffeaab85a0 40>
    align:64 warn_if_not_align:0 context <function_decl 0x7fffeaab7700 main>>
so that var only needs 64-bit alignment.
For non-simt, when we create
  tree atype = build_array_type_nelts (TREE_TYPE (new_var), sctx->max_vf);
  tree avar = create_tmp_var_raw (atype);
I'm quite sure that this code in i386.c (ix86_data_alignment):
  /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
     to 16byte boundary.  */
  if (TARGET_64BIT)
    {
      if ((opt ? AGGREGATE_TYPE_P (type) : TREE_CODE (type) == ARRAY_TYPE)
          && TYPE_SIZE (type)
          && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
          && wi::geu_p (wi::to_wide (TYPE_SIZE (type)), 128)
          && align < 128)
        return 128;
    }
kicks in in that case; I'm not sure where we create a temporary with ARRAY_TYPE
in the simt case.
Might be nice to make sure we don't add excessive alignment to those
temporaries, through DECL_USER_ALIGN and copying the alignment from the type or
something similar, unless the user variable is already DECL_USER_ALIGN.
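
To make the suspected chain concrete, here is a minimal standalone sketch of the arithmetic. It is not GCC code: sketch_data_alignment, sketch_exceeds_dimode and sketch_self_check are hypothetical names, the first models only the quoted ix86_data_alignment snippet for TARGET_64BIT with opt false, and the second assumes that the align operand in the quoted if condition is in bytes and that GET_MODE_SIZE (DImode) is 8.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the quoted ix86_data_alignment rule
   (TARGET_64BIT, opt == false).  Sizes and alignments are in bits.  */
static unsigned
sketch_data_alignment (bool is_array, unsigned long size_bits,
                       unsigned align_bits)
{
  /* Arrays of at least 128 bits (16 bytes) get bumped to 128-bit alignment.  */
  if (is_array && size_bits >= 128 && align_bits < 128)
    return 128;
  return align_bits;
}

/* Hypothetical model of the alignment half of the quoted condition,
   ASSUMING align is in bytes and GET_MODE_SIZE (DImode) == 8.  */
static bool
sketch_exceeds_dimode (unsigned align_bits)
{
  return align_bits / 8 > 8;
}

/* Walk through the case from the dump: a 320-bit array with align:64.  */
static void
sketch_self_check (void)
{
  unsigned bumped = sketch_data_alignment (true, 320, 64);
  assert (bumped == 128);                 /* 64-bit alignment becomes 128 */
  assert (sketch_exceeds_dimode (bumped)); /* 16 bytes > 8 bytes */
  assert (!sketch_exceeds_dimode (64));    /* the original 8 bytes would not */
}
```

Under those assumptions, an array temporary of 320 bits or more that starts at align:64 would end up with 16-byte alignment, which is enough to make the alignment test in the quoted condition true even though d itself only needs 64-bit alignment.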