This is a backport of Richard's fix for PR118320 to GCC 14 (squashed together with my follow-up tweak of the dump message). Pushed to releases/gcc-14 after testing on aarch64-linux-gnu.
Thanks, Alex -- >8 -- As Andrew says in the bugzilla comments, this PR is about a case where we tried to fuse two stores of x0, one in which x0 was defined and one in which it was undefined. merge_access_arrays failed on the conflict, but the failure wasn't caught. Normally the hazard detection code would fail if the instructions had incompatible uses. However, an undefined use doesn't impose many restrictions on movements. I think this is likely to be the only case where hazard detection isn't enough. As Andrew notes in bugzilla, it might be possible to allow uses of defined and undefined values to be merged to the defined value. But that sounds dangerous in the general case, as an rtl-ssa-level decision. We might run the risk of turning conditional UB into unconditional UB. And LLVM proves that the definition of "undef" isn't simple. This is a backport of r15-7282-gf559ac896942ffe0e2315d0a4d8b8b517a16d607. gcc/ PR rtl-optimization/118320 * config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair): Commonize the merge of input_uses and return early if it fails. gcc/testsuite/ PR rtl-optimization/118320 * g++.dg/torture/pr118320.C: New test.
diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc index f32d30d54c5..1791ca5563c 100644 --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc @@ -1784,6 +1784,24 @@ ldp_bb_info::fuse_pair (bool load_p, input_uses[i] = remove_note_accesses (attempt, input_uses[i]); } + // Get the uses that the pair instruction would have, and punt if + // the unpaired instructions use different definitions of the same + // register. That would normally be caught as a side-effect of + // hazard detection below, but this check also deals with cases + // in which one use is undefined and the other isn't. + auto new_uses = merge_access_arrays (attempt, + drop_memory_access (input_uses[0]), + drop_memory_access (input_uses[1])); + if (!new_uses.is_valid ()) + { + if (dump_file) + fprintf (dump_file, + " rejecting pair: i%d and i%d use different definiitions of" + " the same register\n", + insns[0]->uid (), insns[1]->uid ()); + return false; + } + // Edge case: if the first insn is a writeback load and the // second insn is a non-writeback load which transfers into the base // register, then we should drop the writeback altogether as the @@ -1911,10 +1929,7 @@ ldp_bb_info::fuse_pair (bool load_p, input_defs[1]); gcc_assert (pair_change->new_defs.is_valid ()); - pair_change->new_uses - = merge_access_arrays (attempt, - drop_memory_access (input_uses[0]), - drop_memory_access (input_uses[1])); + pair_change->new_uses = new_uses; gcc_assert (pair_change->new_uses.is_valid ()); set_pair_pat (pair_change); } @@ -1936,9 +1951,7 @@ ldp_bb_info::fuse_pair (bool load_p, case Action::CHANGE: { set_pair_pat (change); - change->new_uses = merge_access_arrays (attempt, - input_uses[0], - input_uses[1]); + change->new_uses = new_uses; auto d1 = drop_memory_access (input_defs[0]); auto d2 = drop_memory_access (input_defs[1]); change->new_defs = merge_access_arrays (attempt, d1, d2); @@ -1970,9 +1983,7 @@ ldp_bb_info::fuse_pair (bool load_p, auto new_insn = crtl->ssa->create_insn (attempt, INSN, pair_pat); change = make_change (new_insn); change->move_range = move_range; - change->new_uses = merge_access_arrays (attempt, - input_uses[0], - input_uses[1]); + change->new_uses = new_uses; gcc_assert (change->new_uses.is_valid ()); auto d1 = drop_memory_access (input_defs[0]); diff --git a/gcc/testsuite/g++.dg/torture/pr118320.C b/gcc/testsuite/g++.dg/torture/pr118320.C new file mode 100644 index 00000000000..228d79860b5 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr118320.C @@ -0,0 +1,15 @@ +// { dg-additional-options "-fno-tree-sra" } */ + +struct T{ + long f[2]; +}; +void f1(int); +void g(T&); +void f1() +{ + T a; + a.~T(); // To force the clobber + a.f[1] = 1; + T b = a; + g(b); +}