https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113727
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, so the mistake happens in 135.sra which does

   <bb 2> [local count: 178992760]:
-  as.f3 = 5;
+  as$f3_6 = 5;

   <bb 3> [local count: 894749064]:
   # y_24 = PHI <y_14(5), 0(2)>
   # as_27 = PHI <as_12(5), 169(2)>
+  # as$f3_8 = PHI <as$f3_4(5), as$f3_6(2)>
   _1 = as_27 & 31;
   if (_1 != 0)
     goto <bb 5>; [50.00%]
@@ -26,12 +39,12 @@
     goto <bb 4>; [50.00%]

   <bb 4> [local count: 447374532]:
-  cstore_19 = MEM <struct f> [(void *)&as].f3;
+  cstore_19 = as$f3_8;

   <bb 5> [local count: 894749064]:
   # as_12 = PHI <as_27(4), 66(3)>
   # cstore_20 = PHI <cstore_19(4), 154(3)>
-  MEM <struct f> [(void *)&as].f3 = cstore_20;
+  as$f3_4 = cstore_20;
   y_14 = y_24 + 1;
   if (y_14 <= 4)
     goto <bb 3>; [80.00%]
@@ -41,8 +54,12 @@
   <bb 6> [local count: 178992760]:
   # as_28 = PHI <as_12(5)>
   BIT_FIELD_REF <as, 8, 0> = as_28;
+  as$f3_22 = as.f3;
+  as.f3 = as$f3_22;
   aq1 = as;

Note how we elide as.f3 but in BB6 fail to process the BIT_FIELD_REF, and
then re-materialize as.f3 as if 'as' were fully stored to by the
BIT_FIELD_REF.  The BIT_FIELD_REF should have triggered re-materialization
before it.

Upon handling

  BIT_FIELD_REF <as, 8, 0> = as_28;

we create the re-load of as.f3, but as said we fail to re-materialize 'as'
before it from the replacement.  For the following aggregate copy we run into

      if (access_has_children_p (lacc)
          && access_has_children_p (racc)
          /* When an access represents an unscalarizable region, it usually
             represents accesses with variable offset and thus must not be used
             to generate new memory accesses.  */
          && !lacc->grp_unscalarizable_region
          && !racc->grp_unscalarizable_region)
        {
          struct subreplacement_assignment_data sad;

          sad.left_offset = lacc->offset;
          sad.assignment_lhs = lhs;
          sad.assignment_rhs = rhs;
          sad.top_racc = racc;
          sad.old_gsi = *gsi;
          sad.new_gsi = gsi;
          sad.loc = gimple_location (stmt);
          sad.refreshed = SRA_UDH_NONE;

          if (lacc->grp_read && !lacc->grp_covered)
            handle_unscalarized_data_in_subtree (&sad);

which I think is a similar situation in that the BIT_FIELD_REF on the LHS
overlaps with replacements and is a RMW operation.  I think SRA simply
assumes that any non-aggregate copy will never partially invalidate
replacements?

I'm not sure how BIT_FIELD_REF was handled (and worked) before my change;
we record the whole variable as the access for the BIT_FIELD_REF write (but
with ->grp_partial_lhs set).  But we do not look at grp_partial_lhs when
analyzing for overlaps.

The following fixes this, but a "better" change would be to record the
proper extent, including the BIT_FIELD_REF, even for LHS?  Before my RHS
handling change we likely always produced a replacement for the
BIT_FIELD_REF base and kept the BIT_FIELD_REFs around, correct?

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index f8e71ec48b9..848bb8b89e0 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -2269,6 +2269,11 @@ sort_and_splice_var_accesses (tree var)
                && TREE_CODE (access->expr) == COMPONENT_REF
                && DECL_BIT_FIELD (TREE_OPERAND (access->expr, 1)));

+      /* When there is a partial LHS involved we have no way to see what it
+         accesses, so if it's not the only access we have to fail.  */
+      if (access->grp_partial_lhs && access_count != 1)
+        return NULL;
+
       if (first || access->offset >= high)
         {
           first = false;
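
For illustration only, the ordering problem in BB6 can be modelled by hand
in plain C.  This is a rough sketch, not the PR testcase and not SRA code:
'struct agg', its layout, and both function names are hypothetical, and the
one-byte memcpy merely stands in for the BIT_FIELD_REF <as, 8, 0> store.
The local 'as_f3' plays the role of the scalar replacement as$f3.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-field aggregate; not the struct from the PR.  */
struct agg { unsigned char low; unsigned char f3; };

/* Models the bad sequence: the replacement holding f3 is reloaded from
   memory after the partial store without ever having been flushed, so
   the stale in-memory value overwrites the live one.  */
static struct agg
copy_without_flush (unsigned char new_low)
{
  struct agg as = { 0, 0 };
  unsigned char as_f3 = 5;      /* 'as.f3 = 5' became a scalar store */
  memcpy (&as, &new_low, 1);    /* partial store to the first 8 bits */
  as_f3 = as.f3;                /* reload reads stale memory */
  as.f3 = as_f3;
  return as;                    /* models 'aq1 = as' */
}

/* Models the expected sequence: flush the replacement to memory before
   the partial store, then reload it afterwards.  */
static struct agg
copy_with_flush (unsigned char new_low)
{
  struct agg as = { 0, 0 };
  unsigned char as_f3 = 5;
  as.f3 = as_f3;                /* re-materialize before the partial store */
  memcpy (&as, &new_low, 1);
  as_f3 = as.f3;
  as.f3 = as_f3;
  return as;
}
```

In this model copy_without_flush (66) returns an aggregate whose f3 is 0,
while copy_with_flush (66) keeps f3 == 5 and sets low to 66.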