https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87008
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So on trunk ESRA does some pointless stuff at least:

   <bb 2> :
   ISRA.2 = MEM[(const struct A &)&x];
+  SR.7_4 = MEM[(struct A *)&ISRA.2];
+  SR.8_5 = MEM[(struct A *)&ISRA.2 + 8B];
+  MEM[(struct A *)&ISRA.2] = SR.7_4;
+  MEM[(struct A *)&ISRA.2 + 8B] = SR.8_5;
   MEM[(struct A *)&y] = ISRA.2;
-  ISRA.2 ={v} {CLOBBER};
+  y$D2304$a_17 = MEM[(struct A *)&y];
   ISRA.2 = MEM[(const struct A &)&x];
+  SR.5_20 = MEM[(struct A *)&ISRA.2];
+  SR.6_21 = MEM[(struct A *)&ISRA.2 + 8B];
+  MEM[(struct A *)&ISRA.2] = SR.5_20;
+  MEM[(struct A *)&ISRA.2 + 8B] = SR.6_21;
   MEM[(struct A *)&z] = ISRA.2;
-  ISRA.2 ={v} {CLOBBER};
-  _1 = y.D.2304.a;
-  _2 = z.D.2304.a;
+  z$D2304$a_24 = MEM[(struct A *)&z];
+  _1 = y$D2304$a_17;
+  _2 = z$D2304$a_24;
   _6 = _1 - _2;

where the main grief is caused by IPA SRA which replaces

  cp<A> (&y.D.2304, &x.D.2304);
  cp<A> (&z.D.2304, &x.D.2304);
  _1 = y.D.2304.a;
  _2 = z.D.2304.a;
  _6 = _1 - _2;

with

  _Z2cpI1AEvRT_RKS1_.isra.0 (&y, MEM[(const struct A &)&x]);
  _Z2cpI1AEvRT_RKS1_.isra.0 (&z, MEM[(const struct A &)&x]);
  _1 = y.D.2304.a;
  _2 = z.D.2304.a;
  _6 = _1 - _2;

replacing the by-reference second argument by an aggregate by-value
argument. I think that's unwarranted - Martin, can you see if there's
a simple logic error that can rectify this? The same behavior is happening
when the second parameter is not declared const.

With IPA SRA disabled the IL gets nicer but then ESRA doesn't do anything
interesting anymore (but renaming stuff and uglifying/moving loads):

   MEM[(struct A *)&y] = MEM[(const struct A &)&x];
+  y$D2304$a_4 = MEM[(struct A *)&y];
   MEM[(struct A *)&z] = MEM[(const struct A &)&x];
-  _1 = y.D.2304.a;
-  _2 = z.D.2304.a;
+  z$D2304$a_5 = MEM[(struct A *)&z];
+  _1 = y$D2304$a_4;
+  _2 = z$D2304$a_5;
   _6 = _1 - _2;

also something to avoid IMHO.
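
For readers without the original testcase at hand, here is a minimal sketch of source that would produce IL of this shape. It is a reconstruction from the dumps, not the testcase attached to the PR: the names A, B, cp, x, y and z and the 16-byte layout are read off the IL, the mangled name _Z2cpI1AEvRT_RKS1_ corresponds to cp<A>(A&, A const&), and the function name f and the choice of double fields are assumptions.

```cpp
// Hypothetical reconstruction; only the shapes are taken from the dumps.
struct A { double a, b; };   // two 8-byte fields: loads at +0 and +8B
struct B : A {};             // x, y, z are of type B; D.2304 is the A base

// Instantiated as cp<A>, this is the callee that IPA SRA clones as .isra.0.
template <class T>
void cp(T &dst, const T &src) { dst = src; }   // plain aggregate copy

double f(B x) {              // the name f is an assumption
  B y; cp<A>(y, x);          // copies the A subobject of x into y
  B z; cp<A>(z, x);          // and again into z
  return y.a - z.a;          // _6 = _1 - _2
}
```

Compiling something like this with g++ -O2 -fdump-tree-esra-details, with and without -fno-ipa-sra, should let one compare the two variants discussed above; the exact SSA names and declaration UIDs will differ.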
I guess it thinks it might fully scalarize the copies but later decides not to, yet leaves the rest of the transform in place. So yes, the copy is viewed as contains_vce_or_bfcref_p because we access a variable of type B via type A. But that only matters for variables we can scalarize away - we call this for MEM[&x] only but we want to scalarize MEM[&y] = MEM[&x], thus scalarize y away. That we mark x in cannot_scalarize_away_bitmap shouldn't affect total scalarization for the aggregate copy? I wonder why build_accesses_from_assign only looks at the RHS for total scalarization and not the LHS. Well, to sum it up: we see all uses of y and z, and thus we _can_ total scalarize y and z, simply eliding the aggregate copy.
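
To illustrate the last point at source level, here is a rough sketch of what total scalarization of y and z would amount to. It is a hand-written picture under the same assumed types as the dumps (struct A with two 8-byte fields, B derived from A), not a description of what tree-sra actually emits; the function names f_copy and f_scalarized are made up for the example.

```cpp
// Assumed types, inferred from the IL; not taken from the PR's testcase.
struct A { double a, b; };
struct B : A {};

// Current shape: two aggregate copies into y and z, then field loads.
double f_copy(const B &x) {
  B y, z;
  static_cast<A &>(y) = x;   // MEM[(struct A *)&y] = MEM[(const struct A &)&x]
  static_cast<A &>(z) = x;   // MEM[(struct A *)&z] = MEM[(const struct A &)&x]
  return y.a - z.a;
}

// Hand-scalarized shape: every field of y and z becomes a scalar and the
// aggregate copies disappear. x itself stays in memory, mirroring the
// observation that x cannot be scalarized away while y and z can.
double f_scalarized(const B &x) {
  double y_a = x.a, y_b = x.b;   // replaces the copy into y
  double z_a = x.a, z_b = x.b;   // replaces the copy into z
  (void)y_b; (void)z_b;          // dead fields, trivially removable
  return y_a - z_a;
}
```

Once the copies are expressed as scalar loads from x, later passes can see that both operands of the subtraction come from the same location, which is not apparent while y and z live in memory.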