https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121527
Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Keywords| |missed-optimization CC| |mjambor at suse dot cz --- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> --- (In reply to Eric Botcazou from comment #11) > The missing stores are visible in the .optimized file wrt a compiler without > the problematic change (r16-3156-g5294840e3c7bf9) spotted by Sam. More specifically, I see a difference in c391002_4.adb.272t.optimized: @@ -1001,17 +1001,15 @@ <bb 4> [local count: 1073312328]: VIEW_CONVERT_EXPR<struct c391002_4__creator__S1b>(A11b) = VIEW_CONVERT_EXPR<struct c391002_4__creator__S1b>(*plugs_6(D)); _2 = VIEW_CONVERT_EXPR<struct c391002_4__creator__S1b>(*plugs_6(D))._parent.band; - A11b._parent._parent.band = _2; _3 = VIEW_CONVERT_EXPR<struct c391002_4__creator__S1b>(*plugs_6(D)).the_command; - A11b._parent.the_command = _3; A11b.tc_mc = -1; where this is redundant store removal done in FRE1: Value numbering store A11b._parent._parent.band to _3 Setting value number of .MEM_15 to .MEM_15 (changed) +Deleted redundant store A11b._parent._parent.band = _3; ... Value numbering store A11b._parent.the_command to _4 Setting value number of .MEM_16 to .MEM_16 (changed) +Deleted redundant store A11b._parent.the_command = _4; The apparent redundancy is visible above, we aggregate copy *plugs_6 to A11b and then copy separately a field of it. The stored value is of 1 byte size enumeral type. The store offset is 8 bytes to A11b, the load offset as well. The aggregate copy is done using TImode load/stores, the remaining loads to _2/_3 are QImode. I fail to see how there's anything wrong. Or are those not the stores "missing"? In the assembly I also see the stores to A11b (which is on the stack) and the corresponding load feeding the compares. In p.adb I see MEM <void (*ada__tags__prim_ptr) (void)[1:1] * {ref-all}> [(struct p__T43b * {ref-all})&A50b] = &MEM <struct c391002_3__Telectronics_moduleTS> [(void *)&c391002_3__electronics_moduleT + 32B]; - MEM <unsigned char> [(struct p__T43b * {ref-all})&A50b + 8B] = 2; MEM <integer> [(struct p__T43b * {ref-all})&A50b + 52B] = 66; VIEW_CONVERT_EXPR<struct p__T43b>(A63b) = MEM[(struct p__T43b * {ref-all})&A50b]; - A63b._parent._parent.band = 2; - A63b._parent.the_command = 0; the latter two look OK at the time they happen. The 2nd is difficult to track down. SRA seems to do quite stupid things around aggregate copies like VIEW_CONVERT_EXPR<struct p__B51b__T52b>(A50b) = A53b; where it does scalarize A50b, but before such V_C_E accesses restores A50b and afterwards re-loads all components. Disabling (non-early) SRA for p.adb resolves the issue and the A63b stores above are still gone (but there's a lot of difference otherwise). The late SRA beavior with/without the patch differs substantially. This definitely needs more analysis - there's no FRE after late SRA, so for now I think it's an error in SRA (apart from the fact it doing stupid things around such aggregate copies).