https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91869
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-09-24
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  One issue is with the use of volatile in the testcase which on
GIMPLE forces an intermediate init of an aggregate that is then
volatile-copied to the destination.  If you remove the volatile
qualifications the code generation improves but we still see

main ()
{
  <bb 2> [local count: 1073741825]:
  MEM[(struct Reg_T *)&Reg_0] = 0;
  MEM[(struct Reg_T *)&Reg_1] = 64;
  MEM[(struct Reg_T *)&Reg_2] = 8;
  Reg_3 = *.LC0;
  MEM[(struct Reg_T *)&Reg_4] = 4;
  Reg_5 = *.LC1;
  Reg_6 = *.LC2;
  Reg_7 = *.LC3;
  Reg_A = 0;
  Reg_B = 72;
  Reg_C = 255;
  return 0;

thus 1-byte constant pool entries being used:

        movzbl  .LC0(%rip), %eax
...

on the GIMPLE level this isn't cleaned up because of the aggregate-ness
(plus the constructor involving bitfields and us being lazy and giving
up on native-interpreting those in the constant folding code - still we
have code to deal with this in ctor emit code).

The cases with *.LCN uses come from the gimplifier heuristic when there's
more than one non-zero initializer:

  Reg_0.a = 0;
  Reg_0.b = 0;
  Reg_0.c = 0;
  Reg_1.a = 0;
  Reg_1.b = 0;
  Reg_1.c = 4;
  Reg_2.a = 0;
  Reg_2.b = 1;
  Reg_2.c = 0;
  Reg_3 = *.LC0;
  Reg_4.a = 4;
  Reg_4.b = 0;
  Reg_4.c = 0;
  Reg_5 = *.LC1;
  Reg_6 = *.LC2;
  Reg_7 = *.LC3;
  Reg_A = 0;
  Reg_B = 72;
  Reg_C = 255;

the heuristics are a bit odd here given we don't use pre-init and in the
other cases and thus don't save anything?!

Note I'd rather have the gimplifier use *.LCN aggregate assigns always
and leave the optimization to optimizations (which we obviously have to
improve as can be seen here).

The immediate "refactoring" possible is trying to unify the ctor emission
code in varasm.c and the native_encode stuff.
Then it could be SRA's job to optimally scalarize the aggregate init.
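
For reference, the single-byte constants in the dump above (0, 64, 8, 4) are what you get from natively encoding the per-field initializers into one byte. A minimal sketch in C, assuming a layout that is only inferred from those constants and not taken from the original testcase: a in bits 0-2, b in bit 3, c in bits 4-7. The names encode_reg and check_dump_constants are made up for illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs the three fields of Reg_T into one byte.
   Assumed layout (inferred from the dump constants, not from the
   testcase): a = bits 0-2, b = bit 3, c = bits 4-7. */
static uint8_t encode_reg(unsigned a, unsigned b, unsigned c)
{
    return (uint8_t)((a & 0x7u) | ((b & 0x1u) << 3) | ((c & 0xFu) << 4));
}

/* Compares the encoding with the single-byte MEM[...] stores in the dump. */
static void check_dump_constants(void)
{
    assert(encode_reg(0, 0, 0) == 0);   /* Reg_0: all fields zero */
    assert(encode_reg(0, 0, 4) == 64);  /* Reg_1: c = 4 */
    assert(encode_reg(0, 1, 0) == 8);   /* Reg_2: b = 1 */
    assert(encode_reg(4, 0, 0) == 4);   /* Reg_4: a = 4 */
}
```

Calling check_dump_constants() from a test driver should pass silently under the assumed layout. The same kind of folding would presumably turn each *.LCN load into an immediate store as well, once the constructor with more than one non-zero field can be native-encoded.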