https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80770
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
           Keywords|            |missed-optimization
             Status|UNCONFIRMED |NEW
   Last reconfirmed|            |2017-05-16
     Ever confirmed|0           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  The reason is that we (have to) expand the bitfield load to

(insn 7 5 8 (set (reg:QI 93)
        (mem:QI (reg/v/f:DI 89 [ s ]) [1 *s_4(D)+0 S1 A8])) "t.c":8 -1
     (nil))
(insn 8 7 9 (parallel [
            (set (reg:QI 92)
                (and:QI (reg:QI 93)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8 -1

and similarly the store has to use a read-modify-write cycle.  Now, for this
case this is all moot since the xor doesn't change the bits of the padding.
But at the point we are expanding

  _1 = s_4(D)->b1;
  _2 = ~_1;
  s_4(D)->b1 = _2;

this isn't easily visible.  The general idea is to lower bitfield
loads/stores earlier on GIMPLE; see the various (partial) approaches posted
over the last 10 years.