https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42172
--- Comment #10 from amker at gcc dot gnu.org ---
The optimal code is generated for pre-armv7 processors. The difference starts at expand: for armv7 processors, a zero_extract operator is generated rather than logical operations. It seems the combiner can't handle bit-field accesses in zero_extract form as well as it handles the logical forms. I uploaded the ud_dce and combine dumps for both cortex-m0 and cortex-m3. Apparently, the code for m0 is what we want.
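For readers unfamiliar with the two forms, here is a minimal C sketch of what "zero_extract" versus "logic operation" means for an unsigned bit-field read. It is only an illustration, not the test case from this report and not actual RTL; the function names are hypothetical. Both functions return the same value, which is why one would hope combine could treat the two forms equally well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "logic" form of reading an unsigned
   bit-field of WIDTH bits starting at bit LSB (shift, then mask). */
static uint32_t extract_logic(uint32_t word, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    /* Avoid the undefined shift by 32 when the field is the whole word. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (word >> lsb) & mask;
}

/* Hypothetical helper: models what a zero_extract of WIDTH bits at
   position LSB computes, written as shift-left then logical
   shift-right so that no mask constant is involved. */
static uint32_t extract_zero_extract(uint32_t word, unsigned lsb,
                                     unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    /* Move the field to the top of the word, then bring it down to
       bit 0; the logical right shift fills the upper bits with 0. */
    return (uint32_t)(word << (32 - lsb - width)) >> (32 - width);
}
```

For instance, with word 0xABCD1234, lsb 4 and width 8, both helpers yield 0x23.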