https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #68 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to rsand...@gcc.gnu.org from comment #62)
> For the two bfi ones: are we really sure that the old code is better?
> It's a difference between a MOV and a BFI or an AND and an ORR.
> The BFI wins (at least for code-size) if we need the same MOV
> for something else. But the AND/ORR sequence wins in high register
> pressure, since it only needs one register rather than two.

On some processors (ThunderX2 and OcteonTX2, and maybe others; I have not
looked into all of the micro-architectures there are), the mov/bfi case is
most likely better: the mov is removed during the renaming phase and never
actually issued, so the sequence effectively becomes one instruction with a
latency of 1 rather than two instructions with a latency of 2.
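
For readers without the testcases at hand, here is a minimal C sketch of the two shapes being compared. The field position (4 bits at bit 3), the inserted constant 7, and the function names are all hypothetical and not taken from the failing tests. The assembly in the comments only illustrates the kind of sequence each form could map to; it is not actual compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout: 4 bits wide, starting at bit 3. */
#define FIELD_LSB   3u
#define FIELD_WIDTH 4u
#define FIELD_MASK  (((1u << FIELD_WIDTH) - 1u) << FIELD_LSB)  /* 0x78 */

/* Models the MOV + BFI form: the low FIELD_WIDTH bits of src are
 * inserted into dst at FIELD_LSB, and all other bits of dst are kept.
 * Illustrative shape only (needs a second register for src):
 *     mov  w1, 7
 *     bfi  w0, w1, 3, 4
 */
static uint32_t insert_bfi_style(uint32_t dst, uint32_t src)
{
    uint32_t field = (src << FIELD_LSB) & FIELD_MASK;
    return (dst & ~FIELD_MASK) | field;
}

/* Models the AND + ORR form for the constant 7: clear the field, then
 * OR in the pre-shifted constant. Illustrative shape only (one register):
 *     and  w0, w0, -121     // ~0x78
 *     orr  w0, w0, 56       // 7 << 3
 */
static uint32_t insert_and_orr_style(uint32_t dst)
{
    return (dst & ~FIELD_MASK) | (7u << FIELD_LSB);
}
```

Both functions compute the same value whenever the BFI-style source is 7, e.g. `insert_bfi_style(x, 7u) == insert_and_orr_style(x)` for any `x`. The trade-off discussed in this thread is therefore purely about register use and how the core executes each pair, not about the result.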