https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71977
--- Comment #1 from Michael Meissner <meissner at gcc dot gnu.org> ---
Unfortunately, the code gets even worse if you use -mcpu=power9:

.L.mask_float:
        stfs 1,-16(1)
        lwz 9,-16(1)
        and 4,4,9
        stw 4,-16(1)
        lfs 1,-16(1)
        blr

I.e., instead of doing a direct move to the GPRs and doing the AND there, it now
stores the value on the stack and reloads it, once in each direction.

More generally, note that you have to make sure the float value is converted to
vector form before you do AND/OR/etc. on it. This is because, within the
register, 32-bit floats are actually stored as 64-bit double-precision values,
so the bits in the register are not the 32-bit single-precision image.
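For readers without the test case at hand, here is a minimal C sketch of the kind of function being discussed. Only the name mask_float comes from the assembly label above; the signature and body are assumptions made for illustration, not the test case attached to the bug. The two helpers float_bits and double_bits are likewise hypothetical, and exist only to show why the register contents cannot be ANDed directly: the same value has a different bit image once it is widened to double precision.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shape of the function: AND the 32-bit image of a float
   with an integer mask and return the result as a float. */
float mask_float(float f, uint32_t mask)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* float -> 32-bit integer image */
    bits &= mask;                     /* the AND done in the GPRs */
    memcpy(&f, &bits, sizeof f);      /* integer image -> float */
    return f;
}

/* Bit image of a value in 32-bit single-precision format. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Bit image of the same value after widening to double precision,
   which is how a scalar float is held in a floating-point register. */
uint64_t double_bits(float f)
{
    double d = f;
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

With these definitions, float_bits(1.0f) is 0x3f800000 while double_bits(1.0f) is 0x3ff0000000000000, so a mask written for the single-precision layout would hit the wrong bits if applied to the raw register contents. A call such as mask_float(-1.5f, 0x7fffffff) clears the sign bit and yields 1.5f.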