https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77438
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Profitability in this case also depends on the ABI, as well as on the ability to do the required extensions/extractions (and on resource availability). For bit operations that map directly onto the integer representation it will likely be a loss unless you take the ABI into account. As said in another PR, vector lowering should be rewritten into something like tree-complex.c, with a lattice that allows optimizing the insertion of extensions/extracts (and that could also track the cost of parameter / return value transforms).
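To illustrate what "bit operations that map directly onto the integer representation" can mean, here is a minimal sketch using the GNU C vector extension. The type name v4u8 and all helper names are made up for this example and do not come from the PR or from GCC itself. It only shows that a lane-wise AND on a 32-bit vector yields the same bits as a plain 32-bit integer AND; it says nothing about which form is cheaper on a given target, since that depends on the ABI and on the cost of moving values between register files.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: four 8-bit lanes, 32 bits in total. */
typedef uint8_t v4u8 __attribute__((vector_size(4)));

/* Lane-wise AND written with the vector extension. */
static v4u8 and_lanes(v4u8 a, v4u8 b)
{
    return a & b;
}

/* The same operation on the 32-bit integer representation. */
static uint32_t and_word(uint32_t a, uint32_t b)
{
    return a & b;
}

/* Reinterpret the vector as an integer without breaking aliasing rules. */
static uint32_t to_word(v4u8 v)
{
    uint32_t w;
    memcpy(&w, &v, sizeof w);
    return w;
}

/* Reinterpret the integer as a vector. */
static v4u8 to_lanes(uint32_t w)
{
    v4u8 v;
    memcpy(&v, &w, sizeof v);
    return v;
}

/* Returns 1 when both formulations produce identical bits. */
static int and_views_agree(uint32_t a, uint32_t b)
{
    return to_word(and_lanes(to_lanes(a), to_lanes(b))) == and_word(a, b);
}
```

Because the two forms agree bit for bit, the choice between them comes down to cost: if the ABI passes and returns v4u8 in a different register class than uint32_t, the extra moves or extensions/extractions at function boundaries can outweigh any gain from rewriting the operation.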