https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770
--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Jens Seifert from comment #4)
> PPCLE with no special option means -mcpu=power8 -maltivec (altivecle to be
> more precise).

What?  No.

$ sh config.sub ppcle
powerpcle-unknown-none

This is typically the old 32-bit PowerPC ELF format.  powerpcle-elf (which
non-canonically can be called ppcle-elf) for example, or ppcle-linux, but not
ppcle-aix and the like (that one doesn't even exist; at least one COFF format
has existed in the past though).

This may not matter for you, but it is awfully confusing for others.

powerpc64le-linux (and I believe all other existing ELFv2 ports) require a p8
or later CPU, sure; but it is perfectly valid to have no AltiVec even then, or
for a port to default to some other CPU.  Currently we have no such thing, and
all default defaults are like you say, but that might change.

> vec_promote(<double value>, 1) should be a noop on ppcle.

It never is, not on powerpc64le either.  It always duplicates the selected
elt to all lanes.

> But value gets
> splatted to both left and right part of vector register. => 2 unnecessary
> xxpermdi

So why are those not optimised away?  *That* is the question!

> The rest of the operations are done on left and right part.
>
> vec_extract(<vector double>, 1) should be noop on ppcle. But value gets
> taken from right part of register which requires a xxpermdi
>
> Overall 3 unnecessary xxpermdi. Don't know why the right part of register
> gets "preferred".

I don't know what you mean there?  The ABIs say where parameters and return
values are stored, but you mean something else?
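
For anyone following along, a minimal sketch of the kind of code being discussed (two vec_promote calls, one vector operation, one vec_extract) might look like the following.  This is not the testcase from the report: the function name add_lane1 is made up for illustration, and the #else branch is only a portable stand-in that models vec_promote as a splat (as described above) so the sketch also builds on non-VSX targets.

```c
/* Hypothetical reduced example; add_lane1 is an illustrative name,
   not taken from the bug report. */
#if defined(__VSX__)
#include <altivec.h>

double add_lane1(double a, double b)
{
    /* Place each scalar in element 1 of a vector double. */
    __vector double va = vec_promote(a, 1);
    __vector double vb = vec_promote(b, 1);

    /* Operate on the whole vectors, then read element 1 back. */
    return vec_extract(vec_add(va, vb), 1);
}
#else
/* Portable stand-in (an assumption, not the PowerPC code path): uses
   GCC/Clang generic vectors and models the promote step as a splat
   of the scalar to both lanes. */
typedef double v2df __attribute__((vector_size(16)));

static v2df promote_splat(double x)
{
    v2df v = { x, x };
    return v;
}

double add_lane1(double a, double b)
{
    v2df va = promote_splat(a);
    v2df vb = promote_splat(b);
    v2df sum = va + vb;

    /* Read back the same element index that was "promoted". */
    return sum[1];
}
#endif
```

On a powerpc64le target, compiling something like this with e.g. "gcc -O2 -mcpu=power8 -S" and looking for xxpermdi in the output would show whether the extra permutes are still there; the exact code generated will depend on the compiler version.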