On Mon, Sep 29, 2014 at 05:26:14PM -0500, Bill Schmidt wrote:
> The method used in this patch is to perform a byte-reversal of the
> result of the lvsl/lvsr.  This is accomplished by loading the vector
> char constant {0,1,...,15}, which will appear in the register from left
> to right as {15,...,1,0}.  A vperm instruction (which uses BE element
> ordering) is applied to the result of the lvsl/lvsr using the loaded
> constant as the permute control vector.
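
For illustration only, the quoted method can be modelled in plain C.  This
is a rough sketch, not GCC code and not part of the patch: registers are
modelled as 16-byte arrays indexed in BE element order (index 0 is the
leftmost byte), and the names model_lvsl, model_vperm, model_lvsl_reversed
and check_reversal are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of lvsl: the result bytes are sh, sh+1, ..., sh+15, where sh is
   the low four bits of the effective address.  */
static void model_lvsl(uint8_t out[16], unsigned addr)
{
    unsigned sh = addr & 0xF;
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(sh + i);
}

/* Model of vperm: each control byte (low five bits) selects one byte
   from the 32-byte concatenation a || b, using BE element numbering.  */
static void model_vperm(uint8_t out[16], const uint8_t a[16],
                        const uint8_t b[16], const uint8_t ctl[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++) {
        unsigned sel = ctl[i] & 0x1F;
        tmp[i] = sel < 16 ? a[sel] : b[sel - 16];
    }
    memcpy(out, tmp, 16);
}

/* The quoted sequence: lvsl, then vperm with the constant {0,1,...,15},
   which on little-endian sits in the register as {15,...,1,0}.  */
static void model_lvsl_reversed(uint8_t out[16], unsigned addr)
{
    uint8_t mask[16], ctl[16];
    model_lvsl(mask, addr);
    for (int i = 0; i < 16; i++)
        ctl[i] = (uint8_t)(15 - i);   /* register image of the constant */
    model_vperm(out, mask, mask, ctl);
}

/* Check that the sequence is a plain byte-reversal of the lvsl result.  */
static void check_reversal(unsigned addr)
{
    uint8_t mask[16], rev[16];
    model_lvsl(mask, addr);
    model_lvsl_reversed(rev, addr);
    for (int i = 0; i < 16; i++)
        assert(rev[i] == mask[15 - i]);
}
```

Calling check_reversal for each of the sixteen possible alignments
exercises every shift value.  The same model applies to lvsr if the
starting byte is taken as 16 - sh instead of sh.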

It would be nice if you could arrange the generated sequence such that
for the common case where the vec_lvsl feeds a vperm it results in
just lvsr;vnot machine instructions.  Not so easy to do though :-(

Some minor comments...

> -(define_insn "altivec_lvsl"
> +(define_expand "altivec_lvsl"
> +  [(use (match_operand:V16QI 0 "register_operand" ""))
> +   (use (match_operand:V16QI 1 "memory_operand" "Z"))]

A define_expand should not have constraints.

> +  "TARGET_ALTIVEC"
> +  "

No need for the quotes.

> +{
> +  if (VECTOR_ELT_ORDER_BIG)
> +    emit_insn (gen_altivec_lvsl_direct (operands[0], operands[1]));
> +  else
> +    {
> +      int i;
> +      rtx mask, perm[16], constv, vperm;
> +      mask = gen_reg_rtx (V16QImode);
> +      emit_insn (gen_altivec_lvsl_direct (mask, operands[1]));
> +      for (i = 0; i < 16; ++i)

i++ is the common style.


Segher