I have some local patches in my tree which add a bswap tree code.
This breaks the aarch64 back-end's vectorization of byteswaps, as we use
the standard mechanism (optabs) to see if a tree code vectorizes.
Since it makes sense to keep the pattern names consistent between
the vector version and the scalar version, I am proposing this patch
to make them consistent.
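
For illustration only (this is not part of the patch), a loop of the
kind affected might look like the sketch below.  The function name
bswap_array is hypothetical, and whether the loop is actually
vectorized depends on the compiler version and options used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: byte-swap each 32-bit element of IN into OUT.
   __builtin_bswap32 is the GCC builtin for a scalar 32-bit byteswap;
   a vectorizer may turn this loop into vector byteswaps when the
   target provides a suitable pattern.  */
void
bswap_array (uint32_t *out, const uint32_t *in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = __builtin_bswap32 (in[i]);
}
```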

OK?  Built and tested on aarch64-elf with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
        * config/aarch64/aarch64-simd-builtins.def (bswap): Use CF2 rather
        than CF10 so that 2 is appended to the code.
        * config/aarch64/aarch64-simd.md (bswap<mode>): Rename to ...
        (bswap<mode>2): ... this, so that it matches the optab name.
Index: config/aarch64/aarch64-simd.md
===================================================================
--- config/aarch64/aarch64-simd.md      (revision 218026)
+++ config/aarch64/aarch64-simd.md      (working copy)
@@ -286,7 +286,7 @@ (define_insn "mul<mode>3"
   [(set_attr "type" "neon_mul_<Vetype><q>")]
 )
 
-(define_insn "bswap<mode>"
+(define_insn "bswap<mode>2"
   [(set (match_operand:VDQHSD 0 "register_operand" "=w")
         (bswap:VDQHSD (match_operand:VDQHSD 1 "register_operand" "w")))]
   "TARGET_SIMD"
@@ -308,7 +308,7 @@ (define_expand "ctz<mode>2"
         (ctz:VS (match_operand:VS 1 "register_operand")))]
   "TARGET_SIMD"
   {
-     emit_insn (gen_bswap<mode> (operands[0], operands[1]));
+     emit_insn (gen_bswap<mode>2 (operands[0], operands[1]));
      rtx op0_castsi2qi = simplify_gen_subreg(<VS:VSI2QI>mode, operands[0],
                                             <MODE>mode, 0);
      emit_insn (gen_aarch64_rbit<VS:vsi2qi> (op0_castsi2qi, op0_castsi2qi));
Index: config/aarch64/aarch64-simd-builtins.def
===================================================================
--- config/aarch64/aarch64-simd-builtins.def    (revision 218026)
+++ config/aarch64/aarch64-simd-builtins.def    (working copy)
@@ -317,7 +317,7 @@
   VAR1 (UNOP, floatunsv4si, 2, v4sf)
   VAR1 (UNOP, floatunsv2di, 2, v2df)
 
-  VAR5 (UNOPU, bswap, 10, v4hi, v8hi, v2si, v4si, v2di)
+  VAR5 (UNOPU, bswap, 2, v4hi, v8hi, v2si, v4si, v2di)
 
   BUILTIN_VB (UNOP, rbit, 0)
 
