I've committed this patch to enable DImode one's-complement on amdgcn.

The hardware doesn't have 64-bit not, and this isn't needed by expand which is happy to use two SImode operations, but the vectorizer isn't so clever. Vector condition masks are DImode on amdgcn, so this has been causing lots of conditional code to fail to vectorize.

Andrew
amdgcn: 64-bit not

This makes the auto-vectorizer happier when handling masks.

gcc/ChangeLog:

        * config/gcn/gcn.md (one_cmpldi2): New.

diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 033c1708e88..70a769babc4 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -1676,6 +1676,26 @@ (define_expand "<expander>si3_scc"
 ;; }}}
 ;; {{{ ALU: generic 64-bit
 
+(define_insn_and_split "one_cmpldi2"
+  [(set (match_operand:DI 0 "register_operand"        "=Sg,    v")
+       (not:DI (match_operand:DI 1 "gcn_alu_operand" "SgA,vSvDB")))
+   (clobber (match_scratch:BI 2                              "=cs,    X"))]
+  ""
+  "#"
+  "reload_completed"
+  [(parallel [(set (match_dup 3) (not:SI (match_dup 4)))
+             (clobber (match_dup 2))])
+   (parallel [(set (match_dup 5) (not:SI (match_dup 6)))
+             (clobber (match_dup 2))])]
+  {
+    operands[3] = gcn_operand_part (DImode, operands[0], 0);
+    operands[4] = gcn_operand_part (DImode, operands[1], 0);
+    operands[5] = gcn_operand_part (DImode, operands[0], 1);
+    operands[6] = gcn_operand_part (DImode, operands[1], 1);
+  }
+  [(set_attr "type" "mult")]
+)
+
 (define_code_iterator vec_and_scalar64_com [and ior xor])
 
 (define_insn_and_split "<expander>di3"

Reply via email to