I've committed this patch to enable DImode one's-complement on amdgcn.
The hardware doesn't have a 64-bit not instruction. Expand doesn't need
one, because it is happy to use two SImode operations, but the vectorizer
isn't so clever. Vector condition masks are DImode on amdgcn, so the
missing pattern has been causing lots of conditional code to fail to
vectorize.
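
For anyone unfamiliar with the split, here is a rough C sketch of the idea
(not part of the patch): a 64-bit not is just two independent 32-bit nots,
one per half. The function names are made up for illustration, and the
sketch assumes part 0 is the low word and part 1 the high word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit NOT built from two independent 32-bit NOTs,
   mirroring the two SImode operations that the split produces.  */
static uint64_t
not64_via_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;          /* assumed part 0: low 32 bits */
  uint32_t hi = (uint32_t) (x >> 32);  /* assumed part 1: high 32 bits */

  /* Each half is inverted on its own; no carry crosses between them.  */
  lo = ~lo;
  hi = ~hi;

  return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical self-check: the result matches a native 64-bit NOT.  */
static void
not64_self_check (void)
{
  assert (not64_via_halves (0) == UINT64_MAX);
  assert (not64_via_halves (0x00000000ffffffffULL) == 0xffffffff00000000ULL);
  assert (not64_via_halves (0x0123456789abcdefULL) == ~0x0123456789abcdefULL);
}
```

Because the halves don't interact, the order of the two operations doesn't
matter, which is why a plain split after reload is enough.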
Andrew
amdgcn: 64-bit not
This makes the auto-vectorizer happier when handling masks.
gcc/ChangeLog:
* config/gcn/gcn.md (one_cmpldi2): New.
diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 033c1708e88..70a769babc4 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -1676,6 +1676,26 @@ (define_expand "<expander>si3_scc"
;; }}}
;; {{{ ALU: generic 64-bit
+(define_insn_and_split "one_cmpldi2"
+ [(set (match_operand:DI 0 "register_operand" "=Sg, v")
+ (not:DI (match_operand:DI 1 "gcn_alu_operand" "SgA,vSvDB")))
+ (clobber (match_scratch:BI 2 "=cs, X"))]
+ ""
+ "#"
+ "reload_completed"
+ [(parallel [(set (match_dup 3) (not:SI (match_dup 4)))
+ (clobber (match_dup 2))])
+ (parallel [(set (match_dup 5) (not:SI (match_dup 6)))
+ (clobber (match_dup 2))])]
+ {
+ operands[3] = gcn_operand_part (DImode, operands[0], 0);
+ operands[4] = gcn_operand_part (DImode, operands[1], 0);
+ operands[5] = gcn_operand_part (DImode, operands[0], 1);
+ operands[6] = gcn_operand_part (DImode, operands[1], 1);
+ }
+ [(set_attr "type" "mult")]
+)
+
(define_code_iterator vec_and_scalar64_com [and ior xor])
(define_insn_and_split "<expander>di3"