cfe-commits@lists.llvm.org

2020-06-23 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81740/new/

https://reviews.llvm.org/D81740



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D118044: [ARM] Undeprecate complex IT blocks

2022-02-03 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked 3 inline comments as done.
MarkMurrayARM added inline comments.



Comment at: llvm/test/CodeGen/ARM/ifcvt-branch-weight.ll:22
+; CHECK: bb.1.bb2:
+; CHECK: successors: %bb.2(0x4000)
 

dmgreen wrote:
> I'm not sure this is still checking anything useful. How has the full output 
> changed? Is there not a block with two outputs anymore?
It is a failure caused by my change. 



Comment at: llvm/test/CodeGen/Thumb2/v8_IT_3.ll:4
 ; RUN: llc < %s -mtriple=thumbv8 -arm-atomic-cfg-tidy=0 -relocation-model=pic 
| FileCheck %s --check-prefix=CHECK-PIC
-; RUN: llc < %s -mtriple=thumbv7 -arm-atomic-cfg-tidy=0 -arm-restrict-it 
-relocation-model=pic | FileCheck %s --check-prefix=CHECK-PIC
+; RUN: llc < %s -mtriple=thumbv7 -arm-atomic-cfg-tidy=0 -arm-restrict-it 
-relocation-model=pic | FileCheck %s --check-prefix=CHECK-PIC-RESTRICT-IT 
--check-prefix=CHECK-RESTRICT-IT
 

dmgreen wrote:
> I'm not sure what this test is doing. Can you just remove -arm-restrict-it 
> and update the check lines?
It's checking restricted ID blocks in position-independent code. I don't see 
what your request will achieve?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118044/new/

https://reviews.llvm.org/D118044

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D118044: [ARM] Undeprecate complex IT blocks

2022-02-07 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked 2 inline comments as done.
MarkMurrayARM added inline comments.



Comment at: llvm/test/CodeGen/ARM/ifcvt-branch-weight.ll:21
 
-; CHECK: bb.2.bb2:
+; CHECK: bb.1.bb:
 ; CHECK: successors: %bb.4(0x4000), %bb.3(0x4000)

dmgreen wrote:
> I think this should still be bb.2.bb2
If I do that the test fails. It's definitely like this in the generated code.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118044/new/

https://reviews.llvm.org/D118044

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72690: [ARM,MVE] Use the new Tablegen `defvar` and `if` statements.

2020-01-14 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added inline comments.
This revision is now accepted and ready to land.



Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:2464
+  defvar pred_int   = int_arm_mve_vshll_imm_predicated;
+  defvar imm= inst_imm.immediateType;
+

MUCH better!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72690/new/

https://reviews.llvm.org/D72690



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72761: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.

2020-01-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, miyuki, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics and unit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D72761

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_f16(<8 x half> %a, <8 x half> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x half> @llvm.arm.mve.vminnma.v8f16(<8 x half> %a, <8 x half> %b)
+  ret <8 x half> %0
+}
+
+declare <8 x half> @llvm.arm.mve.vminnma.v8f16(<8 x half>, <8 x half>) #1
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vminnma.v4f32(<4 x float> %a, <4 x float> %b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vminnma.v4f32(<4 x float>, <4 x float>) #1
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_m_f16(<8 x half> %a, <8 x half> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>) #1
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_m_f32(<4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float> %a, <4 x float> %b, <4 x i1> %1)
+  ret <4 x float> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float>, <4 x float>, <4 x i1>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vminaq_s8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s8 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vmina.v16i8.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vmina.v16i8.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vminaq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmina.v8i16.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmina.v8i16.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vminaq_s32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s32:
+; CHECK:   @ %bb.0: @ 

[PATCH] D72761: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.

2020-01-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 238269.
MarkMurrayARM added a comment.

Much better impementation after review feedback.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72761/new/

https://reviews.llvm.org/D72761

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -0,0 +1,68 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_f16(<8 x half> %a, <8 x half> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %1 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %a, <8 x half> %0)
+  ret <8 x half> %1
+}
+
+declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
+
+declare <8 x half> @llvm.minnum.v8f16(<8 x half>, <8 x half>) #1
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %1 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %0)
+  ret <4 x float> %1
+}
+
+declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
+
+declare <4 x float> @llvm.minnum.v4f32(<4 x float>, <4 x float>) #1
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_m_f16(<8 x half> %a, <8 x half> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>) #2
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_m_f32(<4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float> %a, <4 x float> %b, <4 x i1> %1)
+  ret <4 x float> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float>, <4 x float>, <4 x i1>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
@@ -0,0 +1,98 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vminaq_s8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s8 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp slt <16 x i8> %b, zeroinitializer
+  %1 = sub <16 x i8> zeroinitializer, %b
+  %2 = select <16 x i1> %0, <16 x i8> %1, <16 x i8> %b
+  %3 = icmp ult <16 x i8> %2, %a
+  %4 = select <16 x i1> %3, <16 x i8> %2, <16 x i8> %a
+  ret <16 x i8> %4
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vminaq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp slt <8 x i16> %b, zeroinitializer
+  %1 = sub <8 x i16> zeroinitializer, %b
+  %2 = select <8 x i1> %0, <8 x i16> %1, <8 x i16> %b
+  %3 = icmp ult <8 x i16

[PATCH] D72761: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.

2020-01-15 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGda9d57d2c2dc: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, 
VMAXAQ, VMAXNMAQ intrinsics. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72761/new/

https://reviews.llvm.org/D72761

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -0,0 +1,68 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_f16(<8 x half> %a, <8 x half> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %1 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %a, <8 x half> %0)
+  ret <8 x half> %1
+}
+
+declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
+
+declare <8 x half> @llvm.minnum.v8f16(<8 x half>, <8 x half>) #1
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vminnma.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %1 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %0)
+  ret <4 x float> %1
+}
+
+declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
+
+declare <4 x float> @llvm.minnum.v4f32(<4 x float>, <4 x float>) #1
+
+define arm_aapcs_vfpcc <8 x half> @test_vminnmaq_m_f16(<8 x half> %a, <8 x half> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x half> @llvm.arm.mve.vminnma.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>) #2
+
+define arm_aapcs_vfpcc <4 x float> @test_vminnmaq_m_f32(<4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminnmaq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vminnmat.f32 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float> %a, <4 x float> %b, <4 x i1> %1)
+  ret <4 x float> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x float> @llvm.arm.mve.vminnma.predicated.v4f32.v4i1(<4 x float>, <4 x float>, <4 x i1>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminaq.ll
@@ -0,0 +1,98 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vminaq_s8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s8 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp slt <16 x i8> %b, zeroinitializer
+  %1 = sub <16 x i8> zeroinitializer, %b
+  %2 = select <16 x i1> %0, <16 x i8> %1, <16 x i8> %b
+  %3 = icmp ult <16 x i8> %2, %a
+  %4 = select <16 x i1> %3, <16 x i8> %2, <16 x i8> %a
+  ret <16 x i8> %4
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vminaq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminaq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmina.s16 q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp slt <8 x i16> %b, zeroinitializer
+  %1 = sub <8 x i16> zeroinitializer, %b
+  

[PATCH] D72761: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.

2020-01-16 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked an inline comment as done.
MarkMurrayARM added inline comments.



Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:3658
+// Unpredicated v(max|min)nma
+def : Pat<(VTI.Vec (unpred_op (VTI.Vec MQPR:$Qd), (fabs (VTI.Vec 
MQPR:$Qm,
+  (VTI.Vec (Inst (VTI.Vec MQPR:$Qd), (VTI.Vec MQPR:$Qm)))>;

dmgreen wrote:
> If I'm reading the ARMARM correctly, the fp case seems to preform the abs on 
> both operands.
My bad. Fix coming under separate cover.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72761/new/

https://reviews.llvm.org/D72761



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72830: [ARM][MVE][Intrinsics] Take abs() of VMINNMAQ, VMAXNMAQ intrinsics' first arguments.

2020-01-16 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: dmgreen, simon_tatham.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Fix VMINNMAQ, VMAXNMAQ intrinsics; BOTH arguments have the absolute values 
taken.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D72830

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vminnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vminnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vmaxnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vmaxnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -3655,7 +3655,8 @@
 
   let Predicates = [HasMVEInt] in {
 // Unpredicated v(max|min)nma
-def : Pat<(VTI.Vec (unpred_op (VTI.Vec MQPR:$Qd), (fabs (VTI.Vec MQPR:$Qm,
+def : Pat<(VTI.Vec (unpred_op (fabs (VTI.Vec MQPR:$Qd)),
+  (fabs (VTI.Vec MQPR:$Qm,
   (VTI.Vec (Inst (VTI.Vec MQPR:$Qd), (VTI.Vec MQPR:$Qm)))>;
 
 // Predicated v(max|min)nma
Index: clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
===
--- clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
+++ clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
@@ -6,9 +6,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f16(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
-// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[A:%.*]], <8 x half> [[TMP0]])
-// CHECK-NEXT:ret <8 x half> [[TMP1]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[A:%.*]])
+// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
+// CHECK-NEXT:[[TMP2:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[TMP0]], <8 x half> [[TMP1]])
+// CHECK-NEXT:ret <8 x half> [[TMP2]]
 //
 float16x8_t test_vminnmaq_f16(float16x8_t a, float16x8_t b)
 {
@@ -21,9 +22,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f32(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <4 x float> @llvm.fabs.v

[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-10 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
MarkMurrayARM requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

This patch upstreams support for the Arm-v8 Cortex-A78C
processor for AArch64 and ARM.

In detail:

- Adding cortex-a78c as cpu option for aarch64 and arm targets in clang
- Adding Cortex-A78C CPU name and ProcessorModel in llvm

Details of the CPU can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78c


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D93022

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -10,6 +10,7 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/ARMBuildAttributes.h"
 #include "gtest/gtest.h"
+#include 
 #include 
 
 using namespace llvm;
@@ -40,10 +41,19 @@
   pass &= ARM::getFPUName(FPUKind).equals(ExpectedFPU);
 
   uint64_t ExtKind = ARM::getDefaultExtensions(CPUName, AK);
-  if (ExtKind > 1 && (ExtKind & ARM::AEK_NONE))
+  if (ExtKind > 1 && (ExtKind & ARM::AEK_NONE)) {
 pass &= ((ExtKind ^ ARM::AEK_NONE) == ExpectedFlags);
-  else
+if (!pass)
+  std::cout << "ExpectedFlags = 0x" << std::hex << ExpectedFlags
+<< " do not equal ExtKind = 0x" << std::hex << ExtKind
+<< "in the ARM::AEK_NONE case. " << std::endl;
+  }
+  else {
 pass &= (ExtKind == ExpectedFlags);
+if (!pass)
+  std::cout << "ExpectedFlags = 0x" << std::hex << ExpectedFlags
+<< " do not equal ExtKind = 0x" << std::hex << ExtKind << std::endl;
+  }
   pass &= ARM::getCPUAttr(AK).equals(CPUAttr);
 
   return pass;
@@ -268,6 +278,13 @@
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
  ARM::AEK_DSP | ARM::AEK_CRC | ARM::AEK_RAS,
  "8.2-A"));
+  EXPECT_TRUE(testARMCPU("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ ARM::AEK_RAS | ARM::AEK_SEC | ARM::AEK_MP |
+ ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
+ ARM::AEK_CRC | ARM::AEK_RAS,
+ "8.2-A"));
+
   EXPECT_TRUE(testARMCPU("cortex-x1", "armv8.2-a", "crypto-neon-fp-armv8",
  ARM::AEK_RAS | ARM::AEK_FP16 | ARM::AEK_DOTPROD |
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
@@ -334,7 +351,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 91;
+static constexpr unsigned NumARMCPUArchs = 92;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -800,11 +817,20 @@
   bool pass = AArch64::getArchName(AK).equals(ExpectedArch);
 
   uint64_t ExtKind = AArch64::getDefaultExtensions(CPUName, AK);
-  if (ExtKind > 1 && (ExtKind & AArch64::AEK_NONE))
+  if (ExtKind > 1 && (ExtKind & AArch64::AEK_NONE)) {
 pass &= ((ExtKind ^ AArch64::AEK_NONE) == ExpectedFlags);
-  else
+if (!pass)
+  std::cout << "ExpectedFlags = 0x" << std::hex << ExpectedFlags
+<< " do not equal ExtKind = 0x" << std::hex << ExtKind
+<< "in the ARM::AEK_NONE case. " << std::endl;
+  }
+  else {
 pass &= (ExtKind == ExpectedFlags);
-
+if (!pass) {
+  std::cout << "ExpectedFlags = 0x" << std::hex << ExpectedFlags
+<< " do not equal ExtKind = 0x" << std::hex << ExtKind << std::endl;
+}
+  }
   pass &= AArch64::getCPUAttr(AK).equals(CPUAttr);
 
   return pass;
@@ -893,6 +919,12 @@
   AArch64::AEK_LSE | AArch64::AEK_FP16 | AArch64::AEK_DOTPROD |
   AArch64::AEK_RCPC | AArch64::AEK_SSBS,
   "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+  "cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_RAS | AArch64::AEK_CRC | AArch64::AEK_CRYPTO |
+  AArch64::AEK_FP | AArch64::AEK_SIMD | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_RDM,
+  "8.2-A"));
   EXPECT_TRUE(testAArch64CPU(
   "neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
   AArch64::AEK_RAS | AArch64::AEK_SVE | AArch64::AEK_SSBS |
@@ -1063,7 +1095,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 45;
+static constexpr unsigned NumAArch64CPUArchs = 46;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/ARM/ARMSubtarget.h
=

[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-10 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64.td:673-686
+def ProcA78C : SubtargetFeature<"cortex-a78c", "ARMProcFamily",
+"CortexA78C",
+"Cortex-A78C ARM processors", [
+HasV8_2aOps,
+FeatureCrypto,
+FeatureFPARMv8,
+FeatureFuseAES,

ktkachov wrote:
> According to the TRM at https://developer.arm.com/documentation/102226/0001 
> Cortex-A78C also supports Pointer Authetication and the Flag Manipulation 
> instructions as well (CFINV, RMIF etc). I think this feature set doesn't 
> reflect that?
ktkachov: FeaturePA?




Comment at: llvm/unittests/Support/TargetParserTest.cpp:833
+}
+  }
   pass &= AArch64::getCPUAttr(AK).equals(CPUAttr);

DavidSpickett wrote:
> I assume this was left in from debugging, if not it should be its own change.
> (considering this file is already using gtest, you could refactor to 
> ASSERT_EQ the flags and get the messages for free)
I like the idea of ASSERT_EQ, but I need the numbers to be in hex.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 311856.
MarkMurrayARM added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -268,6 +268,13 @@
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
  ARM::AEK_DSP | ARM::AEK_CRC | ARM::AEK_RAS,
  "8.2-A"));
+  EXPECT_TRUE(testARMCPU("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ ARM::AEK_RAS | ARM::AEK_SEC | ARM::AEK_MP |
+ ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
+ ARM::AEK_CRC | ARM::AEK_RAS,
+ "8.2-A"));
+
   EXPECT_TRUE(testARMCPU("cortex-x1", "armv8.2-a", "crypto-neon-fp-armv8",
  ARM::AEK_RAS | ARM::AEK_FP16 | ARM::AEK_DOTPROD |
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
@@ -334,7 +341,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 91;
+static constexpr unsigned NumARMCPUArchs = 92;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -893,6 +900,12 @@
   AArch64::AEK_LSE | AArch64::AEK_FP16 | AArch64::AEK_DOTPROD |
   AArch64::AEK_RCPC | AArch64::AEK_SSBS,
   "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+  "cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_RAS | AArch64::AEK_CRC | AArch64::AEK_CRYPTO |
+  AArch64::AEK_FP | AArch64::AEK_SIMD | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_RDM,
+  "8.2-A"));
   EXPECT_TRUE(testAArch64CPU(
   "neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
   AArch64::AEK_RAS | AArch64::AEK_SVE | AArch64::AEK_SSBS |
@@ -1062,7 +1075,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 45;
+static constexpr unsigned NumAArch64CPUArchs = 46;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/ARM/ARMSubtarget.h
===
--- llvm/lib/Target/ARM/ARMSubtarget.h
+++ llvm/lib/Target/ARM/ARMSubtarget.h
@@ -63,6 +63,7 @@
 CortexA76,
 CortexA77,
 CortexA78,
+CortexA78C,
 CortexA8,
 CortexA9,
 CortexM3,
Index: llvm/lib/Target/ARM/ARMSubtarget.cpp
===
--- llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -294,6 +294,7 @@
   case CortexA76:
   case CortexA77:
   case CortexA78:
+  case CortexA78C:
   case CortexR4:
   case CortexR4F:
   case CortexR5:
Index: llvm/lib/Target/ARM/ARM.td
===
--- llvm/lib/Target/ARM/ARM.td
+++ llvm/lib/Target/ARM/ARM.td
@@ -598,6 +598,8 @@
"Cortex-A77 ARM processors", []>;
 def ProcA78 : SubtargetFeature<"cortex-a78", "ARMProcFamily", "CortexA78",
"Cortex-A78 ARM processors", []>;
+def ProcA78C: SubtargetFeature<"a78c", "ARMProcFamily", "CortexA78C",
+   "Cortex-A78C ARM processors", []>;
 def ProcX1  : SubtargetFeature<"cortex-x1", "ARMProcFamily", "CortexX1",
"Cortex-X1 ARM processors", []>;
 
@@ -1275,6 +1277,13 @@
  FeatureFullFP16,
  FeatureDotProd]>;
 
+def : ProcNoItin<"cortex-a78c", [ARMv82a,
+ FeatureHWDivThumb,
+ FeatureHWDivARM,
+ FeatureCrypto,
+ FeatureCRC,
+ FeatureFullFP16]>;
+
 def : ProcNoItin<"cortex-x1",   [ARMv82a, ProcX1,
  FeatureHWDivThumb,
  FeatureHWDivARM,
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h

[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked 4 inline comments as done.
MarkMurrayARM added a comment.

Marked addressed comments as "done". Some debug code has been abandoned.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-24 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 313678.
MarkMurrayARM added a comment.

Incorporate reviewer comments. Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -308,6 +308,12 @@
  ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
  ARM::AEK_FP16 | ARM::AEK_RAS | ARM::AEK_DOTPROD,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ ARM::AEK_RAS | ARM::AEK_SEC | ARM::AEK_MP |
+ ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
+ ARM::AEK_CRC | ARM::AEK_RAS,
+ "8.2-A"),
 ARMCPUTestParams("cortex-a77", "armv8.2-a", "crypto-neon-fp-armv8",
  ARM::AEK_CRC | ARM::AEK_SEC | ARM::AEK_MP |
  ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
@@ -385,7 +391,7 @@
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
  "7-S")), );
 
-static constexpr unsigned NumARMCPUArchs = 91;
+static constexpr unsigned NumARMCPUArchs = 92;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -962,6 +968,12 @@
  AArch64::AEK_DOTPROD | AArch64::AEK_RCPC |
  AArch64::AEK_SSBS,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ AArch64::AEK_RAS | AArch64::AEK_CRC |
+ AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+ AArch64::AEK_SIMD | AArch64::AEK_RAS |
+ AArch64::AEK_LSE | AArch64::AEK_RDM,
+ "8.2-A"),
 ARMCPUTestParams(
 "neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
 AArch64::AEK_RAS | AArch64::AEK_SVE | AArch64::AEK_SSBS |
@@ -1151,7 +1163,7 @@
  AArch64::AEK_LSE | AArch64::AEK_RDM,
  "8.2-A")), );
 
-static constexpr unsigned NumAArch64CPUArchs = 45;
+static constexpr unsigned NumAArch64CPUArchs = 46;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/ARM/ARMSubtarget.h
===
--- llvm/lib/Target/ARM/ARMSubtarget.h
+++ llvm/lib/Target/ARM/ARMSubtarget.h
@@ -63,6 +63,7 @@
 CortexA76,
 CortexA77,
 CortexA78,
+CortexA78C,
 CortexA8,
 CortexA9,
 CortexM3,
Index: llvm/lib/Target/ARM/ARMSubtarget.cpp
===
--- llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -294,6 +294,7 @@
   case CortexA76:
   case CortexA77:
   case CortexA78:
+  case CortexA78C:
   case CortexR4:
   case CortexR4F:
   case CortexR5:
Index: llvm/lib/Target/ARM/ARM.td
===
--- llvm/lib/Target/ARM/ARM.td
+++ llvm/lib/Target/ARM/ARM.td
@@ -616,6 +616,8 @@
"Cortex-A77 ARM processors", []>;
 def ProcA78 : SubtargetFeature<"cortex-a78", "ARMProcFamily", "CortexA78",
"Cortex-A78 ARM processors", []>;
+def ProcA78C: SubtargetFeature<"a78c", "ARMProcFamily", "CortexA78C",
+   "Cortex-A78C ARM processors", []>;
 def ProcX1  : SubtargetFeature<"cortex-x1", "ARMProcFamily", "CortexX1",
"Cortex-X1 ARM processors", []>;
 
@@ -1308,6 +1310,13 @@
  FeatureFullFP16,
  FeatureDotProd]>;
 
+def : ProcNoItin<"cortex-a78c", [ARMv82a,
+ FeatureHWDivThumb,
+ FeatureHWDivARM,
+ FeatureCrypto,
+ FeatureCRC,
+ FeatureF

[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-24 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 313694.
MarkMurrayARM added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -308,6 +308,13 @@
  ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
  ARM::AEK_FP16 | ARM::AEK_RAS | ARM::AEK_DOTPROD,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ ARM::AEK_SEC | ARM::AEK_MP |
+ ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
+ ARM::AEK_CRC | ARM::AEK_RAS |
+ ARM::AEK_FP16 | ARM::AEK_DOTPROD,
+ "8.2-A"),
 ARMCPUTestParams("cortex-a77", "armv8.2-a", "crypto-neon-fp-armv8",
  ARM::AEK_CRC | ARM::AEK_SEC | ARM::AEK_MP |
  ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
@@ -385,7 +392,7 @@
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
  "7-S")), );
 
-static constexpr unsigned NumARMCPUArchs = 91;
+static constexpr unsigned NumARMCPUArchs = 92;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -962,6 +969,12 @@
  AArch64::AEK_DOTPROD | AArch64::AEK_RCPC |
  AArch64::AEK_SSBS,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ AArch64::AEK_RAS | AArch64::AEK_CRC |
+ AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+ AArch64::AEK_SIMD | AArch64::AEK_RAS |
+ AArch64::AEK_LSE | AArch64::AEK_RDM,
+ "8.2-A"),
 ARMCPUTestParams(
 "neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
 AArch64::AEK_RAS | AArch64::AEK_SVE | AArch64::AEK_SSBS |
@@ -1151,7 +1164,7 @@
  AArch64::AEK_LSE | AArch64::AEK_RDM,
  "8.2-A")), );
 
-static constexpr unsigned NumAArch64CPUArchs = 45;
+static constexpr unsigned NumAArch64CPUArchs = 46;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/ARM/ARMSubtarget.h
===
--- llvm/lib/Target/ARM/ARMSubtarget.h
+++ llvm/lib/Target/ARM/ARMSubtarget.h
@@ -63,6 +63,7 @@
 CortexA76,
 CortexA77,
 CortexA78,
+CortexA78C,
 CortexA8,
 CortexA9,
 CortexM3,
Index: llvm/lib/Target/ARM/ARMSubtarget.cpp
===
--- llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -294,6 +294,7 @@
   case CortexA76:
   case CortexA77:
   case CortexA78:
+  case CortexA78C:
   case CortexR4:
   case CortexR4F:
   case CortexR5:
Index: llvm/lib/Target/ARM/ARM.td
===
--- llvm/lib/Target/ARM/ARM.td
+++ llvm/lib/Target/ARM/ARM.td
@@ -616,6 +616,8 @@
"Cortex-A77 ARM processors", []>;
 def ProcA78 : SubtargetFeature<"cortex-a78", "ARMProcFamily", "CortexA78",
"Cortex-A78 ARM processors", []>;
+def ProcA78C: SubtargetFeature<"a78c", "ARMProcFamily", "CortexA78C",
+   "Cortex-A78C ARM processors", []>;
 def ProcX1  : SubtargetFeature<"cortex-x1", "ARMProcFamily", "CortexX1",
"Cortex-X1 ARM processors", []>;
 
@@ -1308,6 +1310,13 @@
  FeatureFullFP16,
  FeatureDotProd]>;
 
+def : ProcNoItin<"cortex-a78c", [ARMv82a,
+ FeatureHWDivThumb,
+ FeatureHWDivARM,
+ FeatureCrypto,
+ FeatureCRC,
+ 

[PATCH] D93022: [ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM

2020-12-24 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 313700.
MarkMurrayARM added a comment.

More review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93022/new/

https://reviews.llvm.org/D93022

Files:
  clang/test/Driver/aarch64-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -308,6 +308,13 @@
  ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
  ARM::AEK_FP16 | ARM::AEK_RAS | ARM::AEK_DOTPROD,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ ARM::AEK_SEC | ARM::AEK_MP |
+ ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
+ ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP |
+ ARM::AEK_CRC | ARM::AEK_RAS |
+ ARM::AEK_FP16 | ARM::AEK_DOTPROD,
+ "8.2-A"),
 ARMCPUTestParams("cortex-a77", "armv8.2-a", "crypto-neon-fp-armv8",
  ARM::AEK_CRC | ARM::AEK_SEC | ARM::AEK_MP |
  ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
@@ -385,7 +392,7 @@
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
  "7-S")), );
 
-static constexpr unsigned NumARMCPUArchs = 91;
+static constexpr unsigned NumARMCPUArchs = 92;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -962,6 +969,14 @@
  AArch64::AEK_DOTPROD | AArch64::AEK_RCPC |
  AArch64::AEK_SSBS,
  "8.2-A"),
+ARMCPUTestParams("cortex-a78c", "armv8.2-a", "crypto-neon-fp-armv8",
+ AArch64::AEK_RAS | AArch64::AEK_CRC |
+ AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+ AArch64::AEK_SIMD | AArch64::AEK_RAS |
+ AArch64::AEK_LSE | AArch64::AEK_RDM |
+ AArch64::AEK_FP16 | AArch64::AEK_DOTPROD |
+ AArch64::AEK_RCPC | AArch64::AEK_SSBS,
+ "8.2-A"),
 ARMCPUTestParams(
 "neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
 AArch64::AEK_RAS | AArch64::AEK_SVE | AArch64::AEK_SSBS |
@@ -1151,7 +1166,7 @@
  AArch64::AEK_LSE | AArch64::AEK_RDM,
  "8.2-A")), );
 
-static constexpr unsigned NumAArch64CPUArchs = 45;
+static constexpr unsigned NumAArch64CPUArchs = 46;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/lib/Target/ARM/ARMSubtarget.h
===
--- llvm/lib/Target/ARM/ARMSubtarget.h
+++ llvm/lib/Target/ARM/ARMSubtarget.h
@@ -63,6 +63,7 @@
 CortexA76,
 CortexA77,
 CortexA78,
+CortexA78C,
 CortexA8,
 CortexA9,
 CortexM3,
Index: llvm/lib/Target/ARM/ARMSubtarget.cpp
===
--- llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -294,6 +294,7 @@
   case CortexA76:
   case CortexA77:
   case CortexA78:
+  case CortexA78C:
   case CortexR4:
   case CortexR4F:
   case CortexR5:
Index: llvm/lib/Target/ARM/ARM.td
===
--- llvm/lib/Target/ARM/ARM.td
+++ llvm/lib/Target/ARM/ARM.td
@@ -616,6 +616,8 @@
"Cortex-A77 ARM processors", []>;
 def ProcA78 : SubtargetFeature<"cortex-a78", "ARMProcFamily", "CortexA78",
"Cortex-A78 ARM processors", []>;
+def ProcA78C: SubtargetFeature<"a78c", "ARMProcFamily", "CortexA78C",
+   "Cortex-A78C ARM processors", []>;
 def ProcX1  : SubtargetFeature<"cortex-x1", "ARMProcFamily", "CortexX1",
"Cortex-X1 ARM processors", []>;
 
@@ -1308,6 +1310,14 @@
  FeatureFullFP16,
  FeatureDotProd]>;
 
+def : ProcNoItin<"cortex-a78c", [ARMv82a, ProcA78C,
+ FeatureHWDivThumb,
+ FeatureHWDivARM,
+  

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-18 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
Herald added subscribers: llvm-commits, cfe-commits, danielkiss, hiraditya, 
kristof.beyls.
Herald added projects: clang, LLVM.
MarkMurrayARM requested review of this revision.

Add support for the Neoverse N2 CPU to the ARM and AArch64 backends.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -328,7 +328,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -1048,7 +1048,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
@@ -3,6 +3,7 @@
 // RUN: llvm-mc -triple arm -mcpu=cortex-a75 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a76 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple arm -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a77 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
Index: llvm/test/MC/AArch64/armv8.5a-ssbs.s
===
--- llvm/test/MC/AArch64/armv8.5a-ssbs.s
+++ llvm/test/MC/AArch64/armv8.5a-ssbs.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=cortex-a76ae < %s | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-e1 < %s  | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n1 < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n2 < %s  | FileCheck %s
 // RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=-ssbs  < %s 2>&1 | FileCheck %s --check-prefix=NOSPECID
 
 mrs x2, SSBS
Index: llvm/test/MC/AArch6

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-18 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 306062.
MarkMurrayARM added a comment.
Herald added a subscriber: dexonsmith.

Address reviewer comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -328,7 +328,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -1048,7 +1048,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
@@ -3,6 +3,7 @@
 // RUN: llvm-mc -triple arm -mcpu=cortex-a75 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a76 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple arm -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a77 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
Index: llvm/test/MC/AArch64/armv8.5a-ssbs.s
===
--- llvm/test/MC/AArch64/armv8.5a-ssbs.s
+++ llvm/test/MC/AArch64/armv8.5a-ssbs.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=cortex-a76ae < %s | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-e1 < %s  | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n1 < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n2 < %s  | FileCheck %s
 // RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=-ssbs  < %s 2>&1 | FileCheck %s --check-prefix=NOSPECID
 
 mrs x2, SSBS
Index: llvm/test/MC/AArch64/armv8.3a-rcpc.s
==

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-18 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 306063.
MarkMurrayARM added a comment.

Address review comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -328,7 +328,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -1048,7 +1048,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
@@ -3,6 +3,7 @@
 // RUN: llvm-mc -triple arm -mcpu=cortex-a75 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a76 -show-encoding < %s | FileCheck %s  --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple arm -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a77 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple arm -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
Index: llvm/test/MC/AArch64/armv8.5a-ssbs.s
===
--- llvm/test/MC/AArch64/armv8.5a-ssbs.s
+++ llvm/test/MC/AArch64/armv8.5a-ssbs.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=cortex-a76ae < %s | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-e1 < %s  | FileCheck %s
 // RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n1 < %s  | FileCheck %s
+// RUN: llvm-mc -triple aarch64 -show-encoding -mcpu=neoverse-n2 < %s  | FileCheck %s
 // RUN: not llvm-mc -triple aarch64 -show-encoding -mattr=-ssbs  < %s 2>&1 | FileCheck %s --check-prefix=NOSPECID
 
 mrs x2, SSBS
Index: llvm/test/MC/AArch64/armv8.3a-rcpc.s

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-18 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

In D91695#2402495 , @tschuett wrote:

> In the first version N2 had FeatureTME. Did you remove that on purpose?

Yes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-18 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

In D91695#2402616 , @tschuett wrote:

> The features still differ from gcc.
> https://github.com/gcc-mirror/gcc/blob/cb1a4876a0e724ca3962ec14dce9e7819fa72ea5/gcc/config/aarch64/aarch64-cores.def#L146
> It has SVE2_Bitperm.

I'm still testing some fixes. :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-19 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 306373.
MarkMurrayARM added a comment.

Address more review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -280,6 +280,12 @@
 ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_FP16 |
 ARM::AEK_RAS | ARM::AEK_DOTPROD,
 "8.2-A"));
+  EXPECT_TRUE(testARMCPU("neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+ ARM::AEK_CRC | ARM::AEK_HWDIVTHUMB | ARM::AEK_HWDIVARM |
+ ARM::AEK_MP | ARM::AEK_SEC | ARM::AEK_VIRT |
+ ARM::AEK_DSP | ARM::AEK_BF16 | ARM::AEK_DOTPROD |
+ ARM::AEK_RAS | ARM::AEK_I8MM | ARM::AEK_SB,
+"8.5-A"));
   EXPECT_TRUE(testARMCPU("neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
@@ -328,7 +334,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -996,6 +1002,15 @@
   AArch64::AEK_PROFILE | AArch64::AEK_RAS | AArch64::AEK_RCPC |
   AArch64::AEK_RDM | AArch64::AEK_SIMD | AArch64::AEK_SSBS,
  "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+ "neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+  AArch64::AEK_SIMD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_SVE | AArch64::AEK_DOTPROD |
+  AArch64::AEK_RCPC | AArch64::AEK_RDM | AArch64::AEK_MTE |
+  AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_SVE2 |
+  AArch64::AEK_SVE2BITPERM | AArch64::AEK_BF16 | AArch64::AEK_I8MM,
+ "8.5-A"));
   EXPECT_TRUE(testAArch64CPU(
   "thunderx2t99", "armv8.1-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_LSE |
@@ -1048,7 +1063,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-19 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 306408.
MarkMurrayARM added a comment.

Address reviewer comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -280,6 +280,12 @@
 ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_FP16 |
 ARM::AEK_RAS | ARM::AEK_DOTPROD,
 "8.2-A"));
+  EXPECT_TRUE(testARMCPU("neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+ ARM::AEK_CRC | ARM::AEK_HWDIVTHUMB | ARM::AEK_HWDIVARM |
+ ARM::AEK_MP | ARM::AEK_SEC | ARM::AEK_VIRT |
+ ARM::AEK_DSP | ARM::AEK_BF16 | ARM::AEK_DOTPROD |
+ ARM::AEK_RAS | ARM::AEK_I8MM | ARM::AEK_SB,
+"8.5-A"));
   EXPECT_TRUE(testARMCPU("neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
@@ -328,7 +334,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -996,6 +1002,15 @@
   AArch64::AEK_PROFILE | AArch64::AEK_RAS | AArch64::AEK_RCPC |
   AArch64::AEK_RDM | AArch64::AEK_SIMD | AArch64::AEK_SSBS,
  "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+ "neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+  AArch64::AEK_SIMD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_SVE | AArch64::AEK_DOTPROD |
+  AArch64::AEK_RCPC | AArch64::AEK_RDM | AArch64::AEK_MTE |
+  AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_SVE2 |
+  AArch64::AEK_SVE2BITPERM | AArch64::AEK_BF16 | AArch64::AEK_I8MM,
+ "8.5-A"));
   EXPECT_TRUE(testAArch64CPU(
   "thunderx2t99", "armv8.1-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_LSE |
@@ -1048,7 +1063,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-23 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 307083.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -280,6 +280,12 @@
 ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_FP16 |
 ARM::AEK_RAS | ARM::AEK_DOTPROD,
 "8.2-A"));
+  EXPECT_TRUE(testARMCPU("neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+ ARM::AEK_CRC | ARM::AEK_HWDIVTHUMB | ARM::AEK_HWDIVARM |
+ ARM::AEK_MP | ARM::AEK_SEC | ARM::AEK_VIRT |
+ ARM::AEK_DSP | ARM::AEK_BF16 | ARM::AEK_DOTPROD |
+ ARM::AEK_RAS | ARM::AEK_I8MM | ARM::AEK_SB,
+"8.5-A"));
   EXPECT_TRUE(testARMCPU("neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
@@ -328,7 +334,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -996,6 +1002,15 @@
   AArch64::AEK_PROFILE | AArch64::AEK_RAS | AArch64::AEK_RCPC |
   AArch64::AEK_RDM | AArch64::AEK_SIMD | AArch64::AEK_SSBS,
  "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+ "neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+  AArch64::AEK_SIMD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_SVE | AArch64::AEK_DOTPROD |
+  AArch64::AEK_RCPC | AArch64::AEK_RDM | AArch64::AEK_MTE |
+  AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_SVE2 |
+  AArch64::AEK_SVE2BITPERM | AArch64::AEK_BF16 | AArch64::AEK_I8MM,
+ "8.5-A"));
   EXPECT_TRUE(testAArch64CPU(
   "thunderx2t99", "armv8.1-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_LSE |
@@ -1048,7 +1063,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-ERROR < %t %s
Index: llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
==

[PATCH] D91695: [ARM][AArch64] Adding Neoverse N2 CPU support

2020-11-25 Thread Mark Murray via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2b6691894ab6: [ARM][AArch64] Adding Neoverse N2 CPU support 
(authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91695/new/

https://reviews.llvm.org/D91695

Files:
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/Host.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/AArch64/cpus.ll
  llvm/test/CodeGen/AArch64/neon-dot-product.ll
  llvm/test/CodeGen/AArch64/remat.ll
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.3a-rcpc.s
  llvm/test/MC/AArch64/armv8.5a-ssbs.s
  llvm/test/MC/ARM/armv8.2a-dotprod-a32.s
  llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -280,6 +280,12 @@
 ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_FP16 |
 ARM::AEK_RAS | ARM::AEK_DOTPROD,
 "8.2-A"));
+  EXPECT_TRUE(testARMCPU("neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+ ARM::AEK_CRC | ARM::AEK_HWDIVTHUMB | ARM::AEK_HWDIVARM |
+ ARM::AEK_MP | ARM::AEK_SEC | ARM::AEK_VIRT |
+ ARM::AEK_DSP | ARM::AEK_BF16 | ARM::AEK_DOTPROD |
+ ARM::AEK_RAS | ARM::AEK_I8MM | ARM::AEK_SB,
+"8.5-A"));
   EXPECT_TRUE(testARMCPU("neoverse-v1", "armv8.4-a", "crypto-neon-fp-armv8",
  ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT |
  ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB |
@@ -328,7 +334,7 @@
  "7-S"));
 }
 
-static constexpr unsigned NumARMCPUArchs = 90;
+static constexpr unsigned NumARMCPUArchs = 91;
 
 TEST(TargetParserTest, testARMCPUArchList) {
   SmallVector List;
@@ -996,6 +1002,15 @@
   AArch64::AEK_PROFILE | AArch64::AEK_RAS | AArch64::AEK_RCPC |
   AArch64::AEK_RDM | AArch64::AEK_SIMD | AArch64::AEK_SSBS,
  "8.2-A"));
+  EXPECT_TRUE(testAArch64CPU(
+ "neoverse-n2", "armv8.5-a", "crypto-neon-fp-armv8",
+  AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_FP |
+  AArch64::AEK_SIMD | AArch64::AEK_FP16 | AArch64::AEK_RAS |
+  AArch64::AEK_LSE | AArch64::AEK_SVE | AArch64::AEK_DOTPROD |
+  AArch64::AEK_RCPC | AArch64::AEK_RDM | AArch64::AEK_MTE |
+  AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_SVE2 |
+  AArch64::AEK_SVE2BITPERM | AArch64::AEK_BF16 | AArch64::AEK_I8MM,
+ "8.5-A"));
   EXPECT_TRUE(testAArch64CPU(
   "thunderx2t99", "armv8.1-a", "crypto-neon-fp-armv8",
   AArch64::AEK_CRC | AArch64::AEK_CRYPTO | AArch64::AEK_LSE |
@@ -1048,7 +1063,7 @@
   "8.2-A"));
 }
 
-static constexpr unsigned NumAArch64CPUArchs = 44;
+static constexpr unsigned NumAArch64CPUArchs = 45;
 
 TEST(TargetParserTest, testAArch64CPUArchList) {
   SmallVector List;
Index: llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
@@ -9,6 +9,7 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=cortex-x1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-e1 --disassemble < %s | FileCheck %s
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n1 --disassemble < %s | FileCheck %s
+# RUN: llvm-mc -triple aarch64-none-linux-gnu -mcpu=neoverse-n2 --disassemble < %s | FileCheck %s
 
 # CHECK: ldaprb w0, [x0]
 # CHECK: ldaprh w0, [x0]
Index: llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
===
--- llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
+++ llvm/test/MC/ARM/armv8.2a-dotprod-t32.s
@@ -6,6 +6,7 @@
 // RUN: llvm-mc -triple thumb -mcpu=cortex-a78 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=cortex-x1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 // RUN: llvm-mc -triple thumb -mcpu=neoverse-n1 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
+// RUN: llvm-mc -triple thumb -mcpu=neoverse-n2 -show-encoding < %s | FileCheck %s --check-prefix=CHECK
 
 // RUN: not llvm-mc -triple thumb -mattr=-dotprod -show-encoding < %s 2> %t
 // RUN: 

[PATCH] D74838: [ARM,MVE] Fix predicate types of some intrinsics

2020-02-19 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

Nice catch!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74838/new/

https://reviews.llvm.org/D74838



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74845: [ARM,MVE] Add vqdmull[b,t]q intrinsic families

2020-02-19 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74845/new/

https://reviews.llvm.org/D74845



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75165: [ARM,MVE] Add predicated intrinsics for many unary functions.

2020-02-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75165/new/

https://reviews.llvm.org/D75165



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70545: [ARM][MVE][Intrinsics] Add MVE VABD intrinsics.

2019-11-21 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add MVE VABD intrinsics.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70545

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vabdq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vabdq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vabdq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>%a, <4 x i32>%b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>, <4 x i32>)
+
+define arm_aapcs_vfpcc <4 x float> @test_vabdq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vabdq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.f32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>%a, <4 x float>%b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>, <4 x float>)
+
+define arm_aapcs_vfpcc <16 x i8> @test_vabdq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vabdq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -1664,8 +1664,9 @@
 }
 
 
-class MVE_VABD_int size, list pattern=[]>
-  : MVE_int<"vabd", suffix, size, pattern> {
+class MVE_VABD_int size,
+ list pattern=[]>
+  : MVE_int {
 
   let Inst{28} = U;
   let Inst{25-23} = 0b110;
@@ -1676,12 +1677,35 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDs8  : MVE_VABD_int<"s8", 0b0, 0b00>;
-def MVE_VABDs16 : MVE_VABD_int<"s16", 0b0, 0b01>;
-def MVE_VABDs32 : MVE_VABD_int<"s32", 0b0, 0b10>;
-def MVE_VABDu8  : MVE_VABD_int<"u8", 0b1, 0b00>;
-def MVE_VABDu16 : MVE_VABD_int<"u16", 0b1, 0b01>;
-def MVE_VABDu32 : MVE_VABD_int<"u32", 0b1, 0b10>;
+multiclass MVE_VABD_m {
+  def "" : MVE_VABD_int;
+
+  let Predicates = [HasMVEInt] in {
+// Unpredicated absolute difference
+def : Pat<(VTI.Vec (unpred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn)))>;
+
+// Predicated absolute difference
+def : Pat<(VTI.Vec (pred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(VTI.Pred VCCR:$mask), (VTI.Vec MQPR:$inactive))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(i32 1), (VTI.Pred VCCR:$mask),
+(VTI.Vec MQPR:$inactive)))>;
+  }
+}
+
+multiclass MVE_VABD
+  : MVE_VABD_m<"vabd", VTI, int_arm_mve_vabd, int_arm_mve_abd_predicated>;
+
+defm MVE_VABDs8  : MVE_VABD;
+defm MVE_VABDs16 : MVE_VABD;
+defm MVE_VABDs32 : MVE_VABD;
+defm MVE_VABDu8  : MVE_VABD;
+defm MVE_VABDu16 : MVE_VABD;
+defm MVE_VABDu32 : MVE_VABD;
 

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-21 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #

[PATCH] D70546: [ARM][MVE][Intrinsics] Add MVE VMUL intrinsics.

2019-11-21 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add MVE VMUL intrinsics.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70546

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
  llvm/unittests/Target/ARM/MachineInstrTest.cpp

Index: llvm/unittests/Target/ARM/MachineInstrTest.cpp
===
--- llvm/unittests/Target/ARM/MachineInstrTest.cpp
+++ llvm/unittests/Target/ARM/MachineInstrTest.cpp
@@ -250,9 +250,9 @@
 case MVE_VMUL_qr_i8:
 case MVE_VMULf16:
 case MVE_VMULf32:
-case MVE_VMULt1i16:
-case MVE_VMULt1i8:
-case MVE_VMULt1i32:
+case MVE_VMULi16:
+case MVE_VMULi8:
+case MVE_VMULi32:
 case MVE_VMVN:
 case MVE_VMVNimmi16:
 case MVE_VMVNimmi32:
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vmulq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.i32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = mul <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vmulq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vmulq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.f32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = fmul <4 x float> %b, %a
+  ret <4 x float> %0
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.i8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vmulq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
===
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
@@ -216,7 +216,7 @@
   ; CHECK:   renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
   ; CHECK:   renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
   ; CHECK:   renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-  ; CHECK:   renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+  ; CHECK:   renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
   ; CHECK:   MVE_VSTRBU8 killed renamable $q0, killed renamable $r4, 0, 0, killed $noreg :: (store 16 into %ir.scevgep1, align 1)
   ; CHECK:   $lr = MVE_LETP renamable $lr, %bb.2
   ; CHECK: bb.3.for.cond.cleanup:
@@ -257,7 +257,7 @@
 renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
 renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
 renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0

[PATCH] D70545: [ARM][MVE][Intrinsics] Add MVE VABD intrinsics.

2019-11-25 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 230890.
MarkMurrayARM added a comment.

Merge all VABD intrinis types under T.Usual instead of doing the floats
separately.

Add more tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70545/new/

https://reviews.llvm.org/D70545

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vabdq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vabdq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vabdq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>%a, <4 x i32>%b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>, <4 x i32>)
+
+define arm_aapcs_vfpcc <4 x float> @test_vabdq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vabdq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.f32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>%a, <4 x float>%b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>, <4 x float>)
+
+define arm_aapcs_vfpcc <16 x i8> @test_vabdq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vabdq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -1664,8 +1664,9 @@
 }
 
 
-class MVE_VABD_int size, list pattern=[]>
-  : MVE_int<"vabd", suffix, size, pattern> {
+class MVE_VABD_int size,
+ list pattern=[]>
+  : MVE_int {
 
   let Inst{28} = U;
   let Inst{25-23} = 0b110;
@@ -1676,12 +1677,35 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDs8  : MVE_VABD_int<"s8", 0b0, 0b00>;
-def MVE_VABDs16 : MVE_VABD_int<"s16", 0b0, 0b01>;
-def MVE_VABDs32 : MVE_VABD_int<"s32", 0b0, 0b10>;
-def MVE_VABDu8  : MVE_VABD_int<"u8", 0b1, 0b00>;
-def MVE_VABDu16 : MVE_VABD_int<"u16", 0b1, 0b01>;
-def MVE_VABDu32 : MVE_VABD_int<"u32", 0b1, 0b10>;
+multiclass MVE_VABD_m {
+  def "" : MVE_VABD_int;
+
+  let Predicates = [HasMVEInt] in {
+// Unpredicated absolute difference
+def : Pat<(VTI.Vec (unpred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn)))>;
+
+// Predicated absolute difference
+def : Pat<(VTI.Vec (pred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(VTI.Pred VCCR:$mask), (VTI.Vec MQPR:$inactive))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(i32 1), (VTI.Pred VCCR:$mask),
+(VTI.Vec MQPR:$inactive)))>;
+  }
+}
+
+multiclass MVE_VABD
+  : MVE_VABD_m<"vabd", VTI, int_arm_mve_vabd, int_arm_mve_abd_predicated>;
+
+defm MVE_VABDs8  : MVE_VABD;
+defm MVE_VABDs16 : MVE_VABD;
+defm MVE_VABDs32 : MVE_VABD;
+defm MVE_VABDu8  : MVE_VABD;
+defm MVE_VABDu16 : MVE_VABD;
+defm MVE_VABDu32 : MVE_

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-25 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 230891.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>)

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-25 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 230895.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>)

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-25 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 230896.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>)

[PATCH] D70546: [ARM][MVE][Intrinsics] Add MVE VMUL intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231035.
MarkMurrayARM added a comment.

Rebase and reupload patches.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70546/new/

https://reviews.llvm.org/D70546

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
  llvm/unittests/Target/ARM/MachineInstrTest.cpp

Index: llvm/unittests/Target/ARM/MachineInstrTest.cpp
===
--- llvm/unittests/Target/ARM/MachineInstrTest.cpp
+++ llvm/unittests/Target/ARM/MachineInstrTest.cpp
@@ -250,9 +250,9 @@
 case MVE_VMUL_qr_i8:
 case MVE_VMULf16:
 case MVE_VMULf32:
-case MVE_VMULt1i16:
-case MVE_VMULt1i8:
-case MVE_VMULt1i32:
+case MVE_VMULi16:
+case MVE_VMULi8:
+case MVE_VMULi32:
 case MVE_VMVN:
 case MVE_VMVNimmi16:
 case MVE_VMVNimmi32:
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vmulq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.i32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = mul <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vmulq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vmulq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.f32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = fmul <4 x float> %b, %a
+  ret <4 x float> %0
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.i8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vmulq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
===
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
@@ -211,7 +211,7 @@
   ; CHECK:   renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
   ; CHECK:   renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
   ; CHECK:   renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-  ; CHECK:   renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+  ; CHECK:   renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
   ; CHECK:   MVE_VSTRBU8 killed renamable $q0, killed renamable $r4, 0, 0, killed $noreg :: (store 16 into %ir.scevgep1, align 1)
   ; CHECK:   $lr = MVE_LETP renamable $lr, %bb.2
   ; CHECK: bb.3.for.cond.cleanup:
@@ -252,7 +252,7 @@
 renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
 renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
 renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
 MVE_VPST 8, implicit $vpr

[PATCH] D70545: [ARM][MVE][Intrinsics] Add MVE VABD intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231034.
MarkMurrayARM added a comment.

Rebase and reupload patches.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70545/new/

https://reviews.llvm.org/D70545

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vabdq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vabdq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vabdq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>%a, <4 x i32>%b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>, <4 x i32>)
+
+define arm_aapcs_vfpcc <4 x float> @test_vabdq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vabdq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.f32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>%a, <4 x float>%b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>, <4 x float>)
+
+define arm_aapcs_vfpcc <16 x i8> @test_vabdq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vabdq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -1664,8 +1664,9 @@
 }
 
 
-class MVE_VABD_int size, list pattern=[]>
-  : MVE_int<"vabd", suffix, size, pattern> {
+class MVE_VABD_int size,
+ list pattern=[]>
+  : MVE_int {
 
   let Inst{28} = U;
   let Inst{25-23} = 0b110;
@@ -1676,12 +1677,35 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDs8  : MVE_VABD_int<"s8", 0b0, 0b00>;
-def MVE_VABDs16 : MVE_VABD_int<"s16", 0b0, 0b01>;
-def MVE_VABDs32 : MVE_VABD_int<"s32", 0b0, 0b10>;
-def MVE_VABDu8  : MVE_VABD_int<"u8", 0b1, 0b00>;
-def MVE_VABDu16 : MVE_VABD_int<"u16", 0b1, 0b01>;
-def MVE_VABDu32 : MVE_VABD_int<"u32", 0b1, 0b10>;
+multiclass MVE_VABD_m {
+  def "" : MVE_VABD_int;
+
+  let Predicates = [HasMVEInt] in {
+// Unpredicated absolute difference
+def : Pat<(VTI.Vec (unpred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn)))>;
+
+// Predicated absolute difference
+def : Pat<(VTI.Vec (pred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(VTI.Pred VCCR:$mask), (VTI.Vec MQPR:$inactive))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(i32 1), (VTI.Pred VCCR:$mask),
+(VTI.Vec MQPR:$inactive)))>;
+  }
+}
+
+multiclass MVE_VABD
+  : MVE_VABD_m<"vabd", VTI, int_arm_mve_vabd, int_arm_mve_abd_predicated>;
+
+defm MVE_VABDs8  : MVE_VABD;
+defm MVE_VABDs16 : MVE_VABD;
+defm MVE_VABDs32 : MVE_VABD;
+defm MVE_VABDu8  : MVE_VABD;
+defm MVE_VABDu16 : MVE_VABD;
+defm MVE_VABDu32 : MVE_VABD;
 
 class MVE_VRHADD size, list pattern=[]>
   : MVE_int<"vrhadd", 

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231037.
MarkMurrayARM added a comment.

Rebase and reupload patches.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>,

[PATCH] D70545: [ARM][MVE][Intrinsics] Add MVE VABD intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231055.
MarkMurrayARM added a comment.

Respond to review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70545/new/

https://reviews.llvm.org/D70545

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vabdq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vabdq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vabdq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>%a, <4 x i32>%b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>, <4 x i32>)
+
+define arm_aapcs_vfpcc <4 x float> @test_vabdq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vabdq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.f32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>%a, <4 x float>%b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>, <4 x float>)
+
+define arm_aapcs_vfpcc <16 x i8> @test_vabdq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vabdq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -1664,7 +1664,8 @@
 }
 
 
-class MVE_VABD_int size, list pattern=[]>
+class MVE_VABD_int size,
+ list pattern=[]>
   : MVE_int<"vabd", suffix, size, pattern> {
 
   let Inst{28} = U;
@@ -1676,12 +1677,35 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDs8  : MVE_VABD_int<"s8", 0b0, 0b00>;
-def MVE_VABDs16 : MVE_VABD_int<"s16", 0b0, 0b01>;
-def MVE_VABDs32 : MVE_VABD_int<"s32", 0b0, 0b10>;
-def MVE_VABDu8  : MVE_VABD_int<"u8", 0b1, 0b00>;
-def MVE_VABDu16 : MVE_VABD_int<"u16", 0b1, 0b01>;
-def MVE_VABDu32 : MVE_VABD_int<"u32", 0b1, 0b10>;
+multiclass MVE_VABD_m {
+  def "" : MVE_VABD_int;
+
+  let Predicates = [HasMVEInt] in {
+// Unpredicated absolute difference
+def : Pat<(VTI.Vec (unpred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn)))>;
+
+// Predicated absolute difference
+def : Pat<(VTI.Vec (pred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(VTI.Pred VCCR:$mask), (VTI.Vec MQPR:$inactive))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(i32 1), (VTI.Pred VCCR:$mask),
+(VTI.Vec MQPR:$inactive)))>;
+  }
+}
+
+multiclass MVE_VABD
+  : MVE_VABD_m;
+
+defm MVE_VABDs8  : MVE_VABD;
+defm MVE_VABDs16 : MVE_VABD;
+defm MVE_VABDs32 : MVE_VABD;
+defm MVE_VABDu8  : MVE_VABD;
+defm MVE_VABDu16 : MVE_VABD;
+defm MVE_VABDu32 : MVE_VABD;
 
 class MVE_VRHADD size, list pattern=[]>
   : MVE_int<"vrhadd", suffix, size, pattern> {
@@ -2950,8 +2974,28 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDf

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231057.
MarkMurrayARM added a comment.

Respond to review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x i32> %5 to <8 x half>
+  ret <8 x half> %6
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, 

[PATCH] D70546: [ARM][MVE][Intrinsics] Add MVE VMUL intrinsics.

2019-11-26 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231056.
MarkMurrayARM added a comment.

Respond to review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70546/new/

https://reviews.llvm.org/D70546

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
  llvm/unittests/Target/ARM/MachineInstrTest.cpp

Index: llvm/unittests/Target/ARM/MachineInstrTest.cpp
===
--- llvm/unittests/Target/ARM/MachineInstrTest.cpp
+++ llvm/unittests/Target/ARM/MachineInstrTest.cpp
@@ -250,9 +250,9 @@
 case MVE_VMUL_qr_i8:
 case MVE_VMULf16:
 case MVE_VMULf32:
-case MVE_VMULt1i16:
-case MVE_VMULt1i8:
-case MVE_VMULt1i32:
+case MVE_VMULi16:
+case MVE_VMULi8:
+case MVE_VMULi32:
 case MVE_VMVN:
 case MVE_VMVNimmi16:
 case MVE_VMVNimmi32:
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vmulq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.i32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = mul <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vmulq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vmulq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.f32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = fmul <4 x float> %b, %a
+  ret <4 x float> %0
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.i8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vmulq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
===
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
@@ -211,7 +211,7 @@
   ; CHECK:   renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
   ; CHECK:   renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
   ; CHECK:   renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-  ; CHECK:   renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+  ; CHECK:   renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
   ; CHECK:   MVE_VSTRBU8 killed renamable $q0, killed renamable $r4, 0, 0, killed $noreg :: (store 16 into %ir.scevgep1, align 1)
   ; CHECK:   $lr = MVE_LETP renamable $lr, %bb.2
   ; CHECK: bb.3.for.cond.cleanup:
@@ -252,7 +252,7 @@
 renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
 renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
 renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
 MVE_VPST 8, implicit $vpr
 

[PATCH] D70545: [ARM][MVE][Intrinsics] Add MVE VABD intrinsics.

2019-11-27 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf4bba07b87ce: [ARM][MVE][Intrinsics] Add MVE VABD 
intrinsics. Add unit tests. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70545/new/

https://reviews.llvm.org/D70545

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vabdq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vabdq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vabdq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vabdq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>%a, <4 x i32>%b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vabd.v4i32(<4 x i32>, <4 x i32>)
+
+define arm_aapcs_vfpcc <4 x float> @test_vabdq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vabdq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vabd.f32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>%a, <4 x float>%b)
+  ret <4 x float> %0
+}
+
+declare <4 x float> @llvm.arm.mve.vabd.v4f32(<4 x float>, <4 x float>)
+
+define arm_aapcs_vfpcc <16 x i8> @test_vabdq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.abd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vabdq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vabdq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vabdt.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.abd.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -1664,7 +1664,8 @@
 }
 
 
-class MVE_VABD_int size, list pattern=[]>
+class MVE_VABD_int size,
+ list pattern=[]>
   : MVE_int<"vabd", suffix, size, pattern> {
 
   let Inst{28} = U;
@@ -1676,12 +1677,35 @@
   let validForTailPredication = 1;
 }
 
-def MVE_VABDs8  : MVE_VABD_int<"s8", 0b0, 0b00>;
-def MVE_VABDs16 : MVE_VABD_int<"s16", 0b0, 0b01>;
-def MVE_VABDs32 : MVE_VABD_int<"s32", 0b0, 0b10>;
-def MVE_VABDu8  : MVE_VABD_int<"u8", 0b1, 0b00>;
-def MVE_VABDu16 : MVE_VABD_int<"u16", 0b1, 0b01>;
-def MVE_VABDu32 : MVE_VABD_int<"u32", 0b1, 0b10>;
+multiclass MVE_VABD_m {
+  def "" : MVE_VABD_int;
+
+  let Predicates = [HasMVEInt] in {
+// Unpredicated absolute difference
+def : Pat<(VTI.Vec (unpred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn)))>;
+
+// Predicated absolute difference
+def : Pat<(VTI.Vec (pred_int (VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(VTI.Pred VCCR:$mask), (VTI.Vec MQPR:$inactive))),
+  (VTI.Vec (!cast(NAME)
+(VTI.Vec MQPR:$Qm), (VTI.Vec MQPR:$Qn),
+(i32 1), (VTI.Pred VCCR:$mask),
+(VTI.Vec MQPR:$inactive)))>;
+  }
+}
+
+multiclass MVE_VABD
+  : MVE_VABD_m;
+
+defm MVE_VABDs8  : MVE_VABD;
+defm MVE_VABDs16 : MVE_VABD;
+defm MVE_VABDs32 : MVE_VABD;
+defm MVE_VABDu8  : MVE_VABD;
+defm MVE_VABDu16 : MVE_VABD;
+defm MVE_VABDu32 : MVE_VABD;
 
 class MVE_VRHADD size, list pattern=[]>
   : MVE_int<"vrhadd", suffix, size, p

[PATCH] D70546: [ARM][MVE][Intrinsics] Add MVE VMUL intrinsics.

2019-11-27 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe8a8dbe9c458: [ARM][MVE][Intrinsics] Add MVE VMUL 
intrinsics. Remove annoying "t1" from VMUL*… (authored by 
MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70546/new/

https://reviews.llvm.org/D70546

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
  llvm/unittests/Target/ARM/MachineInstrTest.cpp

Index: llvm/unittests/Target/ARM/MachineInstrTest.cpp
===
--- llvm/unittests/Target/ARM/MachineInstrTest.cpp
+++ llvm/unittests/Target/ARM/MachineInstrTest.cpp
@@ -250,9 +250,9 @@
 case MVE_VMUL_qr_i8:
 case MVE_VMULf16:
 case MVE_VMULf32:
-case MVE_VMULt1i16:
-case MVE_VMULt1i8:
-case MVE_VMULt1i32:
+case MVE_VMULi16:
+case MVE_VMULi8:
+case MVE_VMULi32:
 case MVE_VMVN:
 case MVE_VMVNimmi16:
 case MVE_VMVNimmi32:
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulq.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulq_u32(<4 x i32> %a, <4 x i32> %b) {
+; CHECK-LABEL: test_vmulq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.i32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = mul <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vmulq_f32(<4 x float> %a, <4 x float> %b) {
+; CHECK-LABEL: test_vmulq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmul.f32 q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = fmul <4 x float> %b, %a
+  ret <4 x float> %0
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.i8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32)
+
+declare <16 x i8> @llvm.arm.mve.mul.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>)
+
+define arm_aapcs_vfpcc <8 x half> @test_vmulq_m_f16(<8 x half> %inactive, <8 x half> %a, <8 x half> %b, i16 zeroext %p) {
+; CHECK-LABEL: test_vmulq_m_f16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmult.f16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half> %a, <8 x half> %b, <8 x i1> %1, <8 x half> %inactive)
+  ret <8 x half> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)
+
+declare <8 x half> @llvm.arm.mve.mul.predicated.v8f16.v8i1(<8 x half>, <8 x half>, <8 x i1>, <8 x half>)
Index: llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
===
--- llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
+++ llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
@@ -211,7 +211,7 @@
   ; CHECK:   renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
   ; CHECK:   renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
   ; CHECK:   renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-  ; CHECK:   renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+  ; CHECK:   renamable $q0 = MVE_VMULi8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
   ; CHECK:   MVE_VSTRBU8 killed renamable $q0, killed renamable $r4, 0, 0, killed $noreg :: (store 16 into %ir.scevgep1, align 1)
   ; CHECK:   $lr = MVE_LETP renamable $lr, %bb.2
   ; CHECK: bb.3.for.cond.cleanup:
@@ -252,7 +252,7 @@
 renamable $r4 = t2ADDrr renamable $r0, renamable $r12, 14, $noreg, $noreg
 renamable $r12 = t2ADDri killed renamable $r12, 16, 14, $noreg, $noreg
 renamable $r3, dead $cpsr = tSUBi8 killed renamable $r3, 16, 14, $noreg
-renamable $q0 = MVE_VMULt1i8 killed renamable $q1, killed renamable $q0, 0, $noreg, undef renamable $q0
+renamable $q0 = MVE_VMULi8 k

[PATCH] D70547: [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics.

2019-11-27 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa048bf87fb65: [ARM][MVE][Intrinsics] Add MVE 
VAND/VORR/VORN/VEOR/VBIC intrinsics. Add unit… (authored by MarkMurrayARM).

Changed prior to commit:
  https://reviews.llvm.org/D70547?vs=231057&id=231272#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70547/new/

https://reviews.llvm.org/D70547

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vandq.c
  clang/test/CodeGen/arm-mve-intrinsics/vbicq.c
  clang/test/CodeGen/arm-mve-intrinsics/veorq.c
  clang/test/CodeGen/arm-mve-intrinsics/vornq.c
  clang/test/CodeGen/arm-mve-intrinsics/vorrq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vandq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vbicq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/veorq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vornq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vorrq.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <16 x i8> %b, %a
+  ret <16 x i8> %0
+}
+
+define arm_aapcs_vfpcc <4 x i32> @test_vorrq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <4 x i32> %b, %a
+  ret <4 x i32> %0
+}
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = or <8 x i16> %b, %a
+  ret <8 x i16> %0
+}
+
+define arm_aapcs_vfpcc <4 x float> @test_vorrq_f32(<4 x float> %a, <4 x float> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vorrq_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vorr q0, q1, q0
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = or <4 x i32> %1, %0
+  %3 = bitcast <4 x i32> %2 to <4 x float>
+  ret <4 x float> %3
+}
+
+define arm_aapcs_vfpcc <16 x i8> @test_vorrq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.orr.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define arm_aapcs_vfpcc <8 x i16> @test_vorrq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.orr.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+; Function Attrs: nounwind readnone
+define arm_aapcs_vfpcc <8 x half> @test_vorrq_m_f32(<4 x float> %inactive, <4 x float> %a, <4 x float> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vorrq_m_f32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vorrt q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = bitcast <4 x float> %a to <4 x i32>
+  %1 = bitcast <4 x float> %b to <4 x i32>
+  %2 = zext i16 %p to i32
+  %3 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %2)
+  %4 = bitcast <4 x float> %inactive to <4 x i32>
+  %5 = tail call <4 x i32> @llvm.arm.mve.orr.predicated.v4i32.v4i1(<4 x i32> %0, <4 x i32> %1, <4 x i1> %3, <4 x i32> %4)
+  %6 = bitcast <4 x

[PATCH] D70829: [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.

2019-11-28 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
Herald added subscribers: llvm-commits, cfe-commits, dmgreen, hiraditya, 
kristof.beyls.
Herald added projects: clang, LLVM.

Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics and their predicated versions. Add 
unit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70829

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
@@ -0,0 +1,89 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <16 x i8> %a, %b
+  %1 = select <16 x i1> %0, <16 x i8> %b, <16 x i8> %a
+  ret <16 x i8> %1
+}
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp sgt <8 x i16> %a, %b
+  %1 = select <8 x i1> %0, <8 x i16> %b, <8 x i16> %a
+  ret <8 x i16> %1
+}
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <4 x i32> %a, %b
+  %1 = select <4 x i1> %0, <4 x i32> %b, <4 x i32> %a
+  ret <4 x i32> %1
+}
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s 

[PATCH] D70829: [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.

2019-11-29 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231540.
MarkMurrayARM added a comment.

Fix non-predicated floats.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70829/new/

https://reviews.llvm.org/D70829

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
@@ -0,0 +1,89 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <16 x i8> %a, %b
+  %1 = select <16 x i1> %0, <16 x i8> %b, <16 x i8> %a
+  ret <16 x i8> %1
+}
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp sgt <8 x i16> %a, %b
+  %1 = select <8 x i1> %0, <8 x i16> %b, <8 x i16> %a
+  ret <8 x i16> %1
+}
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <4 x i32> %a, %b
+  %1 = select <4 x i1> %0, <4 x i32> %b, <4 x i32> %a
+  ret <4 x i32> %1
+}
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <8 x half> @test_vminnmq_f

[PATCH] D70829: [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.

2019-12-02 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 231664.
MarkMurrayARM added a comment.

Address whitespace nit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70829/new/

https://reviews.llvm.org/D70829

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
@@ -0,0 +1,89 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <16 x i8> %a, %b
+  %1 = select <16 x i1> %0, <16 x i8> %b, <16 x i8> %a
+  ret <16 x i8> %1
+}
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp sgt <8 x i16> %a, %b
+  %1 = select <8 x i1> %0, <8 x i16> %b, <8 x i16> %a
+  ret <8 x i16> %1
+}
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <4 x i32> %a, %b
+  %1 = select <4 x i1> %0, <4 x i32> %b, <4 x i32> %a
+  ret <4 x i32> %1
+}
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <8 x half> @test_vminnmq_f16(

[PATCH] D70829: [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.

2019-12-02 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG510792a2e0e3: [ARM][MVE][Intrinsics] Add 
VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70829/new/

https://reviews.llvm.org/D70829

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmaxq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminq.ll
@@ -0,0 +1,89 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <16 x i8> %a, %b
+  %1 = select <16 x i1> %0, <16 x i8> %b, <16 x i8> %a
+  ret <16 x i8> %1
+}
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp sgt <8 x i16> %a, %b
+  %1 = select <8 x i1> %0, <8 x i16> %b, <8 x i16> %a
+  ret <8 x i16> %1
+}
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vminq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmin.u32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = icmp ugt <4 x i32> %a, %b
+  %1 = select <4 x i1> %0, <4 x i32> %b, <4 x i32> %a
+  ret <4 x i32> %1
+}
+
+define dso_local arm_aapcs_vfpcc <16 x i8> @test_vminq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #2
+
+declare <16 x i8> @llvm.arm.mve.min.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #2
+
+define dso_local arm_aapcs_vfpcc <8 x i16> @test_vminq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #2
+
+declare <8 x i16> @llvm.arm.mve.min.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #2
+
+define dso_local arm_aapcs_vfpcc <4 x i32> @test_vminq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #1 {
+; CHECK-LABEL: test_vminq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmint.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #2
+
+declare <4 x i32> @llvm.arm.mve.min.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #2
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmq.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machine

[PATCH] D70829: [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.

2019-12-02 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked an inline comment as done.
MarkMurrayARM added a comment.

Nit terminated with extreme prejudice.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70829/new/

https://reviews.llvm.org/D70829



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70948: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-03 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add MVE VMULH/VRMULH intrinsics and unit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70948

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrmulhq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <

[PATCH] D70948: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-04 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked 3 inline comments as done.
MarkMurrayARM added inline comments.



Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:3644
+multiclass MVE_VxMULH_m {

dmgreen wrote:
> Formatting.
I took a guess as to what you wanted here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70948/new/

https://reviews.llvm.org/D70948



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70948: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-04 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232093.
MarkMurrayARM added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70948/new/

https://reviews.llvm.org/D70948

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrmulhq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulhq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL

[PATCH] D70948: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-04 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232094.
MarkMurrayARM marked an inline comment as done.
MarkMurrayARM added a comment.

Fix incorrect argument order error.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70948/new/

https://reviews.llvm.org/D70948

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrmulhq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vmulhq_u8(<16 x i

[PATCH] D70948: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-04 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd3f62ceac0ce: [ARM][MVE][Intrinsics] Add VMULH/VRMULH 
intrinsics. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70948/new/

https://reviews.llvm.org/D70948

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrmulhq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrmulh.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrmulh.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrmulh.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrmulh.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrmulhq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rmulh.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrmulhq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rmulh.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrmulhq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrmulhq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrmulht.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rmulh.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulhq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @te

[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULH/VRMULH intrinsics.

2019-12-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add VMULH/VRMULH intrinsics and unit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71066

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmullbq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmulltq.c
  clang/utils/TableGen/MveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmullbq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8> %a, <16 x i8> %b, i32 1)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8>, <16 x i8>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s32 q2, q0, q1
+; CHECK-NEXT:vmov q0, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32> %a, <4 x i32> %b, i32 1)
+  ret <2 x i64> %0
+}
+
+declare <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32>, <4 x i32>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_poly_p16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_poly_p16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.p16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, i32 1, <16 x i1> %1, <16 x i8> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8>, <16 x i8>, i32, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, i32 1, <8 x i1> %1, <8 x i16> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16>, <8 x i16>, i32, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to

[PATCH] D71062: [ARM][MVE] Add vector reduction intrinsics with two vector operands

2019-12-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added inline comments.



Comment at: clang/include/clang/Basic/arm_mve_defs.td:448
+// for example (u32 x) where x is 0 is transformed into (u32 { 0 }) by the
+// Tablegen parser.
+def V {

See D71066; I just used the numbers bare, so 0 or 1 not (u32 0) or (u32 1).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71062/new/

https://reviews.llvm.org/D71062



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.

2019-12-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232355.
MarkMurrayARM added a comment.

Correct name of intrinsics in commit message. D'oh!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71066/new/

https://reviews.llvm.org/D71066

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmullbq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmulltq.c
  clang/utils/TableGen/MveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmullbq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8> %a, <16 x i8> %b, i32 1)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8>, <16 x i8>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s32 q2, q0, q1
+; CHECK-NEXT:vmov q0, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32> %a, <4 x i32> %b, i32 1)
+  ret <2 x i64> %0
+}
+
+declare <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32>, <4 x i32>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_poly_p16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_poly_p16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.p16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, i32 1, <16 x i1> %1, <16 x i8> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8>, <16 x i8>, i32, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, i32 1, <8 x i1> %1, <8 x i16> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16>, <8 x i16>, i32, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(

[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.

2019-12-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232367.
MarkMurrayARM added a comment.

Address review comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71066/new/

https://reviews.llvm.org/D71066

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmullbq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmulltq.c
  clang/utils/TableGen/MveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmullbq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8> %a, <16 x i8> %b, i32 1)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8>, <16 x i8>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s32 q2, q0, q1
+; CHECK-NEXT:vmov q0, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32> %a, <4 x i32> %b, i32 1)
+  ret <2 x i64> %0
+}
+
+declare <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32>, <4 x i32>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_poly_p16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_poly_p16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.p16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, i32 1, <16 x i1> %1, <16 x i8> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8>, <16 x i8>, i32, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, i32 1, <8 x i1> %1, <8 x i16> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16>, <8 x i16>, i32, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <2

[PATCH] D71065: [ARM][MVE] Add intrinsics for immediate shifts.

2019-12-06 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71065/new/

https://reviews.llvm.org/D71065



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71190: [ARM][MVE] Add complex vector intrinsics

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: llvm/include/llvm/IR/IntrinsicsARM.td:932
+  def "":  Intrinsic;
+  def _predicated: Intrinsic;

I like this!!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71190/new/

https://reviews.llvm.org/D71190



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added reviewers: simon_tatham, ostannard, dmgreen, miyuki.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics and 
unit tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare

[PATCH] D71062: [ARM][MVE] Add vector reduction intrinsics with two vector operands

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

This looks scary and fragile, but it seems to work, so OK.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71062/new/

https://reviews.llvm.org/D71062



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232820.
MarkMurrayARM added a comment.

Fix long lines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71198/new/

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #

[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232827.
MarkMurrayARM added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71198/new/

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 

[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232841.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71066/new/

https://reviews.llvm.org/D71066

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmullbq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmulltq.c
  clang/utils/TableGen/MveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmullbq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8> %a, <16 x i8> %b, i32 1)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8>, <16 x i8>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s32 q2, q0, q1
+; CHECK-NEXT:vmov q0, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32> %a, <4 x i32> %b, i32 1)
+  ret <2 x i64> %0
+}
+
+declare <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32>, <4 x i32>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_poly_p16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_poly_p16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.p16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, i32 1, <16 x i1> %1, <16 x i8> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8>, <16 x i8>, i32, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, i32 1, <8 x i1> %1, <8 x i16> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16>, <8 x i16>, i32, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <2 x i64> @llvm.ar

[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232842.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71198/new/

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32>, <4 x i32>, <4 x i1>, <4 x i32>) #1
Index:

[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM marked 2 inline comments as done.
MarkMurrayARM added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71066/new/

https://reviews.llvm.org/D71066



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 232872.
MarkMurrayARM added a comment.

Address review comment delivered out-of-band.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71198/new/

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32>, <4 

[PATCH] D71066: [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2eb61fa5d685: [ARM][MVE][Intrinsics] Add 
VMULL[BT]Q_(INT|POLY) intrinsics. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71066/new/

https://reviews.llvm.org/D71066

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td
  clang/test/CodeGen/arm-mve-intrinsics/vmullbq.c
  clang/test/CodeGen/arm-mve-intrinsics/vmulltq.c
  clang/utils/TableGen/MveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmullbq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmulltq.ll
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8> %a, <16 x i8> %b, i32 1)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vmull.v8i16.v16i8(<16 x i8>, <16 x i8>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.s32 q2, q0, q1
+; CHECK-NEXT:vmov q0, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32> %a, <4 x i32> %b, i32 1)
+  ret <2 x i64> %0
+}
+
+declare <2 x i64> @llvm.arm.mve.vmull.v2i64.v4i32(<4 x i32>, <4 x i32>, i32) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_poly_p16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_poly_p16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmullt.p16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16> %a, <8 x i16> %b, i32 1)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vmull.poly.v4i32.v8i16(<8 x i16>, <8 x i16>, i32) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vmulltq_int_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, i32 1, <16 x i1> %1, <16 x i8> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.mull.int.predicated.v8i16.v16i8.v16i1(<16 x i8>, <16 x i8>, i32, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vmulltq_int_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, i32 1, <8 x i1> %1, <8 x i16> %inactive)
+  ret <4 x i32> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <4 x i32> @llvm.arm.mve.mull.int.predicated.v4i32.v8i16.v8i1(<8 x i16>, <8 x i16>, i32, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <2 x i64> @test_vmulltq_int_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vmulltq_int_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vmulltt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p t

[PATCH] D71198: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics.

2019-12-09 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGfc3417cb5a9d: [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, 
VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ… (authored by MarkMurrayARM).

Changed prior to commit:
  https://reviews.llvm.org/D71198?vs=232872&id=232880#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71198/new/

https://reviews.llvm.org/D71198

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vhaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vhsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqaddq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqrdmulhq.c
  clang/test/CodeGen/arm-mve-intrinsics/vqsubq.c
  clang/test/CodeGen/arm-mve-intrinsics/vrhaddq.c
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vhsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqaddq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqrdmulhq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vqsubq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
===
--- /dev/null
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vrhaddq.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - %s | FileCheck %s
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_u8(<16 x i8> %a, <16 x i8> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s8 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
+  ret <16 x i8> %0
+}
+
+declare <16 x i8> @llvm.arm.mve.vrhadd.v16i8(<16 x i8>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_s16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_s16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s16 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16> %a, <8 x i16> %b)
+  ret <8 x i16> %0
+}
+
+declare <8 x i16> @llvm.arm.mve.vrhadd.v8i16(<8 x i16>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_u32(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_u32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vrhadd.s32 q0, q0, q1
+; CHECK-NEXT:bx lr
+entry:
+  %0 = tail call <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32> %a, <4 x i32> %b)
+  ret <4 x i32> %0
+}
+
+declare <4 x i32> @llvm.arm.mve.vrhadd.v4i32(<4 x i32>, <4 x i32>) #1
+
+define arm_aapcs_vfpcc <16 x i8> @test_vrhaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s8:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s8 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32 %0)
+  %2 = tail call <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8> %a, <16 x i8> %b, <16 x i1> %1, <16 x i8> %inactive)
+  ret <16 x i8> %2
+}
+
+declare <16 x i1> @llvm.arm.mve.pred.i2v.v16i1(i32) #1
+
+declare <16 x i8> @llvm.arm.mve.rhadd.predicated.v16i8.v16i1(<16 x i8>, <16 x i8>, <16 x i1>, <16 x i8>) #1
+
+define arm_aapcs_vfpcc <8 x i16> @test_vrhaddq_m_u16(<8 x i16> %inactive, <8 x i16> %a, <8 x i16> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_u16:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s16 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %0)
+  %2 = tail call <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16> %a, <8 x i16> %b, <8 x i1> %1, <8 x i16> %inactive)
+  ret <8 x i16> %2
+}
+
+declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32) #1
+
+declare <8 x i16> @llvm.arm.mve.rhadd.predicated.v8i16.v8i1(<8 x i16>, <8 x i16>, <8 x i1>, <8 x i16>) #1
+
+define arm_aapcs_vfpcc <4 x i32> @test_vrhaddq_m_s32(<4 x i32> %inactive, <4 x i32> %a, <4 x i32> %b, i16 zeroext %p) local_unnamed_addr #0 {
+; CHECK-LABEL: test_vrhaddq_m_s32:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:vmsr p0, r0
+; CHECK-NEXT:vpst
+; CHECK-NEXT:vrhaddt.s32 q0, q1, q2
+; CHECK-NEXT:bx lr
+entry:
+  %0 = zext i16 %p to i32
+  %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0)
+  %2 = tail call <4 x i32> @llvm.arm.mve.rhadd.predicated.v4i32.v4i1(<4 x i32> %a, <4 x i32> %b, <4 x i1> %1, <4 x i32>

[PATCH] D71335: [ARM][MVE] Factor out an IntrinsicMX multiclass.

2019-12-11 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.

*LIKE*


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71335/new/

https://reviews.llvm.org/D71335



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71458: [ARM][MVE] Add intrinsics for more immediate shifts.

2019-12-13 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM




Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:2399
+  foreach pred_int = [int_arm_mve_vshll_imm_predicated] in
+  foreach imm = [inst_imm.immediateType] in {
+

*OUCH* 6 deep :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71458/new/

https://reviews.llvm.org/D71458



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71062: [ARM][MVE] Add vector reduction intrinsics with two vector operands

2019-12-13 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.

Tiny observation of types - fix only if you feel like it.




Comment at: clang/include/clang/Basic/arm_mve.td:845
+
+let params = [s16, s32] in {
+defm vmlaldav : MVEBinaryVectorHoriz64;

Types again?



Comment at: clang/include/clang/Basic/arm_mve.td:854
+
+let params = [s32] in {
+defm vrmlaldavh : MVEBinaryVectorHoriz64R;

Types again?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71062/new/

https://reviews.llvm.org/D71062



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71466: [ARM][MVE][Intrinsics] remove extraneous intrinsics.

2019-12-13 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
MarkMurrayARM added a reviewer: simon_tatham.
Herald added subscribers: cfe-commits, dmgreen, kristof.beyls.
Herald added a project: clang.

I overstepped my reach and generated too many intrinsics; these never
made it into the tests.

Remove these extras. Some needed to be signed-olny, and there were some
possible but unrequired _x variants that needed an extra argument to
IntrinsicMX to allow [de-]selection at compile-time.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71466

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td

Index: clang/include/clang/Basic/arm_mve_defs.td
===
--- clang/include/clang/Basic/arm_mve_defs.td
+++ clang/include/clang/Basic/arm_mve_defs.td
@@ -435,6 +435,7 @@
 // A wrapper to define both _m and _x versions of a predicated
 // intrinsic.
 multiclass IntrinsicMX {
   // The _m variant takes an initial parameter called $inactive, which
@@ -444,15 +445,17 @@
   def "_m" # nameSuffix:
  Intrinsic;
 
-  // The _x variant leaves off that parameter, and simply uses an
-  // undef value of the same type.
-  def "_x" # nameSuffix:
- Intrinsic {
-// Allow overriding of the polymorphic name type, because
-// sometimes the _m and _x variants polymorph differently
-// (typically because the type of the inactive parameter can be
-// used as a disambiguator if it's present).
-let pnt = pnt_x;
+  foreach unusedVar = !if(!eq(wantXVariant, 1), [1], []) in {
+// The _x variant leaves off that parameter, and simply uses an
+// undef value of the same type.
+def "_x" # nameSuffix:
+   Intrinsic {
+  // Allow overriding of the polymorphic name type, because
+  // sometimes the _m and _x variants polymorph differently
+  // (typically because the type of the inactive parameter can be
+  // used as a disambiguator if it's present).
+  let pnt = pnt_x;
+}
   }
 }
 
Index: clang/include/clang/Basic/arm_mve.td
===
--- clang/include/clang/Basic/arm_mve.td
+++ clang/include/clang/Basic/arm_mve.td
@@ -79,10 +79,6 @@
   (IRInt<"vmulh", [Vector]> $a, $b)>;
 def vrmulhq: Intrinsic $a, $b)>;
-def vqdmulhq: Intrinsic $a, $b)>;
-def vqrdmulhq: Intrinsic $a, $b)>;
 def vmullbq_int: Intrinsic
$a, $b, 0)>;
@@ -90,6 +86,12 @@
   (IRInt<"vmull", [DblVector, Vector]>
$a, $b, 1)>;
 }
+let params = T.Signed in {
+def vqdmulhq: Intrinsic $a, $b)>;
+def vqrdmulhq: Intrinsic $a, $b)>;
+}
 
 let params = T.Poly, overrideKindLetter = "p" in {
 def vmullbq_poly: Intrinsic $a, $b)>;
 }
 
-multiclass VectorVectorArithmetic {
+multiclass VectorVectorArithmetic {
   defm "" : IntrinsicMX $a, $b,
- $pred, $inactive)>;
+ $pred, $inactive),
+wantXVariant>;
 }
 
 multiclass VectorVectorArithmeticBitcast {
@@ -179,16 +182,18 @@
   defm vmaxq : VectorVectorArithmetic<"max_predicated">;
   defm vmulhq : VectorVectorArithmetic<"mulh_predicated">;
   defm vrmulhq : VectorVectorArithmetic<"rmulh_predicated">;
-  defm vqdmulhq : VectorVectorArithmetic<"qdmulh_predicated">;
-  defm vqrdmulhq : VectorVectorArithmetic<"qrdmulh_predicated">;
-  defm vqaddq : VectorVectorArithmetic<"qadd_predicated">;
+  defm vqaddq : VectorVectorArithmetic<"qadd_predicated", 0>;
   defm vhaddq : VectorVectorArithmetic<"hadd_predicated">;
   defm vrhaddq : VectorVectorArithmetic<"rhadd_predicated">;
-  defm vqsubq : VectorVectorArithmetic<"qsub_predicated">;
+  defm vqsubq : VectorVectorArithmetic<"qsub_predicated", 0>;
   defm vhsubq : VectorVectorArithmetic<"hsub_predicated">;
   defm vmullbq_int : DblVectorVectorArithmetic<"mull_int_predicated", (u32 0)>;
   defm vmulltq_int : DblVectorVectorArithmetic<"mull_int_predicated", (u32 1)>;
 }
+let params = T.Signed in {
+  defm vqdmulhq : VectorVectorArithmetic<"qdmulh_predicated", 0>;
+  defm vqrdmulhq : VectorVectorArithmetic<"qrdmulh_predicated", 0>;
+}
 
 let params = T.Poly, overrideKindLetter = "p" in {
   defm vmullbq_poly : DblVectorVectorArithmetic<"mull_poly_predicated", (u32 0)>;
@@ -594,7 +599,7 @@
   defm vshlq: IntrinsicMX
-   $v, $sh, $pred, $inactive), "_n">;
+   $v, $sh, $pred, $inactive), 1, "_n">;
 
   let pnt = PNT_NType in {
 def vshrq_n: Intrinsic
- $v, $sh, (unsignedflag Scalar), $pred, $inactive), "_n">;
+ $v, $sh, (unsignedflag Scalar), $pred, $inactive), 1, "_n">;
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71466: [ARM][MVE][Intrinsics] remove extraneous intrinsics.

2019-12-13 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0eb099273918: [ARM][MVE][Intrinsics] remove extraneous 
intrinsics. (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71466/new/

https://reviews.llvm.org/D71466

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/include/clang/Basic/arm_mve_defs.td

Index: clang/include/clang/Basic/arm_mve_defs.td
===
--- clang/include/clang/Basic/arm_mve_defs.td
+++ clang/include/clang/Basic/arm_mve_defs.td
@@ -440,6 +440,7 @@
 // A wrapper to define both _m and _x versions of a predicated
 // intrinsic.
 multiclass IntrinsicMX {
   // The _m variant takes an initial parameter called $inactive, which
@@ -449,15 +450,17 @@
   def "_m" # nameSuffix:
  Intrinsic;
 
-  // The _x variant leaves off that parameter, and simply uses an
-  // undef value of the same type.
-  def "_x" # nameSuffix:
- Intrinsic {
-// Allow overriding of the polymorphic name type, because
-// sometimes the _m and _x variants polymorph differently
-// (typically because the type of the inactive parameter can be
-// used as a disambiguator if it's present).
-let pnt = pnt_x;
+  foreach unusedVar = !if(!eq(wantXVariant, 1), [1], []) in {
+// The _x variant leaves off that parameter, and simply uses an
+// undef value of the same type.
+def "_x" # nameSuffix:
+   Intrinsic {
+  // Allow overriding of the polymorphic name type, because
+  // sometimes the _m and _x variants polymorph differently
+  // (typically because the type of the inactive parameter can be
+  // used as a disambiguator if it's present).
+  let pnt = pnt_x;
+}
   }
 }
 
Index: clang/include/clang/Basic/arm_mve.td
===
--- clang/include/clang/Basic/arm_mve.td
+++ clang/include/clang/Basic/arm_mve.td
@@ -79,10 +79,6 @@
   (IRInt<"vmulh", [Vector]> $a, $b)>;
 def vrmulhq: Intrinsic $a, $b)>;
-def vqdmulhq: Intrinsic $a, $b)>;
-def vqrdmulhq: Intrinsic $a, $b)>;
 def vmullbq_int: Intrinsic
$a, $b, 0)>;
@@ -90,6 +86,12 @@
   (IRInt<"vmull", [DblVector, Vector]>
$a, $b, 1)>;
 }
+let params = T.Signed in {
+def vqdmulhq: Intrinsic $a, $b)>;
+def vqrdmulhq: Intrinsic $a, $b)>;
+}
 
 let params = T.Poly, overrideKindLetter = "p" in {
 def vmullbq_poly: Intrinsic $a, $b)>;
 }
 
-multiclass VectorVectorArithmetic {
+multiclass VectorVectorArithmetic {
   defm "" : IntrinsicMX $a, $b,
- $pred, $inactive)>;
+ $pred, $inactive),
+wantXVariant>;
 }
 
 multiclass VectorVectorArithmeticBitcast {
@@ -179,16 +182,18 @@
   defm vmaxq : VectorVectorArithmetic<"max_predicated">;
   defm vmulhq : VectorVectorArithmetic<"mulh_predicated">;
   defm vrmulhq : VectorVectorArithmetic<"rmulh_predicated">;
-  defm vqdmulhq : VectorVectorArithmetic<"qdmulh_predicated">;
-  defm vqrdmulhq : VectorVectorArithmetic<"qrdmulh_predicated">;
-  defm vqaddq : VectorVectorArithmetic<"qadd_predicated">;
+  defm vqaddq : VectorVectorArithmetic<"qadd_predicated", 0>;
   defm vhaddq : VectorVectorArithmetic<"hadd_predicated">;
   defm vrhaddq : VectorVectorArithmetic<"rhadd_predicated">;
-  defm vqsubq : VectorVectorArithmetic<"qsub_predicated">;
+  defm vqsubq : VectorVectorArithmetic<"qsub_predicated", 0>;
   defm vhsubq : VectorVectorArithmetic<"hsub_predicated">;
   defm vmullbq_int : DblVectorVectorArithmetic<"mull_int_predicated", (u32 0)>;
   defm vmulltq_int : DblVectorVectorArithmetic<"mull_int_predicated", (u32 1)>;
 }
+let params = T.Signed in {
+  defm vqdmulhq : VectorVectorArithmetic<"qdmulh_predicated", 0>;
+  defm vqrdmulhq : VectorVectorArithmetic<"qrdmulh_predicated", 0>;
+}
 
 let params = T.Poly, overrideKindLetter = "p" in {
   defm vmullbq_poly : DblVectorVectorArithmetic<"mull_poly_predicated", (u32 0)>;
@@ -594,7 +599,7 @@
   defm vshlq: IntrinsicMX
-   $v, $sh, $pred, $inactive), "_n">;
+   $v, $sh, $pred, $inactive), 1, "_n">;
 
   let pnt = PNT_NType in {
 def vshrq_n: Intrinsic
- $v, $sh, (unsignedflag Scalar), $pred, $inactive), "_n">;
+ $v, $sh, (unsignedflag Scalar), $pred, $inactive), 1, "_n">;
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76296: [ARM,CDE] Implement GPR CDE intrinsics

2020-03-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76296/new/

https://reviews.llvm.org/D76296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76299: [ARM,CDE] Implement CDE unpredicated Q-register intrinsics

2020-03-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

I like to use of the macros instead of a huge cross-product of types.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76299/new/

https://reviews.llvm.org/D76299



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76298: [ARM,CDE] Implement CDE S and D-register intrinsics

2020-03-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76298/new/

https://reviews.llvm.org/D76298



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76300: [ARM,CDE] Implement CDE vreinterpret intrinsics

2020-03-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76300/new/

https://reviews.llvm.org/D76300



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D76490: [ARM,MVE] Add ACLE intrinsics for the vminv/vmaxv family.

2020-03-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76490/new/

https://reviews.llvm.org/D76490



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74134: [ARM][MVE] Add fixed point vector conversion intrinsics

2020-02-06 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

This looks familiar and reassuring.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74134/new/

https://reviews.llvm.org/D74134



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74332: [ARM,MVE] Add intrinsics for int <-> float conversion.

2020-02-11 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74332/new/

https://reviews.llvm.org/D74332



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74331: [ARM,MVE] Add intrinsics for abs, neg and not operations.

2020-02-11 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74331/new/

https://reviews.llvm.org/D74331



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72830: [ARM][MVE][Intrinsics] Take abs() of VMINNMAQ, VMAXNMAQ intrinsics' first arguments.

2020-01-20 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 239112.
MarkMurrayARM added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72830/new/

https://reviews.llvm.org/D72830

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vminnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vminnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vmaxnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vmaxnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -3655,7 +3655,8 @@
 
   let Predicates = [HasMVEInt] in {
 // Unpredicated v(max|min)nma
-def : Pat<(VTI.Vec (unpred_op (VTI.Vec MQPR:$Qd), (fabs (VTI.Vec MQPR:$Qm,
+def : Pat<(VTI.Vec (unpred_op (fabs (VTI.Vec MQPR:$Qd)),
+  (fabs (VTI.Vec MQPR:$Qm,
   (VTI.Vec (Inst (VTI.Vec MQPR:$Qd), (VTI.Vec MQPR:$Qm)))>;
 
 // Predicated v(max|min)nma
Index: clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
===
--- clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
+++ clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
@@ -6,9 +6,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f16(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
-// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[A:%.*]], <8 x half> [[TMP0]])
-// CHECK-NEXT:ret <8 x half> [[TMP1]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[A:%.*]])
+// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
+// CHECK-NEXT:[[TMP2:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[TMP0]], <8 x half> [[TMP1]])
+// CHECK-NEXT:ret <8 x half> [[TMP2]]
 //
 float16x8_t test_vminnmaq_f16(float16x8_t a, float16x8_t b)
 {
@@ -21,9 +22,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f32(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> [[B:%.*]])
-// CHECK-NEXT:[[TMP1:%.*]] = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> [[A:%.*]], <4 x f

[PATCH] D72830: [ARM][MVE][Intrinsics] Take abs() of VMINNMAQ, VMAXNMAQ intrinsics' first arguments.

2020-01-20 Thread Mark Murray via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb10a0eb04adf: [ARM][MVE][Intrinsics] Take abs() of VMINNMAQ, 
VMAXNMAQ intrinsics' first… (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72830/new/

https://reviews.llvm.org/D72830

Files:
  clang/include/clang/Basic/arm_mve.td
  clang/test/CodeGen/arm-mve-intrinsics/vmaxnmaq.c
  clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
  llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll

Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vminnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vminnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vminnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.minnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
===
--- llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
+++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmaxnmaq.ll
@@ -7,9 +7,10 @@
 ; CHECK-NEXT:vmaxnma.f16 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
-  %1 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %a, <8 x half> %0)
-  ret <8 x half> %1
+  %0 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %a)
+  %1 = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> %b)
+  %2 = tail call <8 x half> @llvm.maxnum.v8f16(<8 x half> %0, <8 x half> %1)
+  ret <8 x half> %2
 }
 
 declare <8 x half> @llvm.fabs.v8f16(<8 x half>) #1
@@ -22,9 +23,10 @@
 ; CHECK-NEXT:vmaxnma.f32 q0, q1
 ; CHECK-NEXT:bx lr
 entry:
-  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
-  %1 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %0)
-  ret <4 x float> %1
+  %0 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
+  %1 = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> %b)
+  %2 = tail call <4 x float> @llvm.maxnum.v4f32(<4 x float> %0, <4 x float> %1)
+  ret <4 x float> %2
 }
 
 declare <4 x float> @llvm.fabs.v4f32(<4 x float>) #1
Index: llvm/lib/Target/ARM/ARMInstrMVE.td
===
--- llvm/lib/Target/ARM/ARMInstrMVE.td
+++ llvm/lib/Target/ARM/ARMInstrMVE.td
@@ -3655,7 +3655,8 @@
 
   let Predicates = [HasMVEInt] in {
 // Unpredicated v(max|min)nma
-def : Pat<(VTI.Vec (unpred_op (VTI.Vec MQPR:$Qd), (fabs (VTI.Vec MQPR:$Qm,
+def : Pat<(VTI.Vec (unpred_op (fabs (VTI.Vec MQPR:$Qd)),
+  (fabs (VTI.Vec MQPR:$Qm,
   (VTI.Vec (Inst (VTI.Vec MQPR:$Qd), (VTI.Vec MQPR:$Qm)))>;
 
 // Predicated v(max|min)nma
Index: clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
===
--- clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
+++ clang/test/CodeGen/arm-mve-intrinsics/vminnmaq.c
@@ -6,9 +6,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f16(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
-// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[A:%.*]], <8 x half> [[TMP0]])
-// CHECK-NEXT:ret <8 x half> [[TMP1]]
+// CHECK-NEXT:[[TMP0:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[A:%.*]])
+// CHECK-NEXT:[[TMP1:%.*]] = tail call <8 x half> @llvm.fabs.v8f16(<8 x half> [[B:%.*]])
+// CHECK-NEXT:[[TMP2:%.*]] = tail call <8 x half> @llvm.minnum.v8f16(<8 x half> [[TMP0]], <8 x half> [[TMP1]])
+// CHECK-NEXT:ret <8 x half> [[TMP2]]
 //
 float16x8_t test_vminnmaq_f16(float16x8_t a, float16x8_t b)
 {
@@ -21,9 +22,10 @@
 
 // CHECK-LABEL: @test_vminnmaq_f32(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:[[TMP0:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x flo

[PATCH] D73268: [ARM,MVE] Make the MVE intrinsics work in C++!

2020-01-23 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73268/new/

https://reviews.llvm.org/D73268



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D118044: [ARM] Undeprecate complex IT blocks

2022-01-24 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added inline comments.



Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:10977
-Warning(IDLoc, "deprecated instruction in IT block");
-  }
 }

DavidSpickett wrote:
> Is this warning something you'd still want if you opted into restricting IT 
> blocks? Or can we say that because there has/will not be any core produced 
> that follows the deprecation, you would never need the warning.
No - in effect, deprecation is gone entirely. There is nothing to warn about. 
You can ask the compiler not to do it, but if you craft complex IT blocks by 
hand, that is up to you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118044/new/

https://reviews.llvm.org/D118044

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D118044: [ARM] Undeprecate complex IT blocks

2022-01-25 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

In D118044#3266521 , @DavidSpickett 
wrote:

> Please define what "complex" is, in the commit message and (if it doesn't 
> make the description a whole paragraph) the command line help.
>
> Given that we're flipping a default for v8 thumb targets does this warrant a 
> release note?

I've updated the commit message.

I'll look at the release notes next.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118044/new/

https://reviews.llvm.org/D118044

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101606: [ARM] vcreateq lane ordering for big endian

2021-04-30 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

Not sure yet.




Comment at: clang/test/CodeGen/arm-mve-intrinsics/admin.c:66
 // CHECK-NEXT:[[TMP1:%.*]] = insertelement <2 x i64> [[TMP0]], i64 
[[B:%.*]], i64 1
 // CHECK-NEXT:ret <2 x i64> [[TMP1]]
 //

Surely there is a problem here also?



Comment at: clang/test/CodeGen/arm-mve-intrinsics/admin.c:116
 // CHECK-NEXT:[[TMP1:%.*]] = insertelement <2 x i64> [[TMP0]], i64 
[[B:%.*]], i64 1
 // CHECK-NEXT:ret <2 x i64> [[TMP1]]
 //

And a problem here also (with BE)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101606/new/

https://reviews.llvm.org/D101606

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101606: [ARM][MVE] vcreateq lane ordering for big endian

2021-04-30 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added inline comments.



Comment at: clang/test/CodeGen/arm-mve-intrinsics/admin.c:66
 // CHECK-NEXT:[[TMP1:%.*]] = insertelement <2 x i64> [[TMP0]], i64 
[[B:%.*]], i64 1
 // CHECK-NEXT:ret <2 x i64> [[TMP1]]
 //

dmgreen wrote:
> MarkMurrayARM wrote:
> > Surely there is a problem here also?
> I don't see why these would be a problem. Can you elaborate?
I'm wondering if they need to be swapped in the BE case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101606/new/

https://reviews.llvm.org/D101606

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94081: [AArch64] Add +flagm archictecture option, allowing the v8.4a flag modification extension.

2021-01-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
MarkMurrayARM requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94081

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/MC/AArch64/armv8.4a-flag.s
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1408,6 +1408,7 @@
 TEST(TargetParserTest, AArch64ArchExtFeature) {
   const char *ArchExt[][4] = {{"crc", "nocrc", "+crc", "-crc"},
   {"crypto", "nocrypto", "+crypto", "-crypto"},
+  {"flagm", "noflagm", "+flagm", "-flagm"},
   {"fp", "nofp", "+fp-armv8", "-fp-armv8"},
   {"simd", "nosimd", "+neon", "-neon"},
   {"fp16", "nofp16", "+fullfp16", "-fullfp16"},
Index: llvm/test/MC/AArch64/armv8.4a-flag.s
===
--- llvm/test/MC/AArch64/armv8.4a-flag.s
+++ llvm/test/MC/AArch64/armv8.4a-flag.s
@@ -1,13 +1,13 @@
 // RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a %s -o - | \
 // RUN: FileCheck %s
 
-// RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+fmi %s -o - 2>&1 | \
+// RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+flagm %s -o - 2>&1 | \
 // RUN: FileCheck %s
 
 // RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=-v8.4a %s -o - 2>&1 | \
 // RUN: FileCheck %s --check-prefix=ERROR
 
-// RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a,-fmi %s -o - 2>&1 | \
+// RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a,-flagm %s -o - 2>&1 | \
 // RUN: FileCheck %s --check-prefix=ERROR
 
 //--
@@ -30,24 +30,24 @@
 //CHECK-NEXT: rmif x1, #63, #15// encoding: [0x2f,0x84,0x1f,0xba]
 //CHECK-NEXT: rmif xzr, #63, #15   // encoding: [0xef,0x87,0x1f,0xba]
 
-//ERROR:  error: instruction requires: fmi
+//ERROR:  error: instruction requires: flagm
 //ERROR-NEXT: cfinv
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf8 w1
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf8 wzr
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf16 w1
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf16 wzr
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: rmif x1, #63, #15
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: rmif xzr, #63, #15
 //ERROR-NEXT: ^
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -140,7 +140,7 @@
   bool HasSEL2 = false;
   bool HasPMU = false;
   bool HasTLB_RMI = false;
-  bool HasFMI = false;
+  bool HasFlagM = false;
   bool HasRCPC_IMMO = false;
 
   bool HasLSLFast = false;
@@ -513,7 +513,7 @@
   bool hasSEL2() const { return HasSEL2; }
   bool hasPMU() const { return HasPMU; }
   bool hasTLB_RMI() const { return HasTLB_RMI; }
-  bool hasFMI() const { return HasFMI; }
+  bool hasFlagM() const { return HasFlagM; }
   bool hasRCPC_IMMO() const { return HasRCPC_IMMO; }
 
   bool addrSinkUsingGEPs() const override {
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -69,8 +69,8 @@
 def HasTLB_RMI  : Predicate<"Subtarget->hasTLB_RMI()">,
AssemblerPredicate<(all_of FeatureTLB_RMI), "tlb-rmi">;
 
-def HasFMI   : Predicate<"Subtarget->hasFMI()">,
-   AssemblerPredicate<(all_of FeatureFMI), "fmi">;
+def HasFlagM : Predicate<"Subtarget->hasFlagM()">,
+  

[PATCH] D94083: [AArch64] Add +pauth archictecture option, allowing the v8.3a pointer authentication extension.

2021-01-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM created this revision.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
MarkMurrayARM requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94083

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1431,6 +1431,7 @@
   {"rng", "norng", "+rand", "-rand"},
   {"memtag", "nomemtag", "+mte", "-mte"},
   {"tme", "notme", "+tme", "-tme"},
+  {"pauth", "nopauth", "+pauth", "-pauth"},
   {"ssbs", "nossbs", "+ssbs", "-ssbs"},
   {"sb", "nosb", "+sb", "-sb"},
   {"predres", "nopredres", "+predres", "-predres"},
Index: llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
===
--- llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4958,7 +4958,7 @@
 AArch64::GPR64RegClass);
   }
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {MFReturnAddr});
   } else {
 MIRBuilder.buildCopy({Register(AArch64::LR)}, {MFReturnAddr});
@@ -4986,7 +4986,7 @@
 else {
   MFI.setReturnAddressIsTaken(true);
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 Register TmpReg = MRI.createVirtualRegister(&AArch64::GPR64RegClass);
 MIRBuilder.buildInstr(AArch64::LDRXui, {TmpReg}, {FrameAddr}).addImm(1);
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {TmpReg});
Index: llvm/lib/Target/AArch64/AArch64SystemOperands.td
===
--- llvm/lib/Target/AArch64/AArch64SystemOperands.td
+++ llvm/lib/Target/AArch64/AArch64SystemOperands.td
@@ -1336,7 +1336,7 @@
 
 // v8.3a "Pointer authentication extension" registers
 //  Op0Op1 CRn CRmOp2
-let Requires = [{ {AArch64::FeaturePA} }] in {
+let Requires = [{ {AArch64::FeaturePAuth} }] in {
 def : RWSysReg<"APIAKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b000>;
 def : RWSysReg<"APIAKeyHi_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b001>;
 def : RWSysReg<"APIBKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b010>;
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -126,7 +126,7 @@
   bool HasAES = false;
 
   // ARMv8.3 extensions
-  bool HasPA = false;
+  bool HasPAuth = false;
   bool HasJS = false;
   bool HasCCIDX = false;
   bool HasComplxNum = false;
@@ -186,6 +186,7 @@
   bool HasETE = false;
   bool HasTRBE = false;
   bool HasBRBE = false;
+  bool HasPAUTH = false;
   bool HasSPE_EEF = false;
 
   // HasZeroCycleRegMove - Has zero-cycle register mov instructions.
@@ -450,6 +451,7 @@
   bool hasRandGen() const { return HasRandGen; }
   bool hasMTE() const { return HasMTE; }
   bool hasTME() const { return HasTME; }
+  bool hasPAUTH() const { return HasPAUTH; }
   // Arm SVE2 extensions
   bool hasSVE2AES() const { return HasSVE2AES; }
   bool hasSVE2SM4() const { return HasSVE2SM4; }
@@ -493,7 +495,7 @@
   bool hasPAN_RWV() const { return HasPAN_RWV; }
   bool hasCCPP() const { return HasCCPP; }
 
-  bool hasPA() const { return HasPA; }
+  bool hasPAuth() const { return HasPAuth; }
   bool hasJS() const { return HasJS; }
   bool hasCCIDX() const { return HasCCIDX; }
   bool hasComplxNum() const { return HasComplxNum; }
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -33,8 +33,8 @@
 def HasLOR   : Predicate<"Subtarget->hasLOR()">,
AssemblerPredicate<(all_of FeatureLOR

[PATCH] D94083: [AArch64] Add +pauth archictecture option, allowing the v8.3a pointer authentication extension.

2021-01-05 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM updated this revision to Diff 314579.
MarkMurrayARM added a comment.

Fixed the parent commit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94083/new/

https://reviews.llvm.org/D94083

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1431,6 +1431,7 @@
   {"rng", "norng", "+rand", "-rand"},
   {"memtag", "nomemtag", "+mte", "-mte"},
   {"tme", "notme", "+tme", "-tme"},
+  {"pauth", "nopauth", "+pauth", "-pauth"},
   {"ssbs", "nossbs", "+ssbs", "-ssbs"},
   {"sb", "nosb", "+sb", "-sb"},
   {"predres", "nopredres", "+predres", "-predres"},
Index: llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
===
--- llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4958,7 +4958,7 @@
 AArch64::GPR64RegClass);
   }
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {MFReturnAddr});
   } else {
 MIRBuilder.buildCopy({Register(AArch64::LR)}, {MFReturnAddr});
@@ -4986,7 +4986,7 @@
 else {
   MFI.setReturnAddressIsTaken(true);
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 Register TmpReg = MRI.createVirtualRegister(&AArch64::GPR64RegClass);
 MIRBuilder.buildInstr(AArch64::LDRXui, {TmpReg}, {FrameAddr}).addImm(1);
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {TmpReg});
Index: llvm/lib/Target/AArch64/AArch64SystemOperands.td
===
--- llvm/lib/Target/AArch64/AArch64SystemOperands.td
+++ llvm/lib/Target/AArch64/AArch64SystemOperands.td
@@ -1336,7 +1336,7 @@
 
 // v8.3a "Pointer authentication extension" registers
 //  Op0Op1 CRn CRmOp2
-let Requires = [{ {AArch64::FeaturePA} }] in {
+let Requires = [{ {AArch64::FeaturePAuth} }] in {
 def : RWSysReg<"APIAKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b000>;
 def : RWSysReg<"APIAKeyHi_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b001>;
 def : RWSysReg<"APIBKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b010>;
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -126,7 +126,7 @@
   bool HasAES = false;
 
   // ARMv8.3 extensions
-  bool HasPA = false;
+  bool HasPAuth = false;
   bool HasJS = false;
   bool HasCCIDX = false;
   bool HasComplxNum = false;
@@ -186,6 +186,7 @@
   bool HasETE = false;
   bool HasTRBE = false;
   bool HasBRBE = false;
+  bool HasPAUTH = false;
   bool HasSPE_EEF = false;
 
   // HasZeroCycleRegMove - Has zero-cycle register mov instructions.
@@ -450,6 +451,7 @@
   bool hasRandGen() const { return HasRandGen; }
   bool hasMTE() const { return HasMTE; }
   bool hasTME() const { return HasTME; }
+  bool hasPAUTH() const { return HasPAUTH; }
   // Arm SVE2 extensions
   bool hasSVE2AES() const { return HasSVE2AES; }
   bool hasSVE2SM4() const { return HasSVE2SM4; }
@@ -493,7 +495,7 @@
   bool hasPAN_RWV() const { return HasPAN_RWV; }
   bool hasCCPP() const { return HasCCPP; }
 
-  bool hasPA() const { return HasPA; }
+  bool hasPAuth() const { return HasPAuth; }
   bool hasJS() const { return HasJS; }
   bool hasCCIDX() const { return HasCCIDX; }
   bool hasComplxNum() const { return HasComplxNum; }
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -33,8 +33,8 @@
 def HasLOR   : Predicate<"Subtarget->hasLOR()">,
AssemblerPredicate<(all_of FeatureLOR), "lor">;
 
-def HasPA: Predicate<"Subtarget->hasP

[PATCH] D94081: [AArch64] Add +flagm archictecture option, allowing the v8.4a flag modification extension.

2021-01-08 Thread Mark Murray via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7d4a8bc417bf: [AArch64] Add +flagm archictecture option, 
allowing the v8.4a flag modification… (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94081/new/

https://reviews.llvm.org/D94081

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/test/MC/AArch64/armv8.4a-flag.s
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1408,6 +1408,7 @@
 TEST(TargetParserTest, AArch64ArchExtFeature) {
   const char *ArchExt[][4] = {{"crc", "nocrc", "+crc", "-crc"},
   {"crypto", "nocrypto", "+crypto", "-crypto"},
+  {"flagm", "noflagm", "+flagm", "-flagm"},
   {"fp", "nofp", "+fp-armv8", "-fp-armv8"},
   {"simd", "nosimd", "+neon", "-neon"},
   {"fp16", "nofp16", "+fullfp16", "-fullfp16"},
Index: llvm/test/MC/AArch64/armv8.4a-flag.s
===
--- llvm/test/MC/AArch64/armv8.4a-flag.s
+++ llvm/test/MC/AArch64/armv8.4a-flag.s
@@ -1,13 +1,13 @@
 // RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a %s -o - | \
 // RUN: FileCheck %s
 
-// RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+fmi %s -o - 2>&1 | \
+// RUN: llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+flagm %s -o - 2>&1 | \
 // RUN: FileCheck %s
 
 // RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=-v8.4a %s -o - 2>&1 | \
 // RUN: FileCheck %s --check-prefix=ERROR
 
-// RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a,-fmi %s -o - 2>&1 | \
+// RUN: not llvm-mc -triple aarch64-none-linux-gnu -show-encoding -mattr=+v8.4a,-flagm %s -o - 2>&1 | \
 // RUN: FileCheck %s --check-prefix=ERROR
 
 //--
@@ -30,24 +30,24 @@
 //CHECK-NEXT: rmif x1, #63, #15// encoding: [0x2f,0x84,0x1f,0xba]
 //CHECK-NEXT: rmif xzr, #63, #15   // encoding: [0xef,0x87,0x1f,0xba]
 
-//ERROR:  error: instruction requires: fmi
+//ERROR:  error: instruction requires: flagm
 //ERROR-NEXT: cfinv
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf8 w1
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf8 wzr
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf16 w1
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: setf16 wzr
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: rmif x1, #63, #15
 //ERROR-NEXT: ^
-//ERROR-NEXT: error: instruction requires: fmi
+//ERROR-NEXT: error: instruction requires: flagm
 //ERROR-NEXT: rmif xzr, #63, #15
 //ERROR-NEXT: ^
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -140,7 +140,7 @@
   bool HasSEL2 = false;
   bool HasPMU = false;
   bool HasTLB_RMI = false;
-  bool HasFMI = false;
+  bool HasFlagM = false;
   bool HasRCPC_IMMO = false;
 
   bool HasLSLFast = false;
@@ -513,7 +513,7 @@
   bool hasSEL2() const { return HasSEL2; }
   bool hasPMU() const { return HasPMU; }
   bool hasTLB_RMI() const { return HasTLB_RMI; }
-  bool hasFMI() const { return HasFMI; }
+  bool hasFlagM() const { return HasFlagM; }
   bool hasRCPC_IMMO() const { return HasRCPC_IMMO; }
 
   bool addrSinkUsingGEPs() const override {
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -69,8 +69,8 @@
 def HasTLB_RMI  : Predicate<"Subtarget->hasTLB_RMI()">,
AssemblerPredicate<(all_of FeatureTLB_RMI), "tlb-rmi">;
 
-def HasFMI   : Predicate<"Subtarget->hasFMI()">,
-   AssemblerPredi

[PATCH] D94083: [AArch64] Add +pauth archictecture option, allowing the v8.3a pointer authentication extension.

2021-01-08 Thread Mark Murray via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGaf7cce2fa4d1: [AArch64] Add +pauth archictecture option, 
allowing the v8.3a pointer… (authored by MarkMurrayARM).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94083/new/

https://reviews.llvm.org/D94083

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/AArch64TargetParser.h
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -1431,6 +1431,7 @@
   {"rng", "norng", "+rand", "-rand"},
   {"memtag", "nomemtag", "+mte", "-mte"},
   {"tme", "notme", "+tme", "-tme"},
+  {"pauth", "nopauth", "+pauth", "-pauth"},
   {"ssbs", "nossbs", "+ssbs", "-ssbs"},
   {"sb", "nosb", "+sb", "-sb"},
   {"predres", "nopredres", "+predres", "-predres"},
Index: llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
===
--- llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -4958,7 +4958,7 @@
 AArch64::GPR64RegClass);
   }
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {MFReturnAddr});
   } else {
 MIRBuilder.buildCopy({Register(AArch64::LR)}, {MFReturnAddr});
@@ -4986,7 +4986,7 @@
 else {
   MFI.setReturnAddressIsTaken(true);
 
-  if (STI.hasV8_3aOps()) {
+  if (STI.hasPAuth()) {
 Register TmpReg = MRI.createVirtualRegister(&AArch64::GPR64RegClass);
 MIRBuilder.buildInstr(AArch64::LDRXui, {TmpReg}, {FrameAddr}).addImm(1);
 MIRBuilder.buildInstr(AArch64::XPACI, {DstReg}, {TmpReg});
Index: llvm/lib/Target/AArch64/AArch64SystemOperands.td
===
--- llvm/lib/Target/AArch64/AArch64SystemOperands.td
+++ llvm/lib/Target/AArch64/AArch64SystemOperands.td
@@ -1336,7 +1336,7 @@
 
 // v8.3a "Pointer authentication extension" registers
 //  Op0Op1 CRn CRmOp2
-let Requires = [{ {AArch64::FeaturePA} }] in {
+let Requires = [{ {AArch64::FeaturePAuth} }] in {
 def : RWSysReg<"APIAKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b000>;
 def : RWSysReg<"APIAKeyHi_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b001>;
 def : RWSysReg<"APIBKeyLo_EL1", 0b11, 0b000, 0b0010, 0b0001, 0b010>;
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -126,7 +126,7 @@
   bool HasAES = false;
 
   // ARMv8.3 extensions
-  bool HasPA = false;
+  bool HasPAuth = false;
   bool HasJS = false;
   bool HasCCIDX = false;
   bool HasComplxNum = false;
@@ -186,6 +186,7 @@
   bool HasETE = false;
   bool HasTRBE = false;
   bool HasBRBE = false;
+  bool HasPAUTH = false;
   bool HasSPE_EEF = false;
 
   // HasZeroCycleRegMove - Has zero-cycle register mov instructions.
@@ -450,6 +451,7 @@
   bool hasRandGen() const { return HasRandGen; }
   bool hasMTE() const { return HasMTE; }
   bool hasTME() const { return HasTME; }
+  bool hasPAUTH() const { return HasPAUTH; }
   // Arm SVE2 extensions
   bool hasSVE2AES() const { return HasSVE2AES; }
   bool hasSVE2SM4() const { return HasSVE2SM4; }
@@ -493,7 +495,7 @@
   bool hasPAN_RWV() const { return HasPAN_RWV; }
   bool hasCCPP() const { return HasCCPP; }
 
-  bool hasPA() const { return HasPA; }
+  bool hasPAuth() const { return HasPAuth; }
   bool hasJS() const { return HasJS; }
   bool hasCCIDX() const { return HasCCIDX; }
   bool hasComplxNum() const { return HasComplxNum; }
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -33,8 +33,8 @@
 def HasLOR   : Pred

[PATCH] D109825: [AArch64]Enabling Cortex-A510 Support

2021-09-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM accepted this revision.
MarkMurrayARM added a comment.
This revision is now accepted and ready to land.

As I did the downstream work for this, I'm happy with it to go in in this form.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109825/new/

https://reviews.llvm.org/D109825

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109825: [AArch64]Enabling Cortex-A510 Support

2021-09-15 Thread Mark Murray via Phabricator via cfe-commits
MarkMurrayARM added a comment.

In D109825#3001622 , @dmgreen wrote:

>> As I did the downstream work for this, I'm happy with it to go in in this 
>> form.
>
> This doesn't seem.. wise. Please make sure the reviews you do are at a 
> sufficient quality, and it is probably best not to review patches you write 
> yourself.

Very good point!

I reduce this to the claim that it has been stable downstream for a couple of 
weeks now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109825/new/

https://reviews.llvm.org/D109825

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits