[PATCH] D60583: [AArch64] Implement Vector Function ABI name mangling.

2019-06-04 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

In D60583#1529878 , @jdoerfert wrote:

> Why/Where did we decide to clobber the attribute list with "non-existent 
> function names"?
>  I don't think an attribute list like this:
>  `attributes #1 = { "_ZGVsM2v_foo" "_ZGVsM32v_foo" "_ZGVsM4v_foo" 
> "_ZGVsM6v_foo" "_ZGVsM8v_foo" "_ZGVsMxv_foo" ... `
>  is helpful in any way, e.g., this would require us to search through all 
> attributes and interpret them one by one.


Agreed. This is what was agreed upon:
http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html

The new RFC will get rid of this list of string attributes. It will become 
something like:

  attributes #0 = { 
"declare-variant"="comma,separated,list,of,vector,function,ABI,mangled,names" }



> This seems to me like an ad-hoc implementation of the RFC that is currently 
> discussed but committed before the discussion is finished.

I can assure you that's not the case.

The code in this patch is shaped the way it is because it is based on a 
previous (accepted) RFC, originally proposed by other people and used by VecClone: 
https://reviews.llvm.org/D22792

As you can see in the unit tests of the VecClone pass, the variant attribute is 
added as follows:

  attributes #0 = { nounwind uwtable 
"vector-variants"="_ZGVbM4_foo1,_ZGVbN4_foo1,_ZGVcM8_foo1,_ZGVcN8_foo1,_ZGVdM8_foo1,_ZGVdN8_foo1,_ZGVeM16_foo1,_ZGVeN16_foo1"
   }

Nothing in LLVM is using those attributes at the moment, which might be the 
reason why the string attributes have not yet been merged into a single attribute.

Kind regards,

Francesco


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60583/new/

https://reviews.llvm.org/D60583



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79708: [clang][BFloat] add NEON emitter for bfloat

2020-05-21 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/utils/TableGen/NeonEmitter.cpp:2198
 
+static void emitNeonTypeDefs(const std::string& types, raw_ostream &OS) {
+  std::string TypedefTypes(types);

stuij wrote:
> fpetrogalli wrote:
> > Is this related to the changes for bfloat? Or is it just a refactoring 
> > that is nice to have? If the latter, please consider submitting it as a 
> > separate patch. If it is both a refactoring and BF16 related, it is 
> > currently not possible to see clearly which changes are BF16 specific, so 
> > please submit the refactoring first.
> Yes, related to bfloat. We're emitting that code twice now.
> 
> I can make a new patch, but I'm not sure the effort justifies what is, in 
> my mind, a small gain in clarity. It's basically just pasting the removed 
> part on the left into this function. If you disagree, tell me, and I will 
> create the extra patch.
Thank you. I didn't notice you were invoking this twice. It makes sense to 
have it in a separate function, in this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79708/new/

https://reviews.llvm.org/D79708





[PATCH] D80323: [SVE] Eliminate calls to default-false VectorType::get() from Clang

2020-05-21 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM. It would be great if you could address the two comments I added before 
submitting.




Comment at: clang/lib/CodeGen/CGBuiltin.cpp:10775
 
-  llvm::VectorType *MaskTy = llvm::VectorType::get(CGF.Builder.getInt1Ty(),
- cast<llvm::IntegerType>(Mask->getType())->getBitWidth());
+  llvm::VectorType *MaskTy = llvm::FixedVectorType::get(
+  CGF.Builder.getInt1Ty(),

`auto`?



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:4670
 if (!CGF.CGM.getCodeGenOpts().PreserveVec3Type) {
-  auto Vec4Ty = llvm::VectorType::get(
+  auto Vec4Ty = llvm::FixedVectorType::get(
   cast<llvm::VectorType>(DstTy)->getElementType(), 4);

`auto *`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80323/new/

https://reviews.llvm.org/D80323





[PATCH] D80323: [SVE] Eliminate calls to default-false VectorType::get() from Clang

2020-05-21 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

In D80323#2048457 , @rjmccall wrote:

> I'm sympathetic to wanting to get rid of the boolean flag, but this is a 
> really invasive change for pretty minimal benefit.  Why not leave 
> `VectorType::get` as meaning a non-scalable vector type and come up with a 
> different method name to get a scalable vector?


Sorry @rjmccall, I saw your comments only after approving this patch. I didn't 
mean to impose my views over yours.

@ctetreau, please make sure you address @rjmccall's comment before submitting.

Kind regards,

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80323/new/

https://reviews.llvm.org/D80323





[PATCH] D79708: [clang][BFloat] add NEON emitter for bfloat

2020-05-27 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM! Might be worth waiting an extra day or two before submitting to make sure 
the people who provided extra feedback are happy.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79708/new/

https://reviews.llvm.org/D79708





[PATCH] D80740: [SveEmitter] Add SVE ACLE for svld1ro.

2020-05-28 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, efriedma.
Herald added subscribers: cfe-commits, tschuett.
Herald added a project: clang.
fpetrogalli added a parent revision: D80738: [llvm][SVE] IR intrinsic for 
LD1RO.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80740

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro.c

Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro.c
@@ -0,0 +1,97 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_SVE -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svint8_t test_svld1ro_s8(svbool_t pg, const int8_t *base) {
+  // CHECK-LABEL: test_svld1ro_s8
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 16 x i8> @llvm.aarch64.sve.ld1ro.nxv16i8(<vscale x 16 x i1> %pg, i8* %base)
+  // CHECK: ret <vscale x 16 x i8> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _s8, , )(pg, base);
+}
+
+svint16_t test_svld1ro_s16(svbool_t pg, const int16_t *base) {
+  // CHECK-LABEL: test_svld1ro_s16
+  // CHECK: %[[PG:.*]] = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x i16> @llvm.aarch64.sve.ld1ro.nxv8i16(<vscale x 8 x i1> %[[PG]], i16* %base)
+  // CHECK: ret <vscale x 8 x i16> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _s16, , )(pg, base);
+}
+
+svint32_t test_svld1ro_s32(svbool_t pg, const int32_t *base) {
+  // CHECK-LABEL: test_svld1ro_s32
+  // CHECK: %[[PG:.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.ld1ro.nxv4i32(<vscale x 4 x i1> %[[PG]], i32* %base)
+  // CHECK: ret <vscale x 4 x i32> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _s32, , )(pg, base);
+}
+
+svint64_t test_svld1ro_s64(svbool_t pg, const int64_t *base) {
+  // CHECK-LABEL: test_svld1ro_s64
+  // CHECK: %[[PG:.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.ld1ro.nxv2i64(<vscale x 2 x i1> %[[PG]], i64* %base)
+  // CHECK: ret <vscale x 2 x i64> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _s64, , )(pg, base);
+}
+
+svuint8_t test_svld1ro_u8(svbool_t pg, const uint8_t *base) {
+  // CHECK-LABEL: test_svld1ro_u8
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 16 x i8> @llvm.aarch64.sve.ld1ro.nxv16i8(<vscale x 16 x i1> %pg, i8* %base)
+  // CHECK: ret <vscale x 16 x i8> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _u8, , )(pg, base);
+}
+
+svuint16_t test_svld1ro_u16(svbool_t pg, const uint16_t *base) {
+  // CHECK-LABEL: test_svld1ro_u16
+  // CHECK: %[[PG:.*]] = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x i16> @llvm.aarch64.sve.ld1ro.nxv8i16(<vscale x 8 x i1> %[[PG]], i16* %base)
+  // CHECK: ret <vscale x 8 x i16> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _u16, , )(pg, base);
+}
+
+svuint32_t test_svld1ro_u32(svbool_t pg, const uint32_t *base) {
+  // CHECK-LABEL: test_svld1ro_u32
+  // CHECK: %[[PG:.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.ld1ro.nxv4i32(<vscale x 4 x i1> %[[PG]], i32* %base)
+  // CHECK: ret <vscale x 4 x i32> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _u32, , )(pg, base);
+}
+
+svuint64_t test_svld1ro_u64(svbool_t pg, const uint64_t *base) {
+  // CHECK-LABEL: test_svld1ro_u64
+  // CHECK: %[[PG:.*]] = call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.ld1ro.nxv2i64(<vscale x 2 x i1> %[[PG]], i64* %base)
+  // CHECK: ret <vscale x 2 x i64> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _u64, , )(pg, base);
+}
+
+svfloat16_t test_svld1ro_f16(svbool_t pg, const float16_t *base) {
+  // CHECK-LABEL: test_svld1ro_f16
+  // CHECK: %[[PG:.*]] = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x half> @llvm.aarch64.sve.ld1ro.nxv8f16(<vscale x 8 x i1> %[[PG]], half* %base)
+  // CHECK: ret <vscale x 8 x half> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _f16, , )(pg, base);
+}
+
+svfloat32_t test_svld1ro_f32(svbool_t pg, const float32_t *base) {
+  // CHECK-LABEL: test_svld1ro_f32
+  // CHECK: %[[PG:.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %pg)
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x float> @llvm.aarch64.sve.ld1ro.nxv4f32(<vscale x 4 x i1> %[[PG]], float* %base)
+  // CHECK: ret <vscale x 4 x float> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svld1ro, _f32, , )(pg, base);
+}
+
+svfloat64_t test_svld1ro_f64(svbool_t pg, const float64_t *base) {
+  // CHECK-LABEL: test_svld1ro_f64
+  // CHECK: %[[PG:.*]

[PATCH] D80851: [llvm][SveEmitter] SVE ACLE for quadword permute intrinsics.

2020-05-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: efriedma, sdesmalen, kmclaughlin.
Herald added subscribers: cfe-commits, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added a project: clang.

The following intrinsics have been added, guarded by the macro
`__ARM_FEATURE_SVE_MATMUL_FP64`:

- svtrn1q[_*]
- svtrn2q[_*]
- svuzp1q[_*]
- svuzp2q[_*]
- svzip1q[_*]
- svzip2q[_*]

Supported types:

- svint[8|16|32|64]_t
- svuint[8|16|32|64]_t
- svfloat[16|32|64]_t

TODO: add support for svbfloat16_t


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80851

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64.c

Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64.c
@@ -0,0 +1,88 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_SVE -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svint8_t test_svzip2_s8(svint8_t op1, svint8_t op2) {
+  // CHECK-LABEL: test_svzip2_s8
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 16 x i8> @llvm.aarch64.sve.zip2q.nxv16i8(<vscale x 16 x i8> %op1, <vscale x 16 x i8> %op2)
+  // CHECK: ret <vscale x 16 x i8> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _s8, , )(op1, op2);
+}
+
+svint16_t test_svzip2_s16(svint16_t op1, svint16_t op2) {
+  // CHECK-LABEL: test_svzip2_s16
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x i16> @llvm.aarch64.sve.zip2q.nxv8i16(<vscale x 8 x i16> %op1, <vscale x 8 x i16> %op2)
+  // CHECK: ret <vscale x 8 x i16> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _s16, , )(op1, op2);
+}
+
+svint32_t test_svzip2_s32(svint32_t op1, svint32_t op2) {
+  // CHECK-LABEL: test_svzip2_s32
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.zip2q.nxv4i32(<vscale x 4 x i32> %op1, <vscale x 4 x i32> %op2)
+  // CHECK: ret <vscale x 4 x i32> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _s32, , )(op1, op2);
+}
+
+svint64_t test_svzip2_s64(svint64_t op1, svint64_t op2) {
+  // CHECK-LABEL: test_svzip2_s64
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.zip2q.nxv2i64(<vscale x 2 x i64> %op1, <vscale x 2 x i64> %op2)
+  // CHECK: ret <vscale x 2 x i64> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _s64, , )(op1, op2);
+}
+
+svuint8_t test_svzip2_u8(svuint8_t op1, svuint8_t op2) {
+  // CHECK-LABEL: test_svzip2_u8
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 16 x i8> @llvm.aarch64.sve.zip2q.nxv16i8(<vscale x 16 x i8> %op1, <vscale x 16 x i8> %op2)
+  // CHECK: ret <vscale x 16 x i8> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _u8, , )(op1, op2);
+}
+
+svuint16_t test_svzip2_u16(svuint16_t op1, svuint16_t op2) {
+  // CHECK-LABEL: test_svzip2_u16
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x i16> @llvm.aarch64.sve.zip2q.nxv8i16(<vscale x 8 x i16> %op1, <vscale x 8 x i16> %op2)
+  // CHECK: ret <vscale x 8 x i16> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _u16, , )(op1, op2);
+}
+
+svuint32_t test_svzip2_u32(svuint32_t op1, svuint32_t op2) {
+  // CHECK-LABEL: test_svzip2_u32
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.zip2q.nxv4i32(<vscale x 4 x i32> %op1, <vscale x 4 x i32> %op2)
+  // CHECK: ret <vscale x 4 x i32> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _u32, , )(op1, op2);
+}
+
+svuint64_t test_svzip2_u64(svuint64_t op1, svuint64_t op2) {
+  // CHECK-LABEL: test_svzip2_u64
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 2 x i64> @llvm.aarch64.sve.zip2q.nxv2i64(<vscale x 2 x i64> %op1, <vscale x 2 x i64> %op2)
+  // CHECK: ret <vscale x 2 x i64> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _u64, , )(op1, op2);
+}
+
+svfloat16_t test_svzip2_f16(svfloat16_t op1, svfloat16_t op2) {
+  // CHECK-LABEL: test_svzip2_f16
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 8 x half> @llvm.aarch64.sve.zip2q.nxv8f16(<vscale x 8 x half> %op1, <vscale x 8 x half> %op2)
+  // CHECK: ret <vscale x 8 x half> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _f16, , )(op1, op2);
+}
+
+svfloat32_t test_svzip2_f32(svfloat32_t op1, svfloat32_t op2) {
+  // CHECK-LABEL: test_svzip2_f32
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 4 x float> @llvm.aarch64.sve.zip2q.nxv4f32(<vscale x 4 x float> %op1, <vscale x 4 x float> %op2)
+  // CHECK: ret <vscale x 4 x float> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _f32, , )(op1, op2);
+}
+
+svfloat64_t test_svzip2_f64(svfloat64_t op1, svfloat64_t op2) {
+  // CHECK-LABEL: test_svzip2_f64
+  // CHECK: %[[INTRINSIC:.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.zip2q.nxv2f64(<vscale x 2 x double> %op1, <vscale x 2 x double> %op2)
+  // CHECK: ret <vscale x 2 x double> %[[INTRINSIC]]
+  return SVE_ACLE_FUNC(svzip2q, _f64, , )(op1, op2);
+}
Index: clang/test/Co

[PATCH] D80740: [SveEmitter] Add SVE ACLE for svld1ro.

2020-06-03 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG36b8af11d343: [SveEmitter] Add SVE ACLE for svld1ro. 
(authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80740/new/

https://reviews.llvm.org/D80740

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro.c


[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-05 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, efriedma, stuij, ctetreau.
Herald added subscribers: cfe-commits, tschuett.
Herald added a reviewer: shafik.
Herald added a reviewer: rengolin.
Herald added a project: clang.

The new SVE builtin type `__SVBFloat16_t` is used to represent scalable
vectors of bfloat elements.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D81304

Files:
  clang/include/clang/Basic/AArch64SVEACLETypes.def
  clang/include/clang/Basic/arm_sve.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/test/AST/ast-dump-aarch64-sve-types.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c
  clang/unittests/AST/ASTImporterTest.cpp
  clang/unittests/AST/SizelessTypesTest.cpp
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -65,7 +65,7 @@
 
 class SVEType {
   TypeSpec TS;
-  bool Float, Signed, Immediate, Void, Constant, Pointer;
+  bool Float, Signed, Immediate, Void, Constant, Pointer, BFloat;
   bool DefaultType, IsScalable, Predicate, PredicatePattern, PrefetchOp;
   unsigned Bitwidth, ElementBitwidth, NumVectors;
 
@@ -74,9 +74,9 @@
 
   SVEType(TypeSpec TS, char CharMod)
   : TS(TS), Float(false), Signed(true), Immediate(false), Void(false),
-Constant(false), Pointer(false), DefaultType(false), IsScalable(true),
-Predicate(false), PredicatePattern(false), PrefetchOp(false),
-Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
+Constant(false), Pointer(false), BFloat(false), DefaultType(false),
+IsScalable(true), Predicate(false), PredicatePattern(false),
+PrefetchOp(false), Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
 if (!TS.empty())
   applyTypespec();
 applyModifier(CharMod);
@@ -93,9 +93,10 @@
   bool isVoid() const { return Void & !Pointer; }
   bool isDefault() const { return DefaultType; }
   bool isFloat() const { return Float; }
-  bool isInteger() const { return !Float && !Predicate; }
+  bool isBFloat() const { return BFloat; }
+  bool isInteger() const { return !Float && !Predicate && !BFloat; }
   bool isScalarPredicate() const {
-return !Float && Predicate && NumVectors == 0;
+return !BFloat && !Float && Predicate && NumVectors == 0;
   }
   bool isPredicateVector() const { return Predicate; }
   bool isPredicatePattern() const { return PredicatePattern; }
@@ -362,7 +363,7 @@
 
   if (isVoidPointer())
 S += "v";
-  else if (!Float)
+  else if (!isFloat() && !isBFloat())
 switch (ElementBitwidth) {
 case 1: S += "b"; break;
 case 8: S += "c"; break;
@@ -372,15 +373,20 @@
 case 128: S += "LLLi"; break;
 default: llvm_unreachable("Unhandled case!");
 }
-  else
+  else if (isFloat())
 switch (ElementBitwidth) {
 case 16: S += "h"; break;
 case 32: S += "f"; break;
 case 64: S += "d"; break;
 default: llvm_unreachable("Unhandled case!");
 }
+  else if (isBFloat())
+switch (ElementBitwidth) {
+case 16: S += "y"; break;
+default: llvm_unreachable("Unhandled case!");
+}
 
-  if (!isFloat()) {
+  if (!isFloat() && !isBFloat()) {
 if ((isChar() || isPointer()) && !isVoidPointer()) {
   // Make chars and typed pointers explicitly signed.
   if (Signed)
@@ -421,13 +427,15 @@
   else {
 if (isScalableVector())
   S += "sv";
-if (!Signed && !Float)
+if (!Signed && !Float && !isBFloat())
   S += "u";
 
 if (Float)
   S += "float";
 else if (isScalarPredicate() || isPredicateVector())
   S += "bool";
+else if (isBFloat())
+  S += "bfloat";
 else
   S += "int";
 
@@ -480,6 +488,10 @@
   Float = true;
   ElementBitwidth = 64;
   break;
+case 'b':
+  BFloat = true;
+  ElementBitwidth = 16;
+  break;
 default:
   llvm_unreachable("Unhandled type code!");
 }
@@ -524,6 +536,7 @@
   case 'P':
 Signed = true;
 Float = false;
+BFloat = false;
 Predicate = true;
 Bitwidth = 16;
 ElementBitwidth = 1;
@@ -774,7 +787,6 @@
   BaseTypeSpec(BT), Class(Class), Guard(Guard.str()),
   MergeSuffix(MergeSuffix.str()), BaseType(BT, 'd'), Flags(Flags),
   ImmChecks(Checks.begin(), Checks.end()) {
-
   // Types[0] is the return value.
   for (unsigned I = 0; I < Proto.size(); ++I) {
 SVEType T(BaseTypeSpec, Proto[I]);
@@ -849,6 +861,8 @@
   TypeCode = T.isSigned() ? 's' : 'u';
 else if (T.isPredicateVector())
   TypeCode = 'b';
+else if (T.isBFloat())
+  TypeCode = "bf";
 else
   TypeCode = 'f';
 Ret.replace(Pos, NumChars, TypeCode + utostr(T.getElementSizeInBits()));
@@ -924,6 +938,15 @@
 }
   }
 
+  if (T.isBFloat()) {
+switch (T.getElementSizeInBits()) {
+case 16:
+  re

[PATCH] D80851: [llvm][SveEmitter] SVE ACLE for quadword permute intrinsics.

2020-06-15 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG017969de7662: [llvm][SveEmitter] SVE ACLE for quadword 
permute intrinsics. (authored by fpetrogalli).

Changed prior to commit:
  https://reviews.llvm.org/D80851?vs=267392&id=270793#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80851/new/

https://reviews.llvm.org/D80851

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64.c


[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-15 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 270911.
fpetrogalli marked 4 inline comments as done.
fpetrogalli added a comment.

Thank you for the review @sdesmalen.

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81304/new/

https://reviews.llvm.org/D81304

Files:
  clang/include/clang/Basic/AArch64SVEACLETypes.def
  clang/include/clang/Basic/arm_sve.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/AST/ItaniumMangle.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/test/AST/ast-dump-aarch64-sve-types.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c
  clang/unittests/AST/ASTImporterTest.cpp
  clang/unittests/AST/SizelessTypesTest.cpp
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -65,7 +65,7 @@
 
 class SVEType {
   TypeSpec TS;
-  bool Float, Signed, Immediate, Void, Constant, Pointer;
+  bool Float, Signed, Immediate, Void, Constant, Pointer, BFloat;
   bool DefaultType, IsScalable, Predicate, PredicatePattern, PrefetchOp;
   unsigned Bitwidth, ElementBitwidth, NumVectors;
 
@@ -74,9 +74,9 @@
 
   SVEType(TypeSpec TS, char CharMod)
   : TS(TS), Float(false), Signed(true), Immediate(false), Void(false),
-Constant(false), Pointer(false), DefaultType(false), IsScalable(true),
-Predicate(false), PredicatePattern(false), PrefetchOp(false),
-Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
+Constant(false), Pointer(false), BFloat(false), DefaultType(false),
+IsScalable(true), Predicate(false), PredicatePattern(false),
+PrefetchOp(false), Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
 if (!TS.empty())
   applyTypespec();
 applyModifier(CharMod);
@@ -93,9 +93,11 @@
   bool isVoid() const { return Void & !Pointer; }
   bool isDefault() const { return DefaultType; }
   bool isFloat() const { return Float; }
-  bool isInteger() const { return !Float && !Predicate; }
+  bool isBFloat() const { return BFloat; }
+  bool isFloatingPoint() const { return Float || BFloat; }
+  bool isInteger() const { return !isFloatingPoint() && !Predicate; }
   bool isScalarPredicate() const {
-return !Float && Predicate && NumVectors == 0;
+return !isFloatingPoint() && Predicate && NumVectors == 0;
   }
   bool isPredicateVector() const { return Predicate; }
   bool isPredicatePattern() const { return PredicatePattern; }
@@ -362,7 +364,7 @@
 
   if (isVoidPointer())
 S += "v";
-  else if (!Float)
+  else if (!isFloatingPoint())
 switch (ElementBitwidth) {
 case 1: S += "b"; break;
 case 8: S += "c"; break;
@@ -372,15 +374,19 @@
 case 128: S += "LLLi"; break;
 default: llvm_unreachable("Unhandled case!");
 }
-  else
+  else if (isFloat())
 switch (ElementBitwidth) {
 case 16: S += "h"; break;
 case 32: S += "f"; break;
 case 64: S += "d"; break;
 default: llvm_unreachable("Unhandled case!");
 }
+  else if (isBFloat()) {
+assert(ElementBitwidth == 16 && "Not a valid BFloat.");
+S += "y";
+  }
 
-  if (!isFloat()) {
+  if (!isFloatingPoint()) {
 if ((isChar() || isPointer()) && !isVoidPointer()) {
   // Make chars and typed pointers explicitly signed.
   if (Signed)
@@ -421,13 +427,15 @@
   else {
 if (isScalableVector())
   S += "sv";
-if (!Signed && !Float)
+if (!Signed && !isFloatingPoint())
   S += "u";
 
 if (Float)
   S += "float";
 else if (isScalarPredicate() || isPredicateVector())
   S += "bool";
+else if (isBFloat())
+  S += "bfloat";
 else
   S += "int";
 
@@ -481,6 +489,10 @@
   Float = true;
   ElementBitwidth = 64;
   break;
+case 'b':
+  BFloat = true;
+  ElementBitwidth = 16;
+  break;
 default:
   llvm_unreachable("Unhandled type code!");
 }
@@ -534,6 +546,7 @@
   case 'P':
 Signed = true;
 Float = false;
+BFloat = false;
 Predicate = true;
 Bitwidth = 16;
 ElementBitwidth = 1;
@@ -784,7 +797,6 @@
   BaseTypeSpec(BT), Class(Class), Guard(Guard.str()),
   MergeSuffix(MergeSuffix.str()), BaseType(BT, 'd'), Flags(Flags),
   ImmChecks(Checks.begin(), Checks.end()) {
-
   // Types[0] is the return value.
   for (unsigned I = 0; I < Proto.size(); ++I) {
 SVEType T(BaseTypeSpec, Proto[I]);
@@ -848,6 +860,8 @@
   TypeCode = T.isSigned() ? 's' : 'u';
 else if (T.isPredicateVector())
   TypeCode = 'b';
+else if (T.isBFloat())
+  TypeCode = "bf";
 else
   TypeCode = 'f';
 Ret.replace(Pos, NumChars, TypeCode + utostr(T.getElementSizeInBits()));
@@ -923,6 +937,11 @@
 }
   }
 
+  if (T.isBFloat()) {
+assert(T.getElementSizeInBits() == 16 && "Not a valid BFloat.");
+return encodeEltType("EltTyBFloat16");
+  }
+
   if (T.isP
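The hunks above extend the emitter's builtin type-string builder so that bfloat elements map to the clang builtin code `y`, alongside `h`/`f`/`d` for the other floating-point widths. A rough standalone sketch of that element-type mapping (the enum below is illustrative, not the emitter's actual representation):

```cpp
#include <string>

// Sketch of the element-type -> clang builtin type-string mapping that the
// emitter implements above. "y" denotes bfloat16; the comments name the C
// type each code stands for.
enum class EltType { SInt8, SInt16, Float16, BFloat16, Float32, Float64 };

std::string builtinBaseTypeCode(EltType E) {
  switch (E) {
  case EltType::SInt8:    return "c"; // char
  case EltType::SInt16:   return "s"; // short
  case EltType::Float16:  return "h"; // __fp16 / half
  case EltType::BFloat16: return "y"; // __bf16
  case EltType::Float32:  return "f"; // float
  case EltType::Float64:  return "d"; // double
  }
  return "";
}
```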

[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-15 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81304/new/

https://reviews.llvm.org/D81304



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-16 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/include/clang/Basic/AArch64SVEACLETypes.def:44
 #define SVE_VECTOR_TYPE(Name, MangledName, Id, SingletonId, NumEls, ElBits, \
-IsSigned, IsFP) \
+IsSigned, IsFP, IsBF)\
   SVE_TYPE(Name, Id, SingletonId)

sdesmalen wrote:
> nit: odd formatting (of the last `\`), did you use clang-format?
Yes, but `def` files are left untouched. I fixed it.



Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c:1
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s

sdesmalen wrote:
> stuij wrote:
> > There should be no dependency on `-fallow-half-arguments-and-returns`. For 
> > bfloat we should use `-mfloat-abi hard`. Does this work for `-mfloat-abi 
> > softfp`?
> `-fallow-half-arguments-and-returns` isn't strictly needed for this test, we 
> just use the same RUN line for all the ACLE tests and we needed this for 
> `__fp16` in some of the tests.
> 
> I don't believe that `-mfloat-abi softfp` is supported for AArch64.
@stuij - the following lines work, one with `softfp` and one with `hard`:

```
// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -mfloat-abi softfp -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -D__ARM_FEATURE_SVE_MATMUL_FP64 -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -mfloat-abi hard -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
```

@sdesmalen I am not an expert here, but there is a test which targets aarch64 
that uses `softfp` (see `clang/test/CodeGen/arm-bf16-params-returns.c`). The 
following line in that test clearly targets `aarch64`:

```
// RUN: %clang_cc1 -triple aarch64-arm-none-eabi -target-abi aapcs -mfloat-abi softfp -target-feature +bf16 -target-feature +neon -emit-llvm -O2 -o - %s | opt -S -mem2reg -sroa | FileCheck %s --check-prefix=CHECK64-SOFTFP
```

@both - should I update the test with the two extra RUN lines mentioned up in 
the message?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81304/new/

https://reviews.llvm.org/D81304





[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-16 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 271178.
fpetrogalli marked 3 inline comments as done.
fpetrogalli added a comment.

Fix formatting issues, remove an unnecessary empty line.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81304/new/

https://reviews.llvm.org/D81304

Files:
  clang/include/clang/Basic/AArch64SVEACLETypes.def
  clang/include/clang/Basic/arm_sve.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/AST/ItaniumMangle.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/test/AST/ast-dump-aarch64-sve-types.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c
  clang/unittests/AST/ASTImporterTest.cpp
  clang/unittests/AST/SizelessTypesTest.cpp
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -65,7 +65,7 @@
 
 class SVEType {
   TypeSpec TS;
-  bool Float, Signed, Immediate, Void, Constant, Pointer;
+  bool Float, Signed, Immediate, Void, Constant, Pointer, BFloat;
   bool DefaultType, IsScalable, Predicate, PredicatePattern, PrefetchOp;
   unsigned Bitwidth, ElementBitwidth, NumVectors;
 
@@ -74,9 +74,9 @@
 
   SVEType(TypeSpec TS, char CharMod)
   : TS(TS), Float(false), Signed(true), Immediate(false), Void(false),
-Constant(false), Pointer(false), DefaultType(false), IsScalable(true),
-Predicate(false), PredicatePattern(false), PrefetchOp(false),
-Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
+Constant(false), Pointer(false), BFloat(false), DefaultType(false),
+IsScalable(true), Predicate(false), PredicatePattern(false),
+PrefetchOp(false), Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
 if (!TS.empty())
   applyTypespec();
 applyModifier(CharMod);
@@ -93,9 +93,11 @@
   bool isVoid() const { return Void & !Pointer; }
   bool isDefault() const { return DefaultType; }
   bool isFloat() const { return Float; }
-  bool isInteger() const { return !Float && !Predicate; }
+  bool isBFloat() const { return BFloat; }
+  bool isFloatingPoint() const { return Float || BFloat; }
+  bool isInteger() const { return !isFloatingPoint() && !Predicate; }
   bool isScalarPredicate() const {
-return !Float && Predicate && NumVectors == 0;
+return !isFloatingPoint() && Predicate && NumVectors == 0;
   }
   bool isPredicateVector() const { return Predicate; }
   bool isPredicatePattern() const { return PredicatePattern; }
@@ -362,7 +364,7 @@
 
   if (isVoidPointer())
 S += "v";
-  else if (!Float)
+  else if (!isFloatingPoint())
 switch (ElementBitwidth) {
 case 1: S += "b"; break;
 case 8: S += "c"; break;
@@ -372,15 +374,19 @@
 case 128: S += "LLLi"; break;
 default: llvm_unreachable("Unhandled case!");
 }
-  else
+  else if (isFloat())
 switch (ElementBitwidth) {
 case 16: S += "h"; break;
 case 32: S += "f"; break;
 case 64: S += "d"; break;
 default: llvm_unreachable("Unhandled case!");
 }
+  else if (isBFloat()) {
+assert(ElementBitwidth == 16 && "Not a valid BFloat.");
+S += "y";
+  }
 
-  if (!isFloat()) {
+  if (!isFloatingPoint()) {
 if ((isChar() || isPointer()) && !isVoidPointer()) {
   // Make chars and typed pointers explicitly signed.
   if (Signed)
@@ -421,13 +427,15 @@
   else {
 if (isScalableVector())
   S += "sv";
-if (!Signed && !Float)
+if (!Signed && !isFloatingPoint())
   S += "u";
 
 if (Float)
   S += "float";
 else if (isScalarPredicate() || isPredicateVector())
   S += "bool";
+else if (isBFloat())
+  S += "bfloat";
 else
   S += "int";
 
@@ -481,6 +489,10 @@
   Float = true;
   ElementBitwidth = 64;
   break;
+case 'b':
+  BFloat = true;
+  ElementBitwidth = 16;
+  break;
 default:
   llvm_unreachable("Unhandled type code!");
 }
@@ -534,6 +546,7 @@
   case 'P':
 Signed = true;
 Float = false;
+BFloat = false;
 Predicate = true;
 Bitwidth = 16;
 ElementBitwidth = 1;
@@ -784,7 +797,6 @@
   BaseTypeSpec(BT), Class(Class), Guard(Guard.str()),
   MergeSuffix(MergeSuffix.str()), BaseType(BT, 'd'), Flags(Flags),
   ImmChecks(Checks.begin(), Checks.end()) {
-
   // Types[0] is the return value.
   for (unsigned I = 0; I < Proto.size(); ++I) {
 SVEType T(BaseTypeSpec, Proto[I]);
@@ -848,6 +860,8 @@
   TypeCode = T.isSigned() ? 's' : 'u';
 else if (T.isPredicateVector())
   TypeCode = 'b';
+else if (T.isBFloat())
+  TypeCode = "bf";
 else
   TypeCode = 'f';
 Ret.replace(Pos, NumChars, TypeCode + utostr(T.getElementSizeInBits()));
@@ -923,6 +937,11 @@
 }
   }
 
+  if (T.isBFloat()) {
+assert(T.getElementSizeInBits() == 16 && "Not a valid BFloat.");
+return encodeEltType("EltTyBFloat16");
+  }
+
   

[PATCH] D82908: [SVE] ACLE: Fix builtins for svdup_lane_bf16 and svcvtnt_bf16_f32_x

2020-07-01 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thank you.




Comment at: clang/utils/TableGen/SveEmitter.cpp:1265-1279
+  OS << "#if defined(__ARM_FEATURE_SVE_BF16)\n";
+  OS << "#define svcvtnt_bf16_x  svcvtnt_bf16_m\n";
+  OS << "#define svcvtnt_bf16_f32_x  svcvtnt_bf16_f32_m\n";
+  OS << "#endif /*__ARM_FEATURE_SVE_BF16 */\n\n";
+
   OS << "#if defined(__ARM_FEATURE_SVE2)\n";
   OS << "#define svcvtnt_f16_x  svcvtnt_f16_m\n";

nit: worth adding a comment in the emitter explaining why these redirections 
are needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82908/new/

https://reviews.llvm.org/D82908





[PATCH] D50229: [ARM][AArch64] Add feature +fp16fml

2020-07-01 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.
Herald added a subscriber: danielkiss.



Comment at: test/Driver/aarch64-cpus.c:518
+// RUN: %clang -target aarch64 -march=armv8.4-a -### -c %s 2>&1 | FileCheck 
-check-prefix=GENERICV84A-NO-FP16FML %s
+// GENERICV84A-NO-FP16FML-NOT: "-target-feature" "{{[+-]}}fp16fml"
+// GENERICV84A-NO-FP16FML-NOT: "-target-feature" "{{[+-]}}fullfp16"

Hi @SjoerdMeijer , I have noticed that this test does something different from 
what gcc does (well, claims to do, I haven't checked the actual behavior on 
gcc).

From the table in [1], it seems that `armv8.4-a` implies `fp16fml`... who got 
it right? GCC or clang? Or am I missing something?

Francesco


[1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options 
(see the description of `-march=name`)


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D50229/new/

https://reviews.llvm.org/D50229





[PATCH] D83079: [clang] Default target features implied by `-march` on AArch64.

2020-07-02 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, efriedma, SjoerdMeijer.
Herald added subscribers: cfe-commits, danielkiss, kristof.beyls.
Herald added a project: clang.

This patch is trying to align the interpretation of `-march` on
AArch64 with GCC's, in terms of the target features associated with
a value of the architecture version, whether armv8a, armv8.1a, ...,
up to armv8.6.

Ideally we would like to fully implement the recursive function

features(8.X) =  {new features of 8.X} U features(8.(X-1))

as GCC describes it in
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options,
under the option `-march=name`.

However, in this initial patch we had to stop at version 8.5 of the
architecture because there seem to be some disagreement on what is the
right set of features for armv8.4a.

Once the disagreement is sorted, we will extend the behavior of
`AddAArch64DefaultFeatures` to cascade across all the architecture
versions.
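The recursion above can be sketched as follows; the feature table is a made-up illustration of the shape of the data (the authoritative per-architecture feature lists live in LLVM's AArch64 target definitions):

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Sketch of features(8.X) = {new features of 8.X} U features(8.(X-1)).
using FeatureSet = std::set<std::string>;

// arch -> (previous arch, features newly introduced by this arch).
// Illustrative table only; not the real driver data.
static const std::map<std::string, std::pair<std::string, FeatureSet>>
    ArchTable = {
        {"armv8.4a", {"", {}}},
        {"armv8.5a", {"armv8.4a", {"+sb", "+ssbs", "+predres"}}},
        {"armv8.6a", {"armv8.5a", {"+i8mm", "+bf16"}}},
};

FeatureSet features(const std::string &Arch) {
  auto It = ArchTable.find(Arch);
  if (It == ArchTable.end())
    return {};
  FeatureSet Result = It->second.second;            // {new features of 8.X}
  for (const auto &F : features(It->second.first))  // U features(8.(X-1))
    Result.insert(F);
  return Result;
}
```

With this table, `features("armv8.6a")` yields `{+bf16, +i8mm, +predres, +sb, +ssbs}`: each version inherits everything from its predecessor.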


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/aarch64-march-default-features.c

Index: clang/test/Driver/aarch64-march-default-features.c
===
--- /dev/null
+++ clang/test/Driver/aarch64-march-default-features.c
@@ -0,0 +1,27 @@
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.6a -### -c %s 2>&1 | FileCheck %s --check-prefixes=CHECK-8_6,CHECK-8_5
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.6a+nobf16 -### -c %s 2>&1 | FileCheck %s --check-prefix=CHECK-8_6-NOBF16
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.6a+noi8mm -### -c %s 2>&1 | FileCheck %s --check-prefix=CHECK-8_6-NOI8MM
+
+// CHECK-8_6: "-target-feature" "+i8mm" "-target-feature" "+bf16"
+// CHECK-8_6-NOBF16-NOT:"-target-feature" "+bf16"
+// CHECK-8_6-NOBF16:  "-target-feature" "+i8mm"
+// CHECK-8_6-NOBF16-SAME:   "-target-feature" "-bf16"
+// CHECK-8_6-NOI8MM-NOT:  "-target-feature" "+i8mm"
+// CHECK-8_6-NOI8MM:"-target-feature" "+bf16"
+// CHECK-8_6-NOI8MM-SAME: "-target-feature" "-i8mm"
+
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.5a   -### -c %s 2>&1 | FileCheck %s --check-prefixes=CHECK-8_5
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.5a+nosb  -### -c %s 2>&1 | FileCheck %s --check-prefix=CHECK-8_5-NOSB
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.5a+nossbs-### -c %s 2>&1 | FileCheck %s --check-prefix=CHECK-8_5-NOSSBS
+// RUN: %clang -target aarch64-linux-gnu -march=armv8.5a+nopredres -### -c %s 2>&1 | FileCheck %s --check-prefix=CHECK-8_5-NOPREDRES
+
+// CHECK-8_5:"-target-feature" "+sb" "-target-feature" "+ssbs" "-target-feature" "+predres"
+// CHECK-8_5-NOSB-NOT:   "-target-feature" "+sb"
+// CHECK-8_5-NOSB:   "-target-feature" "+ssbs" "-target-feature" "+predres"
+// CHECK-8_5-NOSB-SAME:  "-target-feature" "-sb"
+// CHECK-8_5-NOSSBS-NOT: "-target-feature" "+ssbs"
+// CHECK-8_5-NOSSBS: "-target-feature" "+sb"   "-target-feature" "+predres"
+// CHECK-8_5-NOSSBS-SAME:"-target-feature" "-ssbs"
+// CHECK-8_5-NOPREDRES-NOT:"-target-feature" "+predres"
+// CHECK-8_5-NOPREDRES:  "-target-feature" "+sb" "-target-feature" "+ssbs"
+// CHECK-8_5-NOPREDRES-SAME:   "-target-feature" "-predres"
Index: clang/test/Driver/aarch64-cpus.c
===
--- clang/test/Driver/aarch64-cpus.c
+++ clang/test/Driver/aarch64-cpus.c
@@ -643,7 +643,7 @@
 // GENERICV85A-BE: "-cc1"{{.*}} "-triple" "aarch64_be{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.5a"
 
 // RUN: %clang -target aarch64 -march=armv8.5-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV85A-FP16 %s
-// GENERICV85A-FP16: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.5a" "-target-feature" "+fullfp16"
+// GENERICV85A-FP16: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.5a" "-target-feature" "+sb" "-target-feature" "+ssbs" "-target-feature" "+predres" "-target-feature" "+fullfp16"
 
 // RUN: %clang -target aarch64 -march=armv8.6a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV86A %s
 // RUN: %clang -target aarch64 -march=armv8.6-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV86A %s
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64

[PATCH] D83079: [clang] Default target features implied by `-march` on AArch64.

2020-07-02 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked an inline comment as done.
fpetrogalli added a comment.

In D83079#2129371 , @SjoerdMeijer 
wrote:

> I haven't looked into details of these arch extensions yet, will do that 
> tomorrow, but I don't think there's any disagreement, i.e., these options and 
> their behaviour are synchronised with the GCC community.  Thus, it's better 
> if you remove this wording from both description and comments in the source 
> code. If there is divergence in behaviour, then that is probably be an 
> oversight somewhere. I am also pretty sure that this 8.5 stuff is 
> implemented, not sure if that is upstreamed. And also, I think we need to be 
> more specific than "Default target features implied by `-march` on AArch64", 
> because this is a big topic, which is not addressed in this patch.  This 
> patch is only about a few architecture extension. I am also for example 
> surprised to see bfloat there, as I saw quite some activity in this area.


Hi @SjoerdMeijer - I think I misinterpreted the way the target features are 
generated, and which part of the codebase handles that. I wasn't aware of 
the common code pointed out by @sdesmalen that was translating architecture 
version into target feature - I thought everything needed to be translated by 
the driver into a list of `-target-feature`, that is why I thought there was 
discrepancy. If there is any, this patch is probably not the right way to fix 
it. The best thing to do is abandon this patch.




Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:118
+  case llvm::AArch64::ArchKind::ARMV8_6A:
+Features.push_back("+i8mm");
+Features.push_back("+bf16");

sdesmalen wrote:
> Looking at what Clang emits for e.g. `-march=armv8.5-a`, it just adds a 
> target-feature `+v8.5a`. The definitions in 
> `llvm/lib/Target/AArch64/AArch64.td`. suggests that LLVM is already able to 
> infer all supported features from that. e.g.
> ```
> def HasV8_4aOps : SubtargetFeature<
>:
>:
> 
> def HasV8_5aOps : SubtargetFeature<
>   "v8.5a", "HasV8_5aOps", "true", "Support ARM v8.5a instructions",
>   [HasV8_4aOps, FeatureAltFPCmp, FeatureFRInt3264, FeatureSpecRestrict,
>FeatureSSBS, FeatureSB, FeaturePredRes, FeatureCacheDeepPersist,
>FeatureBranchTargetId]>;
> 
> def HasV8_6aOps : SubtargetFeature<
>   "v8.6a", "HasV8_6aOps", "true", "Support ARM v8.6a instructions",
> 
>   [HasV8_5aOps, FeatureAMVS, FeatureBF16, FeatureFineGrainedTraps,
>FeatureEnhancedCounterVirtualization, FeatureMatMulInt8]>;
> ```
> So I don't think you necessarily have to decompose the architecture version 
> into target-features in the Clang driver as well. For Clang it matters that 
> the right set of feature macros are defined so that the ACLE header file 
> exposes the correct set of functions for the given architecture version. At 
> least for the SVE ACLE that is just a small handful of features.
Thank you for explaining this, @sdesmalen - I thought that the features needed 
to be explicitly generated by clang.
I will create a separate patch that adds only the feature macros needed for the 
SVE ACLEs.
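The implication mechanism @sdesmalen describes can be illustrated with a small transitive-closure sketch; the feature names and implication lists below are illustrative stand-ins for the `SubtargetFeature` definitions in `llvm/lib/Target/AArch64/AArch64.td`, not the real data:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Sketch of how a single target feature such as "+v8.5a" lets the backend
// infer a whole set of implied subfeatures via implication lists.
static const std::map<std::string, std::vector<std::string>> Implies = {
    {"v8.5a", {"v8.4a", "ssbs", "sb", "predres"}},
    {"v8.4a", {"v8.3a", "dotprod"}},
    {"v8.3a", {"v8.2a"}},
    {"v8.2a", {}},
};

// Transitive closure: every feature reachable from Root, including Root.
std::set<std::string> impliedFeatures(const std::string &Root) {
  std::set<std::string> Seen;
  std::vector<std::string> Work{Root};
  while (!Work.empty()) {
    std::string F = Work.back();
    Work.pop_back();
    if (!Seen.insert(F).second)
      continue; // already expanded
    auto It = Implies.find(F);
    if (It != Implies.end())
      Work.insert(Work.end(), It->second.begin(), It->second.end());
  }
  return Seen;
}
```

This is why the driver only needs to emit `-target-feature +v8.5a`: the closure over the implication lists recovers the subfeatures, and Clang itself only has to define the right feature macros.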


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079





[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-02 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 275283.
fpetrogalli added a comment.

Update the patch to limit its scope to generate the feature macros for 
`-march=armv8.6a+sve`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,12 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
@@ -412,4 +418,3 @@
 // CHECK-BFLOAT: __ARM_BF16_FORMAT_ALTERNATIVE 1
 // CHECK-BFLOAT: __ARM_FEATURE_BF16 1
 // CHECK-BFLOAT: __ARM_FEATURE_BF16_VECTOR_ARITHMETIC 1
-
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -114,10 +114,16 @@
   std::pair<StringRef, StringRef> Split = StringRef(MarchLowerCase).split("+");
 
   llvm::AArch64::ArchKind ArchKind = llvm::AArch64::parseArch(Split.first);
-  if (ArchKind == llvm::AArch64::ArchKind::INVALID ||
-  !llvm::AArch64::getArchFeatures(ArchKind, Features) ||
-  (Split.second.size() &&
-   !DecodeAArch64Features(D, Split.second, Features, ArchKind)))
+
+  if (!llvm::AArch64::getArchFeatures(ArchKind, Features))
+return false;
+
+  if (ArchKind == llvm::AArch64::ArchKind::ARMV8_6A) {
+Features.push_back("+i8mm");
+Features.push_back("+bf16");
+  }
+
+  if (!DecodeAArch64Features(D, Split.second, Features, ArchKind))
 return false;
 
   return true;




[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-07 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 276106.
fpetrogalli added a comment.

Removed the unrelated change of the empty line.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,12 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -114,10 +114,16 @@
   std::pair<StringRef, StringRef> Split = StringRef(MarchLowerCase).split("+");
 
   llvm::AArch64::ArchKind ArchKind = llvm::AArch64::parseArch(Split.first);
-  if (ArchKind == llvm::AArch64::ArchKind::INVALID ||
-  !llvm::AArch64::getArchFeatures(ArchKind, Features) ||
-  (Split.second.size() &&
-   !DecodeAArch64Features(D, Split.second, Features, ArchKind)))
+
+  if (!llvm::AArch64::getArchFeatures(ArchKind, Features))
+return false;
+
+  if (ArchKind == llvm::AArch64::ArchKind::ARMV8_6A) {
+Features.push_back("+i8mm");
+Features.push_back("+bf16");
+  }
+
+  if (!DecodeAArch64Features(D, Split.second, Features, ArchKind))
 return false;
 
   return true;




[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-07 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 276204.
fpetrogalli added a comment.

Addressed code review, moving the code and adding more testing, including the 
`v8.5` one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,60 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.5-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_5 %s
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_5: __ARM_FEATURE_SVE 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MM %s
+// CHECK-SVE-8_6-NOI8MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16 %s
+// CHECK-SVE-8_6-NOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOF32MM %s
+// CHECK-SVE-8_6-NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16 %s
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE_BF16 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM: __ARM_FEATURE_SVE 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -365,6 +365,15 @@
 }
   }
 
+  if (llvm::is_contained(Features, "+v8.6a")) {
+    if (!llvm::is_contained(Features, "-i8mm") &&
+        !llvm::is_contained(Features, "+noi8mm"))
+      Features.push_back("+i8mm");
+    if (!llvm::is_contained(Features, "-bf16") &&
+        !llvm::is_contained(Features, "+nobf16"))
+      Features.push_back("+bf16");
+  }

[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-08 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked 2 inline comments as done.
fpetrogalli added inline comments.



Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:369
+  if (llvm::is_contained(Features, "+v8.6a")) {
+if (!llvm::is_contained(Features, "-i8mm") &&
+!llvm::is_contained(Features, "+noi8mm"))

sdesmalen wrote:
> Is this correct and/or necessary? I would expect LLVM to just handle features 
> in the order they're passed, and the architecture version is always processed 
> first, e.g. `-march=armv8.6-a+noi8mm` will always first process `armv8.6a` 
> before processing features like `noi8mm`.
I was expecting that too, but in this place the `+i8mm` is added after whatever
the user has passed to `-march`, which means that without this extra check the
user input `-march=armv8.6a+sve+noi8mm` becomes broken, because we are adding
`-target-feature=+i8mm` after `-i8mm`. This behavior is guarded by a regression
test that starts failing if I don't use these extra checks. The check was not
needed in the original place where I added the functionality, because there the
`+i8mm` was added right after `+v8.6a` and before splitting up `+sve+noi8mm`,
so that the user input was the one (un)setting the feature.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079





[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-08 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 276542.
fpetrogalli marked an inline comment as done.
fpetrogalli added a comment.

@sdesmalen, I have followed your suggestion to use insert instead of push_back!

Code is much nicer now, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,60 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.5-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_5 %s
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_5: __ARM_FEATURE_SVE 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MM %s
+// CHECK-SVE-8_6-NOI8MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16 %s
+// CHECK-SVE-8_6-NOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOF32MM %s
+// CHECK-SVE-8_6-NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16 %s
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE_BF16 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM: __ARM_FEATURE_SVE 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -365,6 +365,12 @@
 }
   }
 
+  auto V8_6Pos = llvm::find(Features, "+v8.6a");
+  if (V8_6Pos != std::end(Features)) {
+    V8_6Pos = Features.insert(std::next(V8_6Pos), "+i8mm");
+    V8_6Pos = Features.insert(V8_6Pos, "+bf16");
+  }
+
  

[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-13 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 277445.
fpetrogalli added a comment.

Update the `insert` invocation to use initializer list instead of calling it 
twice.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,60 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.5-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_5 %s
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_5: __ARM_FEATURE_SVE 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MM %s
+// CHECK-SVE-8_6-NOI8MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MM: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16 %s
+// CHECK-SVE-8_6-NOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOF32MM %s
+// CHECK-SVE-8_6-NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16 -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16 %s
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOBF16: __ARM_FEATURE_SVE_MATMUL_FP32 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOI8MMNOF32MM: __ARM_FEATURE_SVE_BF16 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6-NOBF16NOF32MM: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM %s
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOI8MMNOBF16NOF32MM: __ARM_FEATURE_SVE 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -365,6 +365,10 @@
 }
   }
 
+  auto V8_6Pos = llvm::find(Features, "+v8.6a");
+  if (V8_6Pos != std::end(Features))
+    V8_6Pos = Features.insert(std::next(V8_6Pos), {"+i8mm", "+bf16"});
+
   if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access,
                                options::OPT_munaligned_access))

[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-14 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 277879.
fpetrogalli added a comment.

Removed extra tests that are not needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,24 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.5-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_5 %s
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_5: __ARM_FEATURE_SVE 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOFEATURES %s
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOFEATURES: __ARM_FEATURE_SVE 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -365,6 +365,10 @@
 }
   }
 
+  auto V8_6Pos = llvm::find(Features, "+v8.6a");
+  if (V8_6Pos != std::end(Features))
+    V8_6Pos = Features.insert(std::next(V8_6Pos), {"+i8mm", "+bf16"});
+
   if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access,
                                options::OPT_munaligned_access))
     if (A->getOption().matches(options::OPT_mno_unaligned_access))


[PATCH] D83079: [clang][aarch64] Generate preprocessor macros for -march=armv8.6a+sve.

2020-07-14 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG438e95e95bfc: [clang][aarch64] Generate preprocessor macros 
for -march=armv8.6a+sve. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83079/new/

https://reviews.llvm.org/D83079

Files:
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Preprocessor/aarch64-target-features.c


Index: clang/test/Preprocessor/aarch64-target-features.c
===
--- clang/test/Preprocessor/aarch64-target-features.c
+++ clang/test/Preprocessor/aarch64-target-features.c
@@ -112,6 +112,24 @@
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE 1
 // CHECK-SVE-F64MM: __ARM_FEATURE_SVE_MATMUL_FP64 1
 
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.5-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_5 %s
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_5-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_5: __ARM_FEATURE_SVE 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6 %s
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6: __ARM_FEATURE_SVE_MATMUL_INT8 1
+
+// RUN: %clang -target aarch64-none-linux-gnu -march=armv8.6-a+sve+noi8mm+nobf16+nof32mm -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE-8_6-NOFEATURES %s
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_BF16 1
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_MATMUL_FP32 1
+// CHECK-SVE-8_6-NOFEATURES-NOT: __ARM_FEATURE_SVE_MATMUL_INT8 1
+// CHECK-SVE-8_6-NOFEATURES: __ARM_FEATURE_SVE 1
+
 // The following tests may need to be revised in the future since
 // SVE2 is currently still part of Future Architecture Technologies
 // (https://developer.arm.com/docs/ddi0602/latest)
Index: clang/lib/Driver/ToolChains/Arch/AArch64.cpp
===
--- clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -365,6 +365,10 @@
 }
   }
 
+  auto V8_6Pos = llvm::find(Features, "+v8.6a");
+  if (V8_6Pos != std::end(Features))
+    V8_6Pos = Features.insert(std::next(V8_6Pos), {"+i8mm", "+bf16"});
+
   if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access,
                                options::OPT_munaligned_access))
     if (A->getOption().matches(options::OPT_mno_unaligned_access))

[PATCH] D81304: [llvm][SveEmitter] Emit the bfloat version of `svld1ro`.

2020-06-18 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3e59dfc30124: [llvm][SveEmitter] Emit the bfloat version of 
`svld1ro`. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81304/new/

https://reviews.llvm.org/D81304

Files:
  clang/include/clang/Basic/AArch64SVEACLETypes.def
  clang/include/clang/Basic/arm_sve.td
  clang/lib/AST/ASTContext.cpp
  clang/lib/AST/ItaniumMangle.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/CodeGen/CodeGenTypes.cpp
  clang/test/AST/ast-dump-aarch64-sve-types.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c
  clang/unittests/AST/ASTImporterTest.cpp
  clang/unittests/AST/SizelessTypesTest.cpp
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -65,7 +65,7 @@
 
 class SVEType {
   TypeSpec TS;
-  bool Float, Signed, Immediate, Void, Constant, Pointer;
+  bool Float, Signed, Immediate, Void, Constant, Pointer, BFloat;
   bool DefaultType, IsScalable, Predicate, PredicatePattern, PrefetchOp;
   unsigned Bitwidth, ElementBitwidth, NumVectors;
 
@@ -74,9 +74,9 @@
 
   SVEType(TypeSpec TS, char CharMod)
       : TS(TS), Float(false), Signed(true), Immediate(false), Void(false),
-        Constant(false), Pointer(false), DefaultType(false), IsScalable(true),
-        Predicate(false), PredicatePattern(false), PrefetchOp(false),
-        Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
+        Constant(false), Pointer(false), BFloat(false), DefaultType(false),
+        IsScalable(true), Predicate(false), PredicatePattern(false),
+        PrefetchOp(false), Bitwidth(128), ElementBitwidth(~0U), NumVectors(1) {
     if (!TS.empty())
       applyTypespec();
     applyModifier(CharMod);
@@ -93,9 +93,11 @@
   bool isVoid() const { return Void & !Pointer; }
   bool isDefault() const { return DefaultType; }
   bool isFloat() const { return Float; }
-  bool isInteger() const { return !Float && !Predicate; }
+  bool isBFloat() const { return BFloat; }
+  bool isFloatingPoint() const { return Float || BFloat; }
+  bool isInteger() const { return !isFloatingPoint() && !Predicate; }
   bool isScalarPredicate() const {
-    return !Float && Predicate && NumVectors == 0;
+    return !isFloatingPoint() && Predicate && NumVectors == 0;
   }
   bool isPredicateVector() const { return Predicate; }
   bool isPredicatePattern() const { return PredicatePattern; }
@@ -362,7 +364,7 @@
 
   if (isVoidPointer())
 S += "v";
-  else if (!Float)
+  else if (!isFloatingPoint())
 switch (ElementBitwidth) {
 case 1: S += "b"; break;
 case 8: S += "c"; break;
@@ -372,15 +374,19 @@
 case 128: S += "LLLi"; break;
 default: llvm_unreachable("Unhandled case!");
 }
-  else
+  else if (isFloat())
 switch (ElementBitwidth) {
 case 16: S += "h"; break;
 case 32: S += "f"; break;
 case 64: S += "d"; break;
 default: llvm_unreachable("Unhandled case!");
 }
+  else if (isBFloat()) {
+assert(ElementBitwidth == 16 && "Not a valid BFloat.");
+S += "y";
+  }
 
-  if (!isFloat()) {
+  if (!isFloatingPoint()) {
 if ((isChar() || isPointer()) && !isVoidPointer()) {
   // Make chars and typed pointers explicitly signed.
   if (Signed)
@@ -421,13 +427,15 @@
   else {
 if (isScalableVector())
   S += "sv";
-if (!Signed && !Float)
+if (!Signed && !isFloatingPoint())
   S += "u";
 
 if (Float)
   S += "float";
 else if (isScalarPredicate() || isPredicateVector())
   S += "bool";
+else if (isBFloat())
+  S += "bfloat";
 else
   S += "int";
 
@@ -481,6 +489,10 @@
   Float = true;
   ElementBitwidth = 64;
   break;
+case 'b':
+  BFloat = true;
+  ElementBitwidth = 16;
+  break;
 default:
   llvm_unreachable("Unhandled type code!");
 }
@@ -534,6 +546,7 @@
   case 'P':
 Signed = true;
 Float = false;
+BFloat = false;
 Predicate = true;
 Bitwidth = 16;
 ElementBitwidth = 1;
@@ -784,7 +797,6 @@
   BaseTypeSpec(BT), Class(Class), Guard(Guard.str()),
   MergeSuffix(MergeSuffix.str()), BaseType(BT, 'd'), Flags(Flags),
   ImmChecks(Checks.begin(), Checks.end()) {
-
   // Types[0] is the return value.
   for (unsigned I = 0; I < Proto.size(); ++I) {
 SVEType T(BaseTypeSpec, Proto[I]);
@@ -848,6 +860,8 @@
   TypeCode = T.isSigned() ? 's' : 'u';
 else if (T.isPredicateVector())
   TypeCode = 'b';
+else if (T.isBFloat())
+  TypeCode = "bf";
 else
   TypeCode = 'f';
 Ret.replace(Pos, NumChars, TypeCode + utostr(T.getElementSizeInBits()));
@@ -923,6 +937,11 @@
 }
   }
 
+  if (T.isBFloat()) {
+assert(T.getElementSizeInBits() == 16 && "Not a valid BFloat.");
+return encodeEltType("EltTyBFloat16");
+ 

[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

2020-06-18 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, ctetreau, efriedma, david-arm.
Herald added subscribers: llvm-commits, cfe-commits, psnobl, rkruppe, 
hiraditya, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added projects: clang, LLVM.

List of intrinsics:

svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)

svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)
svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3)
svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index)

svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op)
svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op)

svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op)
svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op)

For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4".


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82141

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
@@ -0,0 +1,243 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 -asm-verbose=0 < %s | FileCheck %s
+
+;
+; BFDOT
+;
+
+define  @bfdot_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.nxv4f32( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfdot_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_0_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane.nxv4f32( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfdot_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_1_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane.nxv4f32( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfdot_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_2_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane.nxv4f32( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfdot_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_3_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane.nxv4f32( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+;
+; BFMLALB
+;
+
+define  @bfmlalb_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.nxv4f32( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalb_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_0_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane.nxv4f32( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfmlalb_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_1_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane.nxv4f32( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfmlalb_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_2_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane.nxv4f32( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfmlalb_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_3_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.

[PATCH] D82178: [AArch64][SVE] Guard svbfloat16_t with feature macro in ACLE

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/utils/TableGen/SveEmitter.cpp:1092
+  OS << "#if !defined(__ARM_FEATURE_BF16_SCALAR_ARITHMETIC)\n";
+  OS << "#error \"__ARM_FEATURE_BF16_SCALAR_ARITHMETIC must be defined when "
+"__ARM_FEATURE_SVE_BF16 is defined\"\n";

Does it make sense to add a regression test to make sure this error is raised?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82178/new/

https://reviews.llvm.org/D82178



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82187: [AArch64][SVE] ACLE: Add bfloat16 to struct load/stores.

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/include/clang/Basic/AArch64SVEACLETypes.def:69
 
-SVE_VECTOR_TYPE("__SVBFloat16_t", "__SVBFloat16_t", SveBFloat16, 
SveBFloat16Ty, 8, 16, false, false, true)
+SVE_VECTOR_TYPE("__SVBFloat16_t", "__SVBFloat16_t", SveBFloat16, 
SveBFloat16Ty, 8, 16, true, false, true)
 

Why did you have to set `IsFP = true`? Seems like an unrelated change?



Comment at: clang/utils/TableGen/SveEmitter.cpp:541
 Float = false;
+BFloat = false;
 ElementBitwidth /= 4;

Are these needed? I don't understand the rule for when we need to be explicit about the values of these variables.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82187/new/

https://reviews.llvm.org/D82187





[PATCH] D82186: [AArch64][SVE] Add bfloat16 support to svlen intrinsic

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_len.c:2
 // REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -triple aarch64-none-linux-gnu 
-target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall 
-emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -DSVE_OVERLOADED_FORMS -triple 
aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns 
-S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -triple aarch64-none-linux-gnu 
-target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -o 
- %s >/dev/null 2>%t
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-D__ARM_FEATURE_SVE_BF16 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall 
-emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-D__ARM_FEATURE_SVE_BF16 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns 
-S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s

We said `__ARM_FEATURE_BF16_SCALAR_ARITHMETIC` is implied by 
`__ARM_FEATURE_SVE_BF16`, so I think you should remove this macro definition.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82186/new/

https://reviews.llvm.org/D82186





[PATCH] D82182: [AArch64][SVE] Add bfloat16 support to perm and select intrinsics

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/include/clang/Basic/arm_sve.td:1115
 
+let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
+def SVREV_BF16: SInst<"svrev[_{d}]","dd",   "b", MergeNone, 
"aarch64_sve_rev">;

nit: could create a multiclass here like @sdesmalen has done in 
https://reviews.llvm.org/D82187; it seems quite a nice way to keep the definitions 
of the intrinsics together (look for `multiclass StructLoad`, for example)



Comment at: clang/include/clang/Basic/arm_sve.td:1298
 
+let ArchGuard = "defined(__ARM_FEATURE_SVE_MATMUL_FP64) && 
defined(__ARM_FEATURE_SVE_BF16)" in {
+def SVTRN1Q_BF16  : SInst<"svtrn1q[_{d}]", "ddd",  "b", MergeNone, 
"aarch64_sve_trn1q">;

Same here, could use a multiclass to merge the "regular" intrinsic definitions 
with the BF16 ones.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82182/new/

https://reviews.llvm.org/D82182





[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1811
 
+def int_aarch64_sve_cvt_bf16f32 : Builtin_SVCVT<"svcvt_bf16_f32_m",   
llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>;
+def int_aarch64_sve_cvtnt_bf16f32   : Builtin_SVCVT<"svcvtnt_bf16_f32_m", 
llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>;

sdesmalen wrote:
> nit:  use `fcvtbf` instead of `cvt` => `int_aarch64_sve_fcvtbf_bf16f32` ?
Renamed to `int_aarch64_sve_fcvt_bf16f32` and `int_aarch64_sve_fcvtnt_bf16f32` 
respectively, because I think it wouldn't make sense to add the `bf` suffix to 
the `cvtnt` version of the intrinsic.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82141/new/

https://reviews.llvm.org/D82141





[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

2020-06-19 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 272170.
fpetrogalli marked 7 inline comments as done.
fpetrogalli added a comment.

Thank you for the review @sdesmalen!

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82141/new/

https://reviews.llvm.org/D82141

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
@@ -0,0 +1,243 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 -asm-verbose=0 < %s | FileCheck %s
+
+;
+; BFDOT
+;
+
+define  @bfdot_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfdot_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_0_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfdot_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_1_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfdot_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_2_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfdot_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_3_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+;
+; BFMLALB
+;
+
+define  @bfmlalb_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalb_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_0_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfmlalb_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_1_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfmlalb_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_2_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfmlalb_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_3_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+define  @bfmlalb_lane_4_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_4_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[4]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 4)
+  ret  %out
+}
+
+define  @bfmlalb_lane_5_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_5_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[5]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 5)
+  ret  %out
+}
+
+define  @bfmlalb_lane_6_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_6_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[6]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 6)
+  ret  %out
+}
+
+define  @bfmlalb_lane_7_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_7_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[7]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 7)
+  ret  %out
+}
+
+;
+; BFMLALT
+;
+
+define  @bfmlalt_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalt( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalt_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_lane_0_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = cal

[PATCH] D82178: [AArch64][SVE] Guard svbfloat16_t with feature macro in ACLE

2020-06-22 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM! Thanks.




Comment at: clang/utils/TableGen/SveEmitter.cpp:1092
+  OS << "#if !defined(__ARM_FEATURE_BF16_SCALAR_ARITHMETIC)\n";
+  OS << "#error \"__ARM_FEATURE_BF16_SCALAR_ARITHMETIC must be defined when "
+"__ARM_FEATURE_SVE_BF16 is defined\"\n";

fpetrogalli wrote:
> Does it make sense to add a regression test to make sure this error is raised?
As discussed, this is not needed. The ACLE tests will catch this once we have 
set up the correct macros in the clang driver.



Comment at: clang/utils/TableGen/SveEmitter.cpp:1101
+  OS << "#if defined(__ARM_FEATURE_BF16_SCALAR_ARITHMETIC)\n";
+  OS << "#include <arm_bf16.h>\n";
   OS << "typedef __bf16 bfloat16_t;\n";

nit: maybe add a comment specifying that this include is required by the ACLE 
specs.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82178/new/

https://reviews.llvm.org/D82178





[PATCH] D82298: [AArch64][SVE] Add bfloat16 support to load intrinsics

2020-06-22 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli requested changes to this revision.
fpetrogalli added inline comments.
This revision now requires changes to proceed.



Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldnf1.c:2-4
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-D__ARM_FEATURE_SVE_BF16 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall 
-emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-D__ARM_FEATURE_SVE_BF16 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +bf16 -fallow-half-arguments-and-returns 
-S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-D__ARM_FEATURE_SVE_BF16 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +bf16 -fallow-half-arguments-and-returns -S -O1 -Werror -Wall 
-o - %s >/dev/null 2>%t

With @sdesmalen we were thinking that it may be better to duplicate the run 
lines so that the BF16 intrinsics are tested separately:

```
 RUN: %clang_cc1 -D__ARM_FEATURE_SVE  ... -target-feature +sve ...
 RUN: %clang_cc1 -DENABLE_BF16_TEST -D__ARM_FEATURE_SVE 
-D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -D__ARM_FEATURE_SVE_BF16 ... 
-target-feature +sve -target-feature +bf16 ... 
```

and wrap the BF16 tests in `#ifdef ENABLE_BF16_TEST ... #endif`.

this will make sure that the non-BF16 tests are not erroneously associated with 
the BF16 flags.

Please apply these to all the run lines involving BF16 modified in this patch.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82298/new/

https://reviews.llvm.org/D82298





[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

2020-06-22 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 272480.
fpetrogalli added a comment.

Formatting changes. NFC.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82141/new/

https://reviews.llvm.org/D82141

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
@@ -0,0 +1,243 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 -asm-verbose=0 < %s | FileCheck %s
+
+;
+; BFDOT
+;
+
+define  @bfdot_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfdot_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_0_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfdot_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_1_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfdot_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_2_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfdot_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_3_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+;
+; BFMLALB
+;
+
+define  @bfmlalb_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalb_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_0_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfmlalb_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_1_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfmlalb_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_2_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfmlalb_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_3_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+define  @bfmlalb_lane_4_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_4_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[4]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 4)
+  ret  %out
+}
+
+define  @bfmlalb_lane_5_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_5_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[5]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 5)
+  ret  %out
+}
+
+define  @bfmlalb_lane_6_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_6_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[6]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 6)
+  ret  %out
+}
+
+define  @bfmlalb_lane_7_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_7_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[7]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 7)
+  ret  %out
+}
+
+;
+; BFMLALT
+;
+
+define  @bfmlalt_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalt( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalt_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_lane_0_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalt.lane( %a,  %b,  %c, i64 0)
+  ret  %out

[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

2020-06-22 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGef597eda8efc: [sve][acle] Add SVE BFloat16 extensions. 
(authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82141/new/

https://reviews.llvm.org/D82141

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-bfloat.ll
@@ -0,0 +1,243 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 -asm-verbose=0 < %s | FileCheck %s
+
+;
+; BFDOT
+;
+
+define  @bfdot_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfdot_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_0_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfdot_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_1_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfdot_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_2_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfdot_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfdot_lane_3_f32:
+; CHECK-NEXT:  bfdot z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfdot.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+;
+; BFMLALB
+;
+
+define  @bfmlalb_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalb_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_0_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 0)
+  ret  %out
+}
+
+define  @bfmlalb_lane_1_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_1_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[1]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 1)
+  ret  %out
+}
+
+define  @bfmlalb_lane_2_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_2_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[2]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 2)
+  ret  %out
+}
+
+define  @bfmlalb_lane_3_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_3_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[3]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 3)
+  ret  %out
+}
+
+define  @bfmlalb_lane_4_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_4_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[4]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 4)
+  ret  %out
+}
+
+define  @bfmlalb_lane_5_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_5_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[5]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 5)
+  ret  %out
+}
+
+define  @bfmlalb_lane_6_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_6_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[6]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 6)
+  ret  %out
+}
+
+define  @bfmlalb_lane_7_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalb_lane_7_f32:
+; CHECK-NEXT:  bfmlalb z0.s, z1.h, z2.h[7]
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalb.lane( %a,  %b,  %c, i64 7)
+  ret  %out
+}
+
+;
+; BFMLALT
+;
+
+define  @bfmlalt_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h
+; CHECK-NEXT:  ret
+  %out = call  @llvm.aarch64.sve.bfmlalt( %a,  %b,  %c)
+  ret  %out
+}
+
+define  @bfmlalt_lane_0_f32( %a,  %b,  %c) nounwind {
+; CHECK-LABEL: bfmlalt_lane_0_f32:
+; CHECK-NEXT:  bfmlalt z0.s, z1.h, z2.h[0]
+; CHECK-NEXT:  ret
+  %out = ca

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-22 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau.
Herald added subscribers: llvm-commits, cfe-commits, psnobl, rkruppe, 
hiraditya, tschuett.
Herald added a reviewer: efriedma.
Herald added projects: clang, LLVM.

The following intrinsics have been extended to support brain float types:

svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op)

svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op)
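To illustrate the conditional-extract semantics, here is a rough scalar model of svclastb on plain Python lists; the helper name and list-based predicate are illustrative assumptions, and the real intrinsic operates on scalable vectors.

```python
def svclastb(pg, fallback, data):
    # Rough model of svclastb[_n_bf16]: return the element of `data` at
    # the last active predicate lane; with no active lane, return
    # `fallback`.  (svclasta is analogous but selects the element
    # *after* the last active one.)
    last = fallback
    for active, x in zip(pg, data):
        if active:
            last = x
    return last
```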

svbfloat16_t svdup[_n]_bf16(bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op)

svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7)
svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index)

svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2)
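The duplication and insert semantics can be sketched in Python as follows; the fixed quadword size of eight bf16 elements and the list-based representation are illustrative assumptions (SVE vectors are scalable).

```python
def svdupq_lane(data, index, quad=8):
    # Model of svdupq_lane[_bf16]: broadcast the 128-bit quadword at
    # `index` (eight bf16 elements) across the whole vector.
    q = data[index * quad:(index + 1) * quad]
    return q * (len(data) // quad)

def svinsr(vec, scalar):
    # Model of svinsr[_n_bf16]: shift the vector up by one element and
    # insert `scalar` at element 0.
    return [scalar] + vec[:-1]
```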


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-bfloat.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -165,6 +165,14 @@
   ret  %out
 }
 
+define  @insr_bf16( %a, bfloat %b) {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.insr.nxv8bf16( %a, bfloat %b)
+  ret  %out
+}
+
 define  @insr_f32( %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare  @llvm.aarch64.sve.insr.nxv4i32(, i32)
 declare  @llvm.aarch64.sve.insr.nxv2i64(, i64)
 declare  @llvm.aarch64.sve.insr.nxv8f16(, half)
+declare  @llvm.aarch64.sve.insr.nxv8bf16(, bfloat)
 declare  @llvm.aarch64.sve.insr.nxv4f32(, float)
 declare  @llvm.aarch64.sve.insr.nxv2f64(, double)
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+ %pg,
+bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -82,5 +92,6 @@
 declare  @llvm.aarch64.sve.dup.nxv4i32(, , i32)
 declare  @llvm.aarch64.sve.dup.nxv2i64(, , i64)
 declare  @llvm.aarch64.sve.dup.nxv8f16(, , half)
+declare  @llvm.aarch64.sve.dup.nxv8bf16(, , bfloat)
 declare  @llvm.aarch64.sve.dup.nxv4f32(, , float)
 declare  @llvm.aarch64.sve.dup.nxv2f64(, , double)
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret  %out
 }
 

[PATCH] D82298: [AArch64][SVE] Add bfloat16 support to load intrinsics

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82298/new/

https://reviews.llvm.org/D82298





[PATCH] D82182: [AArch64][SVE] Add bfloat16 support to perm and select intrinsics

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/include/clang/Basic/arm_sve.td:1115
 
+let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
+def SVREV_BF16: SInst<"svrev[_{d}]","dd",   "b", MergeNone, 
"aarch64_sve_rev">;

c-rhodes wrote:
> c-rhodes wrote:
> > fpetrogalli wrote:
> > > nit: could create a multiclass here like @sdesmalen has done in 
> > > https://reviews.llvm.org/D82187; it seems quite a nice way to keep the 
> > > definitions of the intrinsics together (look for `multiclass StructLoad`, 
> > > for example)
> > it might be a bit tedious having separate multiclasses, what do you think 
> > about:
> > ```
> > multiclass SInstBF16<string n, string p, string t, MergeType mt, string i = "",
> >                      list<FlagType> ft = [], list<ImmCheck> ch = []> {
> >   def : SInst<n, p, t, mt, i, ft, ch>;
> >   let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
> >     def : SInst<n, p, "b", mt, i, ft, ch>;
> >   }
> > }
> > 
> > defm SVREV: SInstBF16<"svrev[_{d}]","dd",   "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_rev">;
> > defm SVSEL: SInstBF16<"svsel[_{d}]","dPdd", "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_sel">;
> > defm SVSPLICE : SInstBF16<"svsplice[_{d}]", "dPdd", "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_splice">;
> > defm SVTRN1   : SInstBF16<"svtrn1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_trn1">;
> > defm SVTRN2   : SInstBF16<"svtrn2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_trn2">;
> > defm SVUZP1   : SInstBF16<"svuzp1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_uzp1">;
> > defm SVUZP2   : SInstBF16<"svuzp2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_uzp2">;
> > defm SVZIP1   : SInstBF16<"svzip1[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_zip1">;
> > defm SVZIP2   : SInstBF16<"svzip2[_{d}]",   "ddd",  "csilUcUsUiUlhfd", 
> > MergeNone, "aarch64_sve_zip2">;```
> > 
> > ?
> I've played around with this and it works great for instructions guarded on a 
> single feature flag but falls apart for the .Q forms that also require 
> `__ARM_FEATURE_SVE_MATMUL_FP64`. I suspect there's a nice way of handling it 
> in tablegen by passing the features as a list of strings and joining them but 
> I spent long enough trying to get that to work so I'm going to keep it simple 
> for now.
> it might be a bit tedious having separate multiclasses, what do you think 
> about:

Sorry, I think I misunderstood you when we last discussed this. I didn't mean a 
multiclass that would work for ALL intrinsics that use both regular types and 
bfloats; I just meant to merge together those that use the same ArchGuard and 
that you are adding in this patch.

I think you could keep both macros in a single ArchGuard string:

```
multiclass SInstPerm<string n, string p, MergeType mt, string i> {
  def : SInst<n, p, "csilUcUsUiUlhfd", mt, i>;
  let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
    def : SInst<n, p, "b", mt, i>;
  }
}

defm SVREV : SInstPerm<"svrev[_{d}]", "dd", MergeNone, "aarch64_sve_rev">;
...

multiclass SInstPermMatmul<string n, string p, MergeType mt, string i> {
  def : SInst<n, p, "csilUcUsUiUlhfd", mt, i>;
  let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16) && defined(__ARM_FEATURE_SVE_MATMUL_FP64)" in {
    def : SInst<n, p, "b", mt, i>;
  }
}

def SVTRN1Q : SInstPermMatmul ...
...
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82182/new/

https://reviews.llvm.org/D82182



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 272828.
fpetrogalli added a comment.

Updates:

1. Extracted bfloat C tests into separate files (`*-bfloat.c`).
2. Added missing tests (`clast[a|b]`, `last[a|b]`).
3. Tested that a warning is raised for the missing declaration when the macro 
`__ARM_FEATURE_SVE_BF16` is not present.
4. Cosmetic changes to formatting.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-bfloat.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -165,6 +165,14 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @insr_bf16(<vscale x 8 x bfloat> %a, bfloat %b) {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.insr.nxv8bf16(<vscale x 8 x bfloat> %a, bfloat %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @insr_f32(<vscale x 4 x float> %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.insr.nxv4i32(<vscale x 4 x i32>, i32)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.insr.nxv2i64(<vscale x 2 x i64>, i64)
 declare <vscale x 8 x half> @llvm.aarch64.sve.insr.nxv8f16(<vscale x 8 x half>, half)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.insr.nxv8bf16(<vscale x 8 x bfloat>, bfloat)
 declare <vscale x 4 x float> @llvm.aarch64.sve.insr.nxv4f32(<vscale x 4 x float>, float)
 declare <vscale x 2 x double> @llvm.aarch64.sve.insr.nxv2f64(<vscale x 2 x double>, double)
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+ %pg,
+bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -82,5 +92,6 @@
 declare  @llvm.aarch64.sve.dup.nxv4i32(, , i32)
 declare  @llvm.aarch64.sve.dup.nxv2i64(, , i64)
 declare  @llvm.aarch64.sve.dup.nxv8f16(, , half)
+declare  @llvm.aarch64.sve.dup.nxv8bf16(, , bfloat)
 declare  @llvm.aarch64.sve.dup.nxv4f32(, , float)
 declare  @llvm.aarch64.sve.dup.nxv2f64(, , double)
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
 
 ; WARN-NOT: warning
@@ -57,6 +57,16 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @clasta_bf16(<vscale x 8 x i1> %pg, <vscale x 8 x bfloat> %a, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: clasta_bf16:
+; CHECK: clasta z0.h, p0, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.clasta.nxv8bf16(<vscale x 8 x i1> %pg,
+                                                                      <vscale x 8 x bfloat> %a,
+                                                                      <vscale x 8 x bfloat> %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @clasta_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
 ; CHECK-LABEL: clasta_f32:
 ; CHECK: clasta z0.s, p0, z0.s, z1.s
@@ -131,6 +141,16 @@
   ret half %out
 }
 
+define bfloat @clasta_n_bf16(<vscale x 8 x i1> %pg, bfloat %a, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: clasta_n_bf16:
+; CHECK: clasta h0, p0, h0, 

[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-23 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, 
ctetreau.
Herald added subscribers: llvm-commits, cfe-commits, psnobl, rkruppe, 
hiraditya, tschuett.
Herald added projects: clang, LLVM.

The following intrinsics have been added:

svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op)
svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op)

svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices)

svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices)

svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t 
indices)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82429

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
@@ -122,6 +122,16 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @ftbx_h_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x bfloat> %b, <vscale x 8 x i16> %c) {
+; CHECK-LABEL: ftbx_h_bf16:
+; CHECK: tbx z0.h, z1.h, z2.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.tbx.nxv8bf16(<vscale x 8 x bfloat> %a,
+                                                                   <vscale x 8 x bfloat> %b,
+                                                                   <vscale x 8 x i16> %c)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x i32> @tbx_s(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
 ; CHECK-LABEL: tbx_s:
 ; CHECK: tbx z0.s, z1.s, z2.s
@@ -179,3 +189,5 @@
 declare <vscale x 8 x half> @llvm.aarch64.sve.tbx.nxv8f16(<vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x i16>)
 declare <vscale x 4 x float> @llvm.aarch64.sve.tbx.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x i32>)
 declare <vscale x 2 x double> @llvm.aarch64.sve.tbx.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>, <vscale x 2 x i64>)
+
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.tbx.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x bfloat>, <vscale x 8 x i16>)
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1009,6 +1009,15 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @tbl_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x i16> %b) {
+; CHECK-LABEL: tbl_bf16:
+; CHECK: tbl z0.h, { z0.h }, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.tbl.nxv8bf16(<vscale x 8 x bfloat> %a,
+                                                                   <vscale x 8 x i16> %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @tbl_f32(<vscale x 4 x float> %a, <vscale x 4 x i32> %b) {
 ; CHECK-LABEL: tbl_f32:
 ; CHECK: tbl z0.s, { z0.s }, z1.s
@@ -1859,6 +1868,7 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.tbl.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.tbl.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i64>)
 declare <vscale x 8 x half> @llvm.aarch64.sve.tbl.nxv8f16(<vscale x 8 x half>, <vscale x 8 x i16>)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.tbl.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x i16>)
 declare <vscale x 4 x float> @llvm.aarch64.sve.tbl.nxv4f32(<vscale x 4 x float>, <vscale x 4 x i32>)
 declare <vscale x 2 x double> @llvm.aarch64.sve.tbl.nxv2f64(<vscale x 2 x double>, <vscale x 2 x i64>)
 
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
@@ -145,6 +145,16 @@
   ret <vscale x 8 x i16> %out
 }
 
+define <vscale x 8 x i16> @cnt_bf16(<vscale x 8 x i16> %a, <vscale x 8 x i1> %pg, <vscale x 8 x bfloat> %b) {
+; CHECK-LABEL: cnt_bf16:
+; CHECK: cnt z0.h, p0/m, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8bf16(<vscale x 8 x i16> %a,
+                                                                <vscale x 8 x i1> %pg,
+                                                                <vscale x 8 x bfloat> %b)
+  ret <vscale x 8 x i16> %out
+}
+
 define <vscale x 4 x i32> @cnt_f32(<vscale x 4 x i32> %a, <vscale x 4 x i1> %pg, <vscale x 4 x float> %b) {
 ; CHECK-LABEL: cnt_f32:
 ; CHECK: cnt z0.s, p0/m, z1.s
@@ -180,5 +190,6 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.cnt.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i1>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.cnt.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x i64>)
 declare <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8f16(<vscale x 8 x i16>, <vscale x 8 x i1>, <vscale x 8 x half>)
+declare <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8bf16(<vscale x 8 x i16>, <vscale x 8 x i1>, <vscale x 8 x bfloat>)
 declare <vscale x 4 x i32> @llvm.aarch64.sve.cnt.nxv4f32(<vscale x 4 x i32>, <vscale x 4 x i1>, <vscale x 4 x float>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.cnt.nxv2f64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x double>)
Index: llvm/lib/Target/AArch64/SVEInstrFormats.td
===
--- llvm/lib/Target/AArch64/SVEInstrFormats.td
+++ llvm/lib/Target/AArch64/SVEInstrFormats.td
@@ -1020,6 +1020,8 @@
   def : SVE_2_Op_Pat<nxv8f16, op, nxv8f16, nxv8i16, !cast<Instruction>(NAME # _H)>;
   def : SVE_2_Op_Pat<nxv4f32, op, nxv4f32, nxv4i32, !cast<Instruction>(NAME # _S)>;
   def : SVE_2_Op_Pat<nxv2f64, op, nxv2f64, nxv2i64, !cast<Instruction>(NAME # _D)>;
+
+  def : SVE_2_Op_Pat<nxv8bf16, op, nxv8bf16, nxv8i16, !cast<Instruction>(NAME # _H)>;
 }
 
 multiclass sve2_int_perm_tbl {
@@ -1053,6 +1055,11 @@
 nxv8f16:$Op2, zsub1),
  nxv8i16:$Op3))>;
 
+  def : Pat<(nxv8bf16 (op nxv8bf16:$Op1, nxv8bf16:$Op2, nxv8i16:$Op3)),
+            (nxv8bf16 (!cast<Instruction>(NAME # _H) (REG_SEQUENCE ZPR2, nxv8bf16:$Op1, zsub0,
+   

[PATCH] D82448: [AArch64][SVE] Add bfloat16 support to store intrinsics

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, just one nit.

Francesco




Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_st1-bfloat.c:4
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 
-fallow-half-arguments-and-returns -fsyntax-only -verify 
-verify-ignore-unexpected=error -verify-ignore-unexpected=note %s
+
+#include 

Nit: is it worth adding the `ASM-NOT: warning` check that is used in other 
tests? Of course, only if it doesn't fail, since in that case we would have to 
address the problem in a separate patch.

(Same for all the new C tests added in this patch).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82448/new/

https://reviews.llvm.org/D82448



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82391: [AArch64][SVE] Add bfloat16 support to svext intrinsic

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM! Thank you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82391/new/

https://reviews.llvm.org/D82391





[PATCH] D82448: [AArch64][SVE] Add bfloat16 support to store intrinsics

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli requested changes to this revision.
fpetrogalli added a comment.
This revision now requires changes to proceed.

Thank you for updating the patch with the missing tests. I only have one 
request for the code involving assertions, and the use of `let Predicates = 
...`.

Francesco




Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12131-12132
 
+  const bool hasBF16 =
+      static_cast<const AArch64Subtarget &>(DAG.getSubtarget()).hasBF16();
+

I think there is no need to set up a variable here, you can fold this directly 
in the assertion. Same for the similar change below.



Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1567-1569
+  defm : pred_store;
+  defm : pred_store;
+  defm : pred_store;

nit: I think this change is not necessary, but if you really want to do it, you 
should probably move the 4th column, not the second one. 



Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1571
+
+  let Predicates = [HasBF16] in {
+defm : pred_store;

Doesn't this also need `HasSVE`?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82448/new/

https://reviews.llvm.org/D82448





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: kmclaughlin, efriedma, ctetreau, sdesmalen, 
david-arm.
Herald added subscribers: llvm-commits, cfe-commits, psnobl, rkruppe, 
hiraditya, tschuett.
Herald added projects: clang, LLVM.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82501

Files:
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret-bfloat.c
  clang/utils/TableGen/SveEmitter.cpp
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll

Index: llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll
@@ -0,0 +1,119 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
+; RUN: not --crash llc -mtriple=aarch64_be -mattr=+sve,+bf16 < %s
+; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
+
+; WARN-NOT: warning
+
+define <vscale x 16 x i8> @bitcast_bfloat_to_i8(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 16 x i8>
+  ret <vscale x 16 x i8> %bc
+}
+
+define <vscale x 8 x i16> @bitcast_bfloat_to_i16(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i16:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 8 x i16>
+  ret <vscale x 8 x i16> %bc
+}
+
+define <vscale x 4 x i32> @bitcast_bfloat_to_i32(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 4 x i32>
+  ret <vscale x 4 x i32> %bc
+}
+
+define <vscale x 2 x i64> @bitcast_bfloat_to_i64(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 2 x i64>
+  ret <vscale x 2 x i64> %bc
+}
+
+define <vscale x 8 x half> @bitcast_bfloat_to_half(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_half:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 8 x half>
+  ret <vscale x 8 x half> %bc
+}
+
+define <vscale x 4 x float> @bitcast_bfloat_to_float(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_float:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 4 x float>
+  ret <vscale x 4 x float> %bc
+}
+
+define <vscale x 2 x double> @bitcast_bfloat_to_double(<vscale x 8 x bfloat> %v) {
+; CHECK-LABEL: bitcast_bfloat_to_double:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 2 x double>
+  ret <vscale x 2 x double> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i8_to_bfloat(<vscale x 16 x i8> %v) {
+; CHECK-LABEL: bitcast_i8_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 16 x i8> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i16_to_bfloat(<vscale x 8 x i16> %v) {
+; CHECK-LABEL: bitcast_i16_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x i16> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i32_to_bfloat(<vscale x 4 x i32> %v) {
+; CHECK-LABEL: bitcast_i32_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 4 x i32> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i64_to_bfloat(<vscale x 2 x i64> %v) {
+; CHECK-LABEL: bitcast_i64_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 2 x i64> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_half_to_bfloat(<vscale x 8 x half> %v) {
+; CHECK-LABEL: bitcast_half_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x half> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_float_to_bfloat(<vscale x 4 x float> %v) {
+; CHECK-LABEL: bitcast_float_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 4 x float> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_double_to_bfloat(<vscale x 2 x double> %v) {
+; CHECK-LABEL: bitcast_double_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 2 x double> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1435,7 +1435,6 @@
 
 def : Pat<(nxv8f16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8f16 ZPR:$src)>;
-def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
@@ -1456,6 +1455,24 @@
 def : Pat<(nxv2f64 (bitconvert (nxv4f32 ZPR:$src))), (nxv2f64 ZPR:$src)>;
   }
 
+  let Predicates = [IsLE, HasSVE, HasBF16] in {
+def : Pat<(nxv8bf16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2f64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+
+def : Pat<(nxv16i8 (bitconvert (nxv8bf16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv8i16 (bitconvert (

[PATCH] D82391: [AArch64][SVE] Add bfloat16 support to svext intrinsic

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli requested changes to this revision.
fpetrogalli added a comment.
This revision now requires changes to proceed.

Putting it on hold as we need to guard those patterns with `HasBF16`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82391/new/

https://reviews.llvm.org/D82391





[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked 11 inline comments as done.
fpetrogalli added inline comments.



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll:1
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
 ; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t

c-rhodes wrote:
> need to add `+bf16` to flags
Instead of passing it as a command line argument, I have added an attribute to 
the functions that test bfloat, to keep it specific only for those functions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429





[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked an inline comment as done.
fpetrogalli added inline comments.



Comment at: clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c:7
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 
-D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -triple aarch64-none-linux-gnu 
-target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns 
-fsyntax-only -verify -verify-ignore-unexpected=error 
-verify-ignore-unexpected=note %s
+// R UN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 
-D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -DSVE_OVERLOADED_FORMS -triple 
aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 
-fallow-half-arguments-and-returns -fsyntax-only -verify=overload-bf16 
-verify-ignore-unexpected=error -verify-ignore-unexpected=note %s
+

I could do with an extra pair of eyes here: I can't figure out why the warning 
raised by this run is not detected by the `overload-bf16-warning` below... 
(Same for the same line I have added in the test for tbx).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429





[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-24 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273225.
fpetrogalli marked an inline comment as done.
fpetrogalli added a comment.

Add predicate to the patterns in the backend.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
@@ -122,6 +122,16 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @ftbx_h_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x bfloat> %b, <vscale x 8 x i16> %c) #0 {
+; CHECK-LABEL: ftbx_h_bf16:
+; CHECK: tbx z0.h, z1.h, z2.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.tbx.nxv8bf16(<vscale x 8 x bfloat> %a,
+                                                                   <vscale x 8 x bfloat> %b,
+                                                                   <vscale x 8 x i16> %c)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x i32> @tbx_s(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
 ; CHECK-LABEL: tbx_s:
 ; CHECK: tbx z0.s, z1.s, z2.s
@@ -179,3 +189,8 @@
 declare <vscale x 8 x half> @llvm.aarch64.sve.tbx.nxv8f16(<vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x i16>)
 declare <vscale x 4 x float> @llvm.aarch64.sve.tbx.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x i32>)
 declare <vscale x 2 x double> @llvm.aarch64.sve.tbx.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>, <vscale x 2 x i64>)
+
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.tbx.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x bfloat>, <vscale x 8 x i16>)
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1027,6 +1027,15 @@
   ret <vscale x 8 x half> %out
 }
 
+define <vscale x 8 x bfloat> @tbl_bf16(<vscale x 8 x bfloat> %a, <vscale x 8 x i16> %b) #0 {
+; CHECK-LABEL: tbl_bf16:
+; CHECK: tbl z0.h, { z0.h }, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x bfloat> @llvm.aarch64.sve.tbl.nxv8bf16(<vscale x 8 x bfloat> %a,
+                                                                   <vscale x 8 x i16> %b)
+  ret <vscale x 8 x bfloat> %out
+}
+
 define <vscale x 4 x float> @tbl_f32(<vscale x 4 x float> %a, <vscale x 4 x i32> %b) {
 ; CHECK-LABEL: tbl_f32:
 ; CHECK: tbl z0.s, { z0.s }, z1.s
@@ -1933,6 +1942,7 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.tbl.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.tbl.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i64>)
 declare <vscale x 8 x half> @llvm.aarch64.sve.tbl.nxv8f16(<vscale x 8 x half>, <vscale x 8 x i16>)
+declare <vscale x 8 x bfloat> @llvm.aarch64.sve.tbl.nxv8bf16(<vscale x 8 x bfloat>, <vscale x 8 x i16>)
 declare <vscale x 4 x float> @llvm.aarch64.sve.tbl.nxv4f32(<vscale x 4 x float>, <vscale x 4 x i32>)
 declare <vscale x 2 x double> @llvm.aarch64.sve.tbl.nxv2f64(<vscale x 2 x double>, <vscale x 2 x i64>)
 
@@ -2027,3 +2037,6 @@
 declare <vscale x 8 x half> @llvm.aarch64.sve.zip2.nxv8f16(<vscale x 8 x half>, <vscale x 8 x half>)
 declare <vscale x 4 x float> @llvm.aarch64.sve.zip2.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>)
 declare <vscale x 2 x double> @llvm.aarch64.sve.zip2.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>)
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
@@ -145,6 +145,16 @@
   ret <vscale x 8 x i16> %out
 }
 
+define <vscale x 8 x i16> @cnt_bf16(<vscale x 8 x i16> %a, <vscale x 8 x i1> %pg, <vscale x 8 x bfloat> %b) #0 {
+; CHECK-LABEL: cnt_bf16:
+; CHECK: cnt z0.h, p0/m, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8bf16(<vscale x 8 x i16> %a,
+                                                                <vscale x 8 x i1> %pg,
+                                                                <vscale x 8 x bfloat> %b)
+  ret <vscale x 8 x i16> %out
+}
+
 define <vscale x 4 x i32> @cnt_f32(<vscale x 4 x i32> %a, <vscale x 4 x i1> %pg, <vscale x 4 x float> %b) {
 ; CHECK-LABEL: cnt_f32:
 ; CHECK: cnt z0.s, p0/m, z1.s
@@ -180,5 +190,9 @@
 declare <vscale x 4 x i32> @llvm.aarch64.sve.cnt.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i1>, <vscale x 4 x i32>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.cnt.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x i64>)
 declare <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8f16(<vscale x 8 x i16>, <vscale x 8 x i1>, <vscale x 8 x half>)
+declare <vscale x 8 x i16> @llvm.aarch64.sve.cnt.nxv8bf16(<vscale x 8 x i16>, <vscale x 8 x i1>, <vscale x 8 x bfloat>)
 declare <vscale x 4 x i32> @llvm.aarch64.sve.cnt.nxv4f32(<vscale x 4 x i32>, <vscale x 4 x i1>, <vscale x 4 x float>)
 declare <vscale x 2 x i64> @llvm.aarch64.sve.cnt.nxv2f64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x double>)
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -284,6 +284,11 @@
   defm CLS_ZPmZ  : sve_int_un_pred_arit_1<   0b000, "cls",  int_aarch64_sve_cls>;
   defm CLZ_ZPmZ  : sve_int_un_pred_arit_1<   0b001, "clz",  int_aarch64_sve_clz>;
   defm CNT_ZPmZ  : sve_int_un_pred_arit_1<   0b010, "cnt",  int_aarch64_sve_cnt>;
+
+ let Predicates = [HasSVE, HasBF16] in {
+  def : SVE_3_Op_Pat<nxv8i16, int_aarch64_sve_cnt, nxv8i16, nxv8i1, nxv8bf16, !cast<Instruction>(CNT_ZPmZ_H)>;
+ }
+
   defm CNOT_ZPmZ : sve_int_un_pred_arit_1<   0b011, "cnot", int_aarch64_sve_cnot>;
   defm NOT_ZPmZ  : sve_int_un_pred_arit_1<   0b110, "not",  int_aarch64_sve_not>;
   defm FABS_ZPmZ : sve_int_un_pred_arit_1_fp<0b100, "fabs", i

[PATCH] D82448: [AArch64][SVE] Add bfloat16 support to store intrinsics

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thank you!




Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_st1-bfloat.c:4
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC 
-triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 
-fallow-half-arguments-and-returns -fsyntax-only -verify 
-verify-ignore-unexpected=error -verify-ignore-unexpected=note %s
+
+#include 

kmclaughlin wrote:
> fpetrogalli wrote:
> > Nit: is it worth adding the `ASM-NOT: warning` check that is used in other 
> > tests? Of course, only if it doesn't fail, for in such case we would have 
> > to address the problem in a separate patch.
> > 
> > (Same for all the new C tests added in this patch).
> Adding the check doesn't fail, but I will add these checks to the load & 
> store tests in a separate patch
Yep - thanks for confirming.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82448/new/

https://reviews.llvm.org/D82448





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked 2 inline comments as done.
fpetrogalli added inline comments.



Comment at: 
clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret-bfloat.c:5
+
+#include 
+

david-arm wrote:
> Hi @fpetrogalli, in the same way that you asked @kmclaughlin if she could add 
> the ASM-NOT check line in her patch, are you able to do that here? You'd need 
> to add an additional RUN line though to compile to assembly. Don't worry if 
> it's not possible though!
It is possible, but my understanding is that we had decided to do this work in 
a separate patch anyway.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501





[PATCH] D82391: [AArch64][SVE] Add bfloat16 support to svext intrinsic

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thank you!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82391/new/

https://reviews.llvm.org/D82391





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273399.
fpetrogalli marked an inline comment as done.
fpetrogalli added a comment.

@david-arm, in the end I decided to add the `ASM-NOT` test; it was easy and 
came for free.

Also, I have moved the IR tests into the file with all the other bitcasts, 
using a function attribute to enable the bf16 feature only for those functions 
that deal with bfloats.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501

Files:
  clang/utils/TableGen/SveEmitter.cpp
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll
  llvm/test/CodeGen/AArch64/sve-bitcast.ll

Index: llvm/test/CodeGen/AArch64/sve-bitcast.ll
===
--- llvm/test/CodeGen/AArch64/sve-bitcast.ll
+++ llvm/test/CodeGen/AArch64/sve-bitcast.ll
@@ -340,3 +340,118 @@
   %bc = bitcast  %v to 
   ret  %bc
 }
+
+define <vscale x 16 x i8> @bitcast_bfloat_to_i8(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 16 x i8>
+  ret <vscale x 16 x i8> %bc
+}
+
+define <vscale x 8 x i16> @bitcast_bfloat_to_i16(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i16:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 8 x i16>
+  ret <vscale x 8 x i16> %bc
+}
+
+define <vscale x 4 x i32> @bitcast_bfloat_to_i32(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 4 x i32>
+  ret <vscale x 4 x i32> %bc
+}
+
+define <vscale x 2 x i64> @bitcast_bfloat_to_i64(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 2 x i64>
+  ret <vscale x 2 x i64> %bc
+}
+
+define <vscale x 8 x half> @bitcast_bfloat_to_half(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_half:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 8 x half>
+  ret <vscale x 8 x half> %bc
+}
+
+define <vscale x 4 x float> @bitcast_bfloat_to_float(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_float:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 4 x float>
+  ret <vscale x 4 x float> %bc
+}
+
+define <vscale x 2 x double> @bitcast_bfloat_to_double(<vscale x 8 x bfloat> %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_double:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x bfloat> %v to <vscale x 2 x double>
+  ret <vscale x 2 x double> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i8_to_bfloat(<vscale x 16 x i8> %v) #0 {
+; CHECK-LABEL: bitcast_i8_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 16 x i8> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i16_to_bfloat(<vscale x 8 x i16> %v) #0 {
+; CHECK-LABEL: bitcast_i16_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x i16> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i32_to_bfloat(<vscale x 4 x i32> %v) #0 {
+; CHECK-LABEL: bitcast_i32_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 4 x i32> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_i64_to_bfloat(<vscale x 2 x i64> %v) #0 {
+; CHECK-LABEL: bitcast_i64_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 2 x i64> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_half_to_bfloat(<vscale x 8 x half> %v) #0 {
+; CHECK-LABEL: bitcast_half_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 8 x half> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_float_to_bfloat(<vscale x 4 x float> %v) #0 {
+; CHECK-LABEL: bitcast_float_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:    ret
+  %bc = bitcast <vscale x 4 x float> %v to <vscale x 8 x bfloat>
+  ret <vscale x 8 x bfloat> %bc
+}
+
+define <vscale x 8 x bfloat> @bitcast_double_to_bfloat(<vscale x 2 x double> %v) #0 {
+; CHECK-LABEL: bitcast_double_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll
@@ -0,0 +1,119 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 < %s 2>%t | FileCheck %s
+; RUN: not --crash llc -mtriple=aarch64_be -mattr=+sve,+bf16 < %s
+; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
+
+; WARN-NOT: warning
+
+define  @bitcast_bfloat_to_i8( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i16( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i16:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i32( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i64( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_half( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_half:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_float( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_float:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = 

[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked 2 inline comments as done.
fpetrogalli added inline comments.



Comment at: clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c:7
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 
-D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -triple aarch64-none-linux-gnu 
-target-feature +sve2 -target-feature +bf16 -fallow-half-arguments-and-returns 
-fsyntax-only -verify -verify-ignore-unexpected=error 
-verify-ignore-unexpected=note %s
+// R UN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE2 
-D__ARM_FEATURE_BF16_SCALAR_ARITHMETIC -DSVE_OVERLOADED_FORMS -triple 
aarch64-none-linux-gnu -target-feature +sve2 -target-feature +bf16 
-fallow-half-arguments-and-returns -fsyntax-only -verify=overload-bf16 
-verify-ignore-unexpected=error -verify-ignore-unexpected=note %s
+

c-rhodes wrote:
> c-rhodes wrote:
> > fpetrogalli wrote:
> > > I could do with an extra pair of eyes here: I can't figure out why the 
> > > warning raised by this run is not detected by the `overload-bf16-warning` 
> > > below... (Same for the same line I have added in the test for tbx).
> > Ah, it works in the example I linked because `whilerw` / `whilewr` uses the 
> > scalar `bfloat16_t`, whereas this is using sizeless type which is 
> > predicated on `-D__ARM_FEATURE_SVE_BF16` so we get:
> > 
> > ```error: 'error' diagnostics seen but not expected:
> >   File 
> > /home/culrho01/llvm-project/clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
> >  Line 18: unknown type name 'svbfloat16_t'; did you mean 'svfloat16_t'?
> >   File 
> > /home/culrho01/llvm-project/clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
> >  Line 18: unknown type name 'svbfloat16x2_t'; did you mean 
> > 'svfloat16x2_t'?```
> > 
> > I'm not sure if/how we can test this for the overloaded form
> I'm not sure if what I suggested makes sense - trying to do what we've done 
> in the sve2 acle tests where we expect an implicit declaration warning for 
> overloaded/non-overloaded intrinsics if the sve2 feature isn't enabled. I 
> guess it's different for BF16 as the types are guarded on the feature macro 
> in the ACLE, for whatever reason we get the same warning for the 
> non-overloaded intrinsics but an error for the overloaded ones. I think we 
> can be pretty confident `+bf16` is required as the test will fail otherwise, 
> but it's tricky trying to isolate an error implying the macro is missing on 
> the intrinsic. FWIW we don't test this for SVE either, I think we can skip 
> this test for the overloaded form, may as well keep the non-overloaded one in 
> if it works.
Agree. I have removed the overload tests for the warning.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429





[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273411.
fpetrogalli added a comment.

Removed the run lines that didn't work, as described in 
https://reviews.llvm.org/D82429#inline-759371


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
@@ -122,6 +122,16 @@
   ret  %out
 }
 
+define  @ftbx_h_bf16( %a,  %b,  %c) #0 {
+; CHECK-LABEL: ftbx_h_bf16:
+; CHECK: tbx z0.h, z1.h, z2.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbx.nxv8bf16( %a,
+%b,
+%c)
+  ret  %out
+}
+
 define  @tbx_s( %a,  %b,  %c) {
 ; CHECK-LABEL: tbx_s:
 ; CHECK: tbx z0.s, z1.s, z2.s
@@ -179,3 +189,8 @@
 declare  @llvm.aarch64.sve.tbx.nxv8f16(, , )
 declare  @llvm.aarch64.sve.tbx.nxv4f32(, , )
 declare  @llvm.aarch64.sve.tbx.nxv2f64(, , )
+
+declare  @llvm.aarch64.sve.tbx.nxv8bf16(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1027,6 +1027,15 @@
   ret  %out
 }
 
+define  @tbl_bf16( %a,  %b) #0 {
+; CHECK-LABEL: tbl_bf16:
+; CHECK: tbl z0.h, { z0.h }, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbl.nxv8bf16( %a,
+%b)
+  ret  %out
+}
+
 define  @tbl_f32( %a,  %b) {
 ; CHECK-LABEL: tbl_f32:
 ; CHECK: tbl z0.s, { z0.s }, z1.s
@@ -1933,6 +1942,7 @@
 declare  @llvm.aarch64.sve.tbl.nxv4i32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2i64(, )
 declare  @llvm.aarch64.sve.tbl.nxv8f16(, )
+declare  @llvm.aarch64.sve.tbl.nxv8bf16(, )
 declare  @llvm.aarch64.sve.tbl.nxv4f32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2f64(, )
 
@@ -2027,3 +2037,6 @@
 declare  @llvm.aarch64.sve.zip2.nxv8f16(, )
 declare  @llvm.aarch64.sve.zip2.nxv4f32(, )
 declare  @llvm.aarch64.sve.zip2.nxv2f64(, )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
@@ -145,6 +145,16 @@
   ret  %out
 }
 
+define  @cnt_bf16( %a,  %pg,  %b) #0 {
+; CHECK-LABEL: cnt_bf16:
+; CHECK: cnt z0.h, p0/m, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.cnt.nxv8bf16( %a,
+ %pg,
+ %b)
+  ret  %out
+}
+
 define  @cnt_f32( %a,  %pg,  %b) {
 ; CHECK-LABEL: cnt_f32:
 ; CHECK: cnt z0.s, p0/m, z1.s
@@ -180,5 +190,9 @@
 declare  @llvm.aarch64.sve.cnt.nxv4i32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2i64(, , )
 declare  @llvm.aarch64.sve.cnt.nxv8f16(, , )
+declare  @llvm.aarch64.sve.cnt.nxv8bf16(, , )
 declare  @llvm.aarch64.sve.cnt.nxv4f32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2f64(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -284,6 +284,11 @@
   defm CLS_ZPmZ  : sve_int_un_pred_arit_1<   0b000, "cls",  int_aarch64_sve_cls>;
   defm CLZ_ZPmZ  : sve_int_un_pred_arit_1<   0b001, "clz",  int_aarch64_sve_clz>;
   defm CNT_ZPmZ  : sve_int_un_pred_arit_1<   0b010, "cnt",  int_aarch64_sve_cnt>;
+
+ let Predicates = [HasSVE, HasBF16] in {
+  def : SVE_3_Op_Pat(CNT_ZPmZ_H)>;
+ }
+
   defm CNOT_ZPmZ : sve_int_un_pred_arit_1<   0b011, "cnot", int_aarch64_sve_cnot>;
   defm NOT_ZPmZ  : sve_int_un_pred_arit_1<   0b110, "not",  int_aarch64_sve_not>;
   defm FABS_ZPmZ : sve_int_un_pred_arit_1_fp<0b100

[PATCH] D82429: [sve][acle] Add some C intrinsics for brain float types.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
fpetrogalli marked an inline comment as done.
Closed by commit rG7200fa38a912: [sve][acle] Add some C intrinsics for brain 
float types. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82429/new/

https://reviews.llvm.org/D82429

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
===
--- llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
@@ -122,6 +122,16 @@
   ret  %out
 }
 
+define  @ftbx_h_bf16( %a,  %b,  %c) #0 {
+; CHECK-LABEL: ftbx_h_bf16:
+; CHECK: tbx z0.h, z1.h, z2.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbx.nxv8bf16( %a,
+%b,
+%c)
+  ret  %out
+}
+
 define  @tbx_s( %a,  %b,  %c) {
 ; CHECK-LABEL: tbx_s:
 ; CHECK: tbx z0.s, z1.s, z2.s
@@ -179,3 +189,8 @@
 declare  @llvm.aarch64.sve.tbx.nxv8f16(, , )
 declare  @llvm.aarch64.sve.tbx.nxv4f32(, , )
 declare  @llvm.aarch64.sve.tbx.nxv2f64(, , )
+
+declare  @llvm.aarch64.sve.tbx.nxv8bf16(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -1027,6 +1027,15 @@
   ret  %out
 }
 
+define  @tbl_bf16( %a,  %b) #0 {
+; CHECK-LABEL: tbl_bf16:
+; CHECK: tbl z0.h, { z0.h }, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.tbl.nxv8bf16( %a,
+%b)
+  ret  %out
+}
+
 define  @tbl_f32( %a,  %b) {
 ; CHECK-LABEL: tbl_f32:
 ; CHECK: tbl z0.s, { z0.s }, z1.s
@@ -1933,6 +1942,7 @@
 declare  @llvm.aarch64.sve.tbl.nxv4i32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2i64(, )
 declare  @llvm.aarch64.sve.tbl.nxv8f16(, )
+declare  @llvm.aarch64.sve.tbl.nxv8bf16(, )
 declare  @llvm.aarch64.sve.tbl.nxv4f32(, )
 declare  @llvm.aarch64.sve.tbl.nxv2f64(, )
 
@@ -2027,3 +2037,6 @@
 declare  @llvm.aarch64.sve.zip2.nxv8f16(, )
 declare  @llvm.aarch64.sve.zip2.nxv4f32(, )
 declare  @llvm.aarch64.sve.zip2.nxv2f64(, )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
@@ -145,6 +145,16 @@
   ret  %out
 }
 
+define  @cnt_bf16( %a,  %pg,  %b) #0 {
+; CHECK-LABEL: cnt_bf16:
+; CHECK: cnt z0.h, p0/m, z1.h
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.cnt.nxv8bf16( %a,
+ %pg,
+ %b)
+  ret  %out
+}
+
 define  @cnt_f32( %a,  %pg,  %b) {
 ; CHECK-LABEL: cnt_f32:
 ; CHECK: cnt z0.s, p0/m, z1.s
@@ -180,5 +190,9 @@
 declare  @llvm.aarch64.sve.cnt.nxv4i32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2i64(, , )
 declare  @llvm.aarch64.sve.cnt.nxv8f16(, , )
+declare  @llvm.aarch64.sve.cnt.nxv8bf16(, , )
 declare  @llvm.aarch64.sve.cnt.nxv4f32(, , )
 declare  @llvm.aarch64.sve.cnt.nxv2f64(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -284,6 +284,11 @@
   defm CLS_ZPmZ  : sve_int_un_pred_arit_1<   0b000, "cls",  int_aarch64_sve_cls>;
   defm CLZ_ZPmZ  : sve_int_un_pred_arit_1<   0b001, "clz",  int_aarch64_sve_clz>;
   defm CNT_ZPmZ  : sve_int_un_pred_arit_1<   0b010, "cnt",  int_aarch64_sve_cnt>;
+
+ let Predicates = [HasSVE, HasBF16] in {
+  def : SVE_3_Op_Pat(CNT_ZPmZ_H)>;
+ }
+
   defm CNOT_ZPmZ : sve_int_un_pred_arit_1<   0b011, "cnot", int_aarch64_sve_cnot>;
   defm NOT_ZPmZ  : sve_int_un_pred_arit_1<   0b110, "not",  int_aarch64_sve_not

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273498.
fpetrogalli marked 5 inline comments as done.
fpetrogalli added a comment.

This patch needed some love...

@c-rhodes, I have addressed your feedback, thank you.

I have also predicated all the instruction selection patterns on the `+bf16`
target feature, and I have updated the tests to use the per-function attribute
instead of adding the extra `-mattr=+bf16` option on the command line.
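The predication follows the shape already used in this series (a sketch taken
from the D82501 diff, showing how the bf16 bitcast patterns are made
conditional on both features):

```tablegen
// bf16 patterns are only legal when both SVE and BF16 are available,
// so they are guarded by an extra predicate list.
let Predicates = [IsLE, HasSVE, HasBF16] in {
  def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
  def : Pat<(nxv8i16 (bitconvert (nxv8bf16 ZPR:$src))), (nxv8i16 ZPR:$src)>;
}
```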


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll

Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -165,6 +165,14 @@
   ret  %out
 }
 
+define  @insr_bf16( %a, bfloat %b) #0 {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.insr.nxv8bf16( %a, bfloat %b)
+  ret  %out
+}
+
 define  @insr_f32( %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare  @llvm.aarch64.sve.insr.nxv4i32(, i32)
 declare  @llvm.aarch64.sve.insr.nxv2i64(, i64)
 declare  @llvm.aarch64.sve.insr.nxv8f16(, half)
+declare  @llvm.aarch64.sve.insr.nxv8bf16(, bfloat)
 declare  @llvm.aarch64.sve.insr.nxv4f32(, float)
 declare  @llvm.aarch64.sve.insr.nxv2f64(, double)
 
@@ -368,3 +377,6 @@
 declare  @llvm.aarch64.sve.lsr.wide.nxv16i8(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv8i16(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv4i32(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) #0 {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+%pg,
+   bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -77,10 +87,41 @@
   ret  %out
 }
 
+define  @test_svdup_n_bf16_z( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_z:
+; CHECK: mov z1.h, #0
+; CHECK: mov z1.h, p0/m, h0
+; CHECK: mov z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( zeroinitializer,  %pg, bfloat %op)
+  ret  %out
+}
+
+define  @test_svdup_n_bf16_m( %inactive,  %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_m:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %inactive,  %pg, bfloat %op)
+  ret  %out
+}
+
+
+define  @test_svdup_n_bf16_x( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_x:
+; CHECK: mov z0.h, p0/m, h0
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( undef,  %pg, bfloat %op)
+  ret  %out
+}
+
 declare  @llvm.aarch64.sve.dup.nxv16i8(, , i8)
 declare  @llvm.aarch64.sve.dup.nxv8i16(, , i16)
 declare  @llvm.aarch64.sve.dup.nxv4i32(, , i32)
 declare  @llvm.aarch64.sve.dup.nxv2i64(, , i64)
 declare  @llvm.aarch64.sve.dup.nxv8f16(, , half)
+declare  @llvm.aarch64.sve.dup.nxv8bf16(, , bfloat)
 declare  @llvm.aarch64.sve.dup.nxv4f32(, , float)
 declare  @llvm.aarch64.sve.dup.nxv2f64(, , double)
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @clasta_bf16( %pg,  %a,  %b) #0 {
+; CHECK-LABEL: clasta_bf16:
+; CHECK: clasta z0.h, p0, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = c

[PATCH] D82623: [sve][acle] Enable feature macros for SVE ACLE extensions.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Reviewers, I have added two parent revisions with the last two sets of
intrinsics that are enabled by the macros introduced in this patch. I will
update those tests in this patch once the parent patches are in. Meanwhile,
please double check that my interpretation of the feature macros and the
corresponding target feature flags is correct.

Grazie,

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82623/new/

https://reviews.llvm.org/D82623





[PATCH] D82623: [sve][acle] Enable feature macros for SVE ACLE extensions.

2020-06-25 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, efriedma, c-rhodes, kmclaughlin, 
SjoerdMeijer.
Herald added subscribers: cfe-commits, psnobl, rkruppe, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added a project: clang.
fpetrogalli added parent revisions: D82345: [sve][acle] Implement some of the C 
intrinsics for brain float., D82501: [sve][acle] Add reinterpret intrinsics for 
brain float..
fpetrogalli added a comment.

Reviewers, I have added two parent revisions with the last two sets of
intrinsics that are enabled by the macros introduced in this patch. I will
update those tests in this patch once the parent patches are in. Meanwhile,
please double check that my interpretation of the feature macros and the
corresponding target feature flags is correct.

Grazie,

Francesco


The following feature macros have been added:

  __ARM_FEATURE_SVE_BF16
  __ARM_FEATURE_SVE_MATMUL_INT8
  __ARM_FEATURE_SVE_MATMUL_FP32
  __ARM_FEATURE_SVE_MATMUL_FP64

The driver has been updated to enable them according to the values of
the target features passed on the command line.

The SVE ACLE tests using the macros have been modified to work with
the target features instead of passing the macros on the command line.
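In user code, the effect is that bfloat ACLE code can be guarded on the
feature macro instead of on a manually passed `-D` define. An illustrative
fragment (this assumes an AArch64 target with `arm_sve.h` available, so it is
not compilable on other hosts):

```c
#include <arm_sve.h>

#if defined(__ARM_FEATURE_SVE_BF16)
/* Only compiled when the driver has enabled the macro, i.e. when the
 * +sve and +bf16 target features are both on; svbfloat16_t does not
 * exist otherwise. */
svbfloat16_t passthrough(svbfloat16_t v) { return v; }
#endif
```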


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82623

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmlalt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfmmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvt-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cvtnt.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ro.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld1rq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld3-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ld4-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldff1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldnf1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_ldnt1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_len-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_rev-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_sel-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_splice-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_st2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_st3-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_st4-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_sudot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_tbl-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn1-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn2-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_trn2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_usdot.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp1-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp2-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_uzp2-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip1-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip1-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip1-fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_zip2-fp64.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbl2-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_tbx-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilerw-bfloat.c
  clang/test/CodeGen/aarch64-sve2-intrinsics/acle_sve2_whilewr-bfloat.c
  clang/test/Preprocessor/aarch64-target-features.c

Index: clang/test/Preprocessor/aarch64-target-features.c

[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked 2 inline comments as done.
fpetrogalli added inline comments.



Comment at: llvm/test/CodeGen/AArch64/sve-bitcast-bfloat.ll:8
+
+define  @bitcast_bfloat_to_i8( %v) {
+; CHECK-LABEL: bitcast_bfloat_to_i8:

david-arm wrote:
> Aren't these tests all duplicates of ones in 
> llvm/test/CodeGen/AArch64/sve-bitcast.ll? Looks like you can remove this file 
> completely.
(Facepalm.) Yes, I remember thinking "I have to remove them before updating the
patch", and then I forgot... I will do it before submitting. Thank you for
pointing this out.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273737.
fpetrogalli marked an inline comment as done.
fpetrogalli added a comment.

I removed the duplicate tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501

Files:
  clang/utils/TableGen/SveEmitter.cpp
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-bitcast.ll

Index: llvm/test/CodeGen/AArch64/sve-bitcast.ll
===
--- llvm/test/CodeGen/AArch64/sve-bitcast.ll
+++ llvm/test/CodeGen/AArch64/sve-bitcast.ll
@@ -340,3 +340,118 @@
   %bc = bitcast  %v to 
   ret  %bc
 }
+
+define  @bitcast_bfloat_to_i8( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i16( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i16:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i32( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i64( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_half( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_half:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_float( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_float:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_double( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_double:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i8_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i8_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i16_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i16_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i32_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i32_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i64_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i64_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_half_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_half_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_float_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_float_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_double_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_double_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1464,7 +1464,6 @@
 
 def : Pat<(nxv8f16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8f16 ZPR:$src)>;
-def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
@@ -1485,6 +1484,24 @@
 def : Pat<(nxv2f64 (bitconvert (nxv4f32 ZPR:$src))), (nxv2f64 ZPR:$src)>;
   }
 
+  let Predicates = [IsLE, HasSVE, HasBF16] in {
+def : Pat<(nxv8bf16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2f64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+
+def : Pat<(nxv16i8 (bitconvert (nxv8bf16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv8i16 (bitconvert (nxv8bf16 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+def : Pat<(nxv4i32 (bitconvert (nxv8bf16 ZPR:$src))), (nxv4i32 ZPR:$src)>;
+def : Pat<(nxv2i64 (bitconvert (nxv8bf16 ZPR:$src))), (nxv2i64 ZPR:$src)>;
+def : P

[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGa15722c5ce47: [sve][acle] Add reinterpret intrinsics for 
brain float. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501

Files:
  clang/utils/TableGen/SveEmitter.cpp
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-bitcast.ll

Index: llvm/test/CodeGen/AArch64/sve-bitcast.ll
===
--- llvm/test/CodeGen/AArch64/sve-bitcast.ll
+++ llvm/test/CodeGen/AArch64/sve-bitcast.ll
@@ -340,3 +340,118 @@
   %bc = bitcast  %v to 
   ret  %bc
 }
+
+define  @bitcast_bfloat_to_i8( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i16( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i16:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i32( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i32:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_i64( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_i64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_half( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_half:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_float( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_float:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_bfloat_to_double( %v) #0 {
+; CHECK-LABEL: bitcast_bfloat_to_double:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i8_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i8_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i16_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i16_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i32_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i32_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_i64_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_i64_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_half_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_half_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_float_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_float_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+define  @bitcast_double_to_bfloat( %v) #0 {
+; CHECK-LABEL: bitcast_double_to_bfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ret
+  %bc = bitcast  %v to 
+  ret  %bc
+}
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1464,7 +1464,6 @@
 
 def : Pat<(nxv8f16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8f16 ZPR:$src)>;
-def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8f16 ZPR:$src)>;
 def : Pat<(nxv8f16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8f16 ZPR:$src)>;
@@ -1485,6 +1484,24 @@
 def : Pat<(nxv2f64 (bitconvert (nxv4f32 ZPR:$src))), (nxv2f64 ZPR:$src)>;
   }
 
+  let Predicates = [IsLE, HasSVE, HasBF16] in {
+def : Pat<(nxv8bf16 (bitconvert (nxv16i8 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4i32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8f16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv4f32 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2f64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+
+def : Pat<(nxv16i8 (bitconvert (nxv8bf16 ZPR:$src))), (nxv16i8 ZPR:$src)>;
+def : Pat<(nxv8i16 (bitconvert (nxv8bf16 ZPR:$src))), (nxv8i16 ZPR:$src)>;
+def : Pat<(nxv4i32 (bitconvert (nxv8bf16 ZPR:$src))), (nxv4i32 ZPR:$src)>;
+def : Pat<(nxv2i64 (bitconvert (nxv8bf16 ZPR:$src))), (nxv2i64 ZPR:$src)>;

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:426-427
+  let Predicates = [HasSVE, HasBF16] in {
+def : Pat<(nxv8bf16 (AArch64dup (bf16 FPR16:$src))),
+  (DUP_ZZI_H (INSERT_SUBREG (IMPLICIT_DEF), FPR16:$src, hsub), 0)>;
+  }

c-rhodes wrote:
> I think we're missing a test for this pattern in 
> `llvm/test/CodeGen/AArch64/sve-vector-splat.ll`? Same applies to dup 0 
> patterns below.
I added these patterns to allow adding the regression tests in this patch, so 
they are at least guarded by those tests. I tried to add the test cases in 
sve-vector-splat.ll anyway, but the following one crashes the compiler, so the 
whole "splatting a bfloat constant" topic deserves a separate patch.

```
define  @splat_nxv8bf16_imm() #0 {
; CHECK-LABEL: splat_nxv8bf16_imm:
; CHECK: mov z0.h, #1.0
; CHECK-NEXT: ret
  %1 = insertelement  undef, bfloat 1.0, i32 0
  %2 = shufflevector  %1,  undef, 
 zeroinitializer
  ret  %2
}
```

I will create a new revision and make it a parent of this one.



Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1496-1498
+def : Pat<(nxv2i64 (bitconvert (nxv8bf16 ZPR:$src))), (nxv2i64 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv2i64 ZPR:$src))), (nxv8bf16 ZPR:$src)>;
+def : Pat<(nxv8bf16 (bitconvert (nxv8i16 ZPR:$src))), (nxv8bf16 ZPR:$src)>;

c-rhodes wrote:
> missing tests in `llvm/test/CodeGen/AArch64/sve-bitcast.ll`
The bitconvert patterns went in via D82501. This code is not present anymore in 
this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 273813.
fpetrogalli marked 7 inline comments as done.
fpetrogalli added a comment.

Hi @c-rhodes,

I have addressed all your comments but one, the one that asks to add the test 
cases for the splat, as it deserves a separate patch.

I will ping you when it is ready.

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
  llvm/test/CodeGen/AArch64/sve-vector-splat.ll

Index: llvm/test/CodeGen/AArch64/sve-vector-splat.ll
===
--- llvm/test/CodeGen/AArch64/sve-vector-splat.ll
+++ llvm/test/CodeGen/AArch64/sve-vector-splat.ll
@@ -172,6 +172,15 @@
 
 ;; Splats of legal floating point vector types
 
+define  @splat_nxv8bf16(bfloat %val) #0 {
+; CHECK-LABEL: splat_nxv8bf16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement  undef, bfloat %val, i32 0
+  %2 = shufflevector  %1,  undef,  zeroinitializer
+  ret  %2
+}
+
 define  @splat_nxv8f16(half %val) {
 ; CHECK-LABEL: splat_nxv8f16:
 ; CHECK: mov z0.h, h0
@@ -233,6 +242,13 @@
   ret  zeroinitializer
 }
 
+define  @splat_nxv8bf16_zero() #0 {
+; CHECK-LABEL: splat_nxv8bf16_zero:
+; CHECK: mov z0.h, #0
+; CHECK-NEXT: ret
+  ret  zeroinitializer
+}
+
 define  @splat_nxv4f16_zero() {
 ; CHECK-LABEL: splat_nxv4f16_zero:
 ; CHECK: mov z0.h, #0
@@ -321,3 +337,6 @@
   %2 = shufflevector  %1,  undef,  zeroinitializer
   ret  %2
 }
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -165,6 +165,14 @@
   ret  %out
 }
 
+define  @insr_bf16( %a, bfloat %b) #0 {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.insr.nxv8bf16( %a, bfloat %b)
+  ret  %out
+}
+
 define  @insr_f32( %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare  @llvm.aarch64.sve.insr.nxv4i32(, i32)
 declare  @llvm.aarch64.sve.insr.nxv2i64(, i64)
 declare  @llvm.aarch64.sve.insr.nxv8f16(, half)
+declare  @llvm.aarch64.sve.insr.nxv8bf16(, bfloat)
 declare  @llvm.aarch64.sve.insr.nxv4f32(, float)
 declare  @llvm.aarch64.sve.insr.nxv2f64(, double)
 
@@ -368,3 +377,6 @@
 declare  @llvm.aarch64.sve.lsr.wide.nxv16i8(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv8i16(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv4i32(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) #0 {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+%pg,
+   bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -77,10 +87,41 @@
   ret  %out
 }
 
+define  @test_svdup_n_bf16_z( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_z:
+; CHECK: mov z1.h, #0
+; CHECK: mov z1.h, p0/m, h0
+; CHECK: mov z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( zeroinitializer,  %pg, bfloat %op)
+  ret  %out
+}
+
+define  @test_svdup_n_bf16_m( %inactive,  %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_m:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %inactive,  %pg, bfloat %op)
+  ret  %out
+}
+
+
+define  @test_svdup_n_bf16_x( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_x:
+; CHECK: mov z0.h, p0/m, h0
+; C

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli marked an inline comment as done.
fpetrogalli added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:426-427
+  let Predicates = [HasSVE, HasBF16] in {
+def : Pat<(nxv8bf16 (AArch64dup (bf16 FPR16:$src))),
+  (DUP_ZZI_H (INSERT_SUBREG (IMPLICIT_DEF), FPR16:$src, hsub), 0)>;
+  }

fpetrogalli wrote:
> c-rhodes wrote:
> > I think we're missing a test for this pattern in 
> > `llvm/test/CodeGen/AArch64/sve-vector-splat.ll`? Same applies to dup 0 
> > patterns below.
> I added these patterns to allow adding the regression tests in this
> patch, so they are at least guarded by those tests. I tried to add the test
> cases in sve-vector-splat.ll anyway, but the following one crashes the
> compiler, so the whole "splatting a bfloat constant" topic deserves a separate patch.
> 
> ```
> define  @splat_nxv8bf16_imm() #0 {
> ; CHECK-LABEL: splat_nxv8bf16_imm:
> ; CHECK: mov z0.h, #1.0
> ; CHECK-NEXT: ret
>   %1 = insertelement  undef, bfloat 1.0, i32 0
>   %2 = shufflevector  %1,  undef, 
>  zeroinitializer
>   ret  %2
> }
> ```
> 
> I will create a new revision and make it a parent of this one.
(facepalm) There is no "dup" instruction for bfloat immediates... that's why 
this is not working. I guess a separate patch is not needed, this one is 
enough...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Oops, I accidentally removed the C tests... I'll revert, add the tests, and 
recommit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501





[PATCH] D82501: [sve][acle] Add reinterpret intrinsics for brain float.

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Reverted in 
https://github.com/llvm/llvm-project/commit/ff5ccf258e297df29f32d6b5e4fa0a7b95c44f9c


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82501/new/

https://reviews.llvm.org/D82501





[PATCH] D82668: [AArch64][SVE] clang: Add missing svbfloat16_t tests

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/test/CodeGenCXX/aarch64-sve-typeinfo.cpp:4
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:   -target-feature +sve | FileCheck %s
+// RUN:   -target-feature +sve,+bf16 | FileCheck %s
 

I wonder if we should keep the bf16 tests separate, as +bf16 is not needed to 
generate any of the other SVE types. I don't have strong opinions here, but it 
seems better to isolate the SVE bfloat tests in a separate file. What do you 
think, @c-rhodes?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82668/new/

https://reviews.llvm.org/D82668





[PATCH] D82665: [AArch64][SVE] Add bfloat16 to outstanding tuple vector intrinsics

2020-06-26 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: 
clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_create2-bfloat.c:9
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED) A1##A3
+#else

Nit: these are all new files, so you can safely run clang-format on them 
without touching the formatting of pre-existing tests. I know this is not what 
we have done for most of the ACLE test files, but it makes life so much easier 
to just run `git clang-format HEAD^` on a patch and forget about formatting! 
Since they are new files, they will not generate conflicts with anything we 
might have downstream. FWIW, this is my personal preference, so if you prefer 
to adhere to the manual formatting of the other files, I am happy with that.



Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_get2-bfloat.c:20
+  // expected-warning@+1 {{implicit declaration of function 'svget2_bf16'}}
+  return SVE_ACLE_FUNC(svget2,_bf16,,)(tuple, 0);
+}

Shouldn't we also test values other than zero? 0,1 for get2; 0,1,2 for get3...
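For illustration, a hypothetical follow-up test along those lines, following the SVE_ACLE_FUNC convention of the file quoted above (a sketch only; the function name and diagnostic are assumptions, not part of the patch):

```
// Same shape as the existing test, but exercising index 1 of the pair.
svbfloat16_t test_svget2_bf16_1(svbfloat16x2_t tuple)
{
  // expected-warning@+1 {{implicit declaration of function 'svget2_bf16'}}
  return SVE_ACLE_FUNC(svget2, _bf16, , )(tuple, 1);
}
```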


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82665/new/

https://reviews.llvm.org/D82665





[PATCH] D82665: [AArch64][SVE] Add bfloat16 to outstanding tuple vector intrinsics

2020-06-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-create-tuple.ll:1
-; RUN: llc -mtriple aarch64 -mattr=+sve -asm-verbose=1 < %s | FileCheck %s
+; RUN: llc -mtriple aarch64 -mattr=+sve,+bf16 -asm-verbose=1 < %s | FileCheck %s
 

I think you should use the function attribute trick in this test, not the 
command line option, like we have done in other tests.
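For reference, the "function attribute trick" looks roughly as follows in an .ll test (a sketch; the vector types shown are assumptions, since the archived diff stripped the angle-bracket types):

```
; RUN: llc -mtriple aarch64 -mattr=+sve -asm-verbose=1 < %s | FileCheck %s

; Only the bfloat functions require +bf16, so they carry it via an
; attribute group instead of adding it to the -mattr command line:
define <vscale x 8 x bfloat> @test_bf16(<vscale x 8 x bfloat> %z0) #0 {
  ret <vscale x 8 x bfloat> %z0
}

attributes #0 = { "target-features"="+sve,+bf16" }
```

This keeps the non-bfloat tests running with plain +sve.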



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-create-tuple.ll:100
+
+define  @test_svcreate2_bf16_vec0(i1 %p,  %z0,  %z1) local_unnamed_addr #0 {
+; CHECK-LABEL: test_svcreate2_bf16_vec0:

nit: remove `local_unnamed_addr ` from all tests you have added.



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-create-tuple.ll:109
+L2:
+  %extract = tail call  @llvm.aarch64.sve.tuple.get.nxv8bf16.nxv16bf16( %tuple, i32 0)
+  ret  %extract

Out of curiosity, why not test the create intrinsic directly?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82665/new/

https://reviews.llvm.org/D82665





[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG67e4330facfb: [sve][acle] Implement some of the C intrinsics 
for brain float. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
  llvm/test/CodeGen/AArch64/sve-vector-splat.ll

Index: llvm/test/CodeGen/AArch64/sve-vector-splat.ll
===
--- llvm/test/CodeGen/AArch64/sve-vector-splat.ll
+++ llvm/test/CodeGen/AArch64/sve-vector-splat.ll
@@ -172,6 +172,15 @@
 
 ;; Splats of legal floating point vector types
 
+define  @splat_nxv8bf16(bfloat %val) #0 {
+; CHECK-LABEL: splat_nxv8bf16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement  undef, bfloat %val, i32 0
+  %2 = shufflevector  %1,  undef,  zeroinitializer
+  ret  %2
+}
+
 define  @splat_nxv8f16(half %val) {
 ; CHECK-LABEL: splat_nxv8f16:
 ; CHECK: mov z0.h, h0
@@ -233,6 +242,13 @@
   ret  zeroinitializer
 }
 
+define  @splat_nxv8bf16_zero() #0 {
+; CHECK-LABEL: splat_nxv8bf16_zero:
+; CHECK: mov z0.h, #0
+; CHECK-NEXT: ret
+  ret  zeroinitializer
+}
+
 define  @splat_nxv4f16_zero() {
 ; CHECK-LABEL: splat_nxv4f16_zero:
 ; CHECK: mov z0.h, #0
@@ -321,3 +337,6 @@
   %2 = shufflevector  %1,  undef,  zeroinitializer
   ret  %2
 }
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -165,6 +165,14 @@
   ret  %out
 }
 
+define  @insr_bf16( %a, bfloat %b) #0 {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.insr.nxv8bf16( %a, bfloat %b)
+  ret  %out
+}
+
 define  @insr_f32( %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare  @llvm.aarch64.sve.insr.nxv4i32(, i32)
 declare  @llvm.aarch64.sve.insr.nxv2i64(, i64)
 declare  @llvm.aarch64.sve.insr.nxv8f16(, half)
+declare  @llvm.aarch64.sve.insr.nxv8bf16(, bfloat)
 declare  @llvm.aarch64.sve.insr.nxv4f32(, float)
 declare  @llvm.aarch64.sve.insr.nxv2f64(, double)
 
@@ -368,3 +377,6 @@
 declare  @llvm.aarch64.sve.lsr.wide.nxv16i8(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv8i16(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv4i32(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) #0 {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+%pg,
+   bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -77,10 +87,41 @@
   ret  %out
 }
 
+define  @test_svdup_n_bf16_z( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_z:
+; CHECK: mov z1.h, #0
+; CHECK: mov z1.h, p0/m, h0
+; CHECK: mov z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( zeroinitializer,  %pg, bfloat %op)
+  ret  %out
+}
+
+define  @test_svdup_n_bf16_m( %inactive,  %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_m:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %inactive,  %pg, bfloat %op)
+  ret  %out
+}
+
+
+define  @test_svdup_n_bf16_x( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_x:
+; CHECK: mov z0.h, p0/m, h0
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( undef,  %pg, bfloat %op)
+  ret  %out
+}
+
 declare  @llvm.

[PATCH] D82345: [sve][acle] Implement some of the C intrinsics for brain float.

2020-06-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 274149.
fpetrogalli added a comment.

Rebase on top of master. NFC


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82345/new/

https://reviews.llvm.org/D82345

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_clastb-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_dupq-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_insr-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lasta-bfloat.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_lastb-bfloat.c
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-dup-x.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-perm-select.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
  llvm/test/CodeGen/AArch64/sve-vector-splat.ll

Index: llvm/test/CodeGen/AArch64/sve-vector-splat.ll
===
--- llvm/test/CodeGen/AArch64/sve-vector-splat.ll
+++ llvm/test/CodeGen/AArch64/sve-vector-splat.ll
@@ -172,6 +172,15 @@
 
 ;; Splats of legal floating point vector types
 
+define  @splat_nxv8bf16(bfloat %val) #0 {
+; CHECK-LABEL: splat_nxv8bf16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement  undef, bfloat %val, i32 0
+  %2 = shufflevector  %1,  undef,  zeroinitializer
+  ret  %2
+}
+
 define  @splat_nxv8f16(half %val) {
 ; CHECK-LABEL: splat_nxv8f16:
 ; CHECK: mov z0.h, h0
@@ -233,6 +242,13 @@
   ret  zeroinitializer
 }
 
+define  @splat_nxv8bf16_zero() #0 {
+; CHECK-LABEL: splat_nxv8bf16_zero:
+; CHECK: mov z0.h, #0
+; CHECK-NEXT: ret
+  ret  zeroinitializer
+}
+
 define  @splat_nxv4f16_zero() {
 ; CHECK-LABEL: splat_nxv4f16_zero:
 ; CHECK: mov z0.h, #0
@@ -321,3 +337,6 @@
   %2 = shufflevector  %1,  undef,  zeroinitializer
   ret  %2
 }
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-shifts.ll
@@ -165,6 +165,14 @@
   ret  %out
 }
 
+define  @insr_bf16( %a, bfloat %b) #0 {
+; CHECK-LABEL: insr_bf16:
+; CHECK: insr z0.h, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.insr.nxv8bf16( %a, bfloat %b)
+  ret  %out
+}
+
 define  @insr_f32( %a, float %b) {
 ; CHECK-LABEL: insr_f32:
 ; CHECK: insr z0.s, s1
@@ -348,6 +356,7 @@
 declare  @llvm.aarch64.sve.insr.nxv4i32(, i32)
 declare  @llvm.aarch64.sve.insr.nxv2i64(, i64)
 declare  @llvm.aarch64.sve.insr.nxv8f16(, half)
+declare  @llvm.aarch64.sve.insr.nxv8bf16(, bfloat)
 declare  @llvm.aarch64.sve.insr.nxv4f32(, float)
 declare  @llvm.aarch64.sve.insr.nxv2f64(, double)
 
@@ -368,3 +377,6 @@
 declare  @llvm.aarch64.sve.lsr.wide.nxv16i8(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv8i16(, , )
 declare  @llvm.aarch64.sve.lsr.wide.nxv4i32(, , )
+
+; +bf16 is required for the bfloat version.
+attributes #0 = { "target-features"="+sve,+bf16" }
Index: llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
===
--- llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
+++ llvm/test/CodeGen/AArch64/sve-intrinsics-scalar-to-vec.ll
@@ -57,6 +57,16 @@
   ret  %out
 }
 
+define  @dup_bf16( %a,  %pg, bfloat %b) #0 {
+; CHECK-LABEL: dup_bf16:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %a,
+%pg,
+   bfloat %b)
+  ret  %out
+}
+
 define  @dup_f32( %a,  %pg, float %b) {
 ; CHECK-LABEL: dup_f32:
 ; CHECK: mov z0.s, p0/m, s1
@@ -77,10 +87,41 @@
   ret  %out
 }
 
+define  @test_svdup_n_bf16_z( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_z:
+; CHECK: mov z1.h, #0
+; CHECK: mov z1.h, p0/m, h0
+; CHECK: mov z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( zeroinitializer,  %pg, bfloat %op)
+  ret  %out
+}
+
+define  @test_svdup_n_bf16_m( %inactive,  %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_m:
+; CHECK: mov z0.h, p0/m, h1
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( %inactive,  %pg, bfloat %op)
+  ret  %out
+}
+
+
+define  @test_svdup_n_bf16_x( %pg, bfloat %op) #0 {
+; CHECK-LABEL: test_svdup_n_bf16_x:
+; CHECK: mov z0.h, p0/m, h0
+; CHECK-NEXT: ret
+  %out = call  @llvm.aarch64.sve.dup.nxv8bf16( undef,  %pg, bfloat %op)
+  ret  %out
+}
+
 declare  @llvm.aarch64.sve.dup.nxv16i8(, , i8)
 declare  @llvm.aarch64.sve.dup.nxv8i16(, , i16)
 declare

[PATCH] D82665: [AArch64][SVE] Add bfloat16 to outstanding tuple vector intrinsics

2020-06-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

LGTM, thank you.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82665/new/

https://reviews.llvm.org/D82665





[PATCH] D82668: [AArch64][SVE] clang: Add missing svbfloat16_t tests

2020-06-29 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.

Ship it! :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82668/new/

https://reviews.llvm.org/D82668





[PATCH] D78509: [AArch64][SVE] Add addressing mode for contiguous loads & stores

2020-04-20 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli requested changes to this revision.
fpetrogalli added a comment.
This revision now requires changes to proceed.

Hi @kmclaughlin, thank you for working on this!

The patch LGTM, but please consider renaming the multiclass for the 
non-faulting loads as I suggested. The rest of the feedback is mostly cosmetic 
changes!

Francesco




Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1617
 
+  multiclass ld1nf {
+// scalar + immediate (mul vl)

The instruction mnemonic is `ldnf1*`; maybe we should call this multiclass 
`ldnf1` instead of `ld1nf`?



Comment at: 
llvm/test/CodeGen/AArch64/sve-intrinsics-ld1-addressing-mode-reg-reg.ll:7
+
+define  @ld1b_i8( %pg, i8* %a, i64 %offset) {
+; CHECK-LABEL: ld1b_i8

Nit: I think you should rename the variable `%offset` to `%index` throughout 
these tests, since an offset usually denotes bytes, while an index refers to 
array elements and is scaled by the element size.



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-ld1.ll:1
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
 

For the sake of consistency, I think it is worth splitting this test in two: 
one that tests the pattern with an offset of 0 (you can keep the current file 
name), and one, named `sve-intrinsics-ld1-addressing-mode-reg-imm.ll`, that 
tests all valid and invalid values for the reg+imm addressing mode.



Comment at: 
llvm/test/CodeGen/AArch64/sve-intrinsics-st1-addressing-mode-reg-reg.ll:7
+
+define void @st1b_i8( %data,  %pred, i8* %a, i64 %offset) {
+; CHECK-LABEL: st1b_i8:

Nit: same here. For the sake of consistency with other tests, I think you could 
rename `%offset` to `%index`.



Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-st1.ll:1
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
 

Nit: Same for this test, I think it is worth splitting for the sake of 
consistency with other tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78509/new/

https://reviews.llvm.org/D78509





[PATCH] D78509: [AArch64][SVE] Add addressing mode for contiguous loads & stores

2020-04-20 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.
This revision is now accepted and ready to land.

I think I made a mess with the Actions for this review! I meant to accept it, 
not to enforce the nit comments!

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78509/new/

https://reviews.llvm.org/D78509





[PATCH] D78965: [clang][OpenMP] Fix mangling of linear parameters.

2020-04-27 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: ABataev, andwar.
Herald added subscribers: cfe-commits, guansong, yaxunl.
Herald added a reviewer: jdoerfert.
Herald added a project: clang.

The linear parameter token in the mangled name must be multiplied
by the pointee size in bytes when the parameter is a pointer.
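To illustrate the scaling (a hypothetical sketch, not the actual Clang code; the step values and pointee sizes below match the test updates in this patch):

```python
# Hypothetical sketch of the fix: for a pointer parameter marked linear,
# the step encoded in the 'l<step>' token of the Vector Function ABI
# mangled name is scaled by the size of the pointee in bytes.
def linear_token(step, pointee_size_bytes=None):
    if pointee_size_bytes is not None:  # pointer parameter: scale the step
        step *= pointee_size_bytes
    return "l" if step == 1 else "l%d" % step

# linear(sin) with 'double *sin': step 1 * sizeof(double) = 8 -> "l8"
print(linear_token(1, pointee_size_bytes=8))   # l8
# linear(cos : 2) with 'double *cos': step 2 * sizeof(double) = 16 -> "l16"
print(linear_token(2, pointee_size_bytes=8))   # l16
# a non-pointer linear parameter with unit step stays "l"
print(linear_token(1))                         # l
```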


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D78965

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/declare_simd_aarch64.c
  clang/test/OpenMP/declare_simd_codegen.cpp

Index: clang/test/OpenMP/declare_simd_codegen.cpp
===
--- clang/test/OpenMP/declare_simd_codegen.cpp
+++ clang/test/OpenMP/declare_simd_codegen.cpp
@@ -136,14 +136,14 @@
 // CHECK-DAG: declare {{.+}}@_Z5add_2Pf(
 // CHECK-DAG: define {{.+}}@_Z11constlineari(
 
-// CHECK-DAG: "_ZGVbM4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbM4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_1Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_1Pf"
@@ -180,14 +180,14 @@
 // CHECK-DAG: "_ZGVeM16uus1__ZN2VV3addEii"
 // CHECK-DAG: "_ZGVeN16uus1__ZN2VV3addEii"
 
-// CHECK-DAG: "_ZGVbM4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVbN4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeM16lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeN16lla16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbM4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbN4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeM16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeN16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
 
 // CHECK-DAG: "_ZGVbM4vvl8__ZN2VV4taddERA_iRi"
 // CHECK-DAG: "_ZGVbN4vvl8__ZN2VV4taddERA_iRi"
@@ -293,23 +293,23 @@
 // CHECK-DAG: "_ZGVeM16vvv__Z3bax2VVPdi"
 // CHECK-DAG: "_ZGVeN16vvv__Z3bax2VVPdi"
 
-// CHECK-DAG: "_ZGVbM4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVbN4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeM16ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeN16ua16vl1__Z3fooPffi"
-
-// CHECK-DAG: "_ZGVbM4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbM4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVbN4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeM16ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeN16ua16vl__Z3fooPffi"
+
+// CHECK-DAG: "_ZGVbM4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_2Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_2Pf"
Index: clang/test/OpenMP/declare_simd_aarch64.c
===
--- clang/test/OpenMP/declare_simd_aarch64.c
+++ clang/test/OpenMP/declare_simd_aarch64.c
@@ -130,12 +130,12 @@
 /*/
 #pragma omp declare simd linear(sin) linear(cos)
 void sincos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll_sincos"
+// AARCH64: "_ZGVnN2vl8l8_sincos"
 // AARCH64-NOT: sincos
 
 #pragma omp declare simd linear(sin : 1) linear(cos : 2)
 void SinCos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll2_SinCos"
+// AARCH64: "_ZGVnN2vl8l16_SinCos"
 // AARCH64-NOT: SinCos
 
 // Selection of tests based on the examples provided in chapter 5 of
@@ -158,7 +158,7 @@
 // Listing 

[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-04-27 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added a reviewer: ABataev.
Herald added subscribers: cfe-commits, danielkiss, guansong, kristof.beyls, 
yaxunl.
Herald added a reviewer: jdoerfert.
Herald added a project: clang.
fpetrogalli added a comment.

Hello reviewers,

I know this is not how the fix should be tested (`fprintf` in debug builds...).

This function would benefit from some unit tests in the `unittests` folder of 
clang, but I don't have a way to export it there, as it is private to this 
module.

I would like to lift it to some class (as a static function of 
`CodeGenFunction`, for example), but that would require exposing 
`ParamAttrTy`. Are you OK with that? I'd rather use the `llvm::VFParameter` of 
`llvm/Analysis/VectorUtils.h`, as I suggested in 
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141057.html, but that would 
definitely require a preparatory patch to remove the uses of `ParamAttrTy` in 
favor of `llvm::VFParameter`.

I am open to alternative suggestions, of course!

Kind regards,

Francesco


This change fixes an AArch64-specific bug in the generation of the NDS
and WDS values used to compute the signature of the vector functions
generated from OpenMP directives like `declare simd`.
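A rough model of the computation (assumed semantics based on the test expectations, not the Clang implementation):

```python
# Rough model of NDS/WDS from the AAVFABI: take the minimum and maximum
# over the lane sizes (in bits) of the return type and each parameter.
# A linear/uniform pointer contributes its pointee's size (the condition
# fixed here); a plain "vector" pointer contributes the pointer size.
def get_nds_wds(lane_sizes_in_bits):
    return min(lane_sizes_in_bits), max(lane_sizes_in_bits)

# void SineCosineWithFloat(float, float *sin, float *cos) with
# linear(sin) linear(cos): lane sizes {32, 32, 32} -> NDS=32, WDS=32.
print(get_nds_wds([32, 32, 32]))  # (32, 32)
# void SineCosineNoLinear(float, float *, float *): the pointers are
# plain vector parameters, so each contributes 64 -> NDS=32, WDS=64.
print(get_nds_wds([32, 64, 64]))  # (32, 64)
```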


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D78969

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/declare_simd_aarch64_ndswds.c


Index: clang/test/OpenMP/declare_simd_aarch64_ndswds.c
===
--- /dev/null
+++ clang/test/OpenMP/declare_simd_aarch64_ndswds.c
@@ -0,0 +1,20 @@
+// REQUIRES: aarch64-registered-target
+// REQUIRES: asserts
// -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp -x 
c -emit-llvm %s -o - -femit-all-decls 2>&1 | FileCheck %s --check-prefix=AARCH64
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon 
-fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls 2>&1 | FileCheck %s 
--check-prefix=AARCH64
+
+#pragma omp declare simd linear(sin) linear(cos) notinbranch
+void SineCosineWithFloat(float in, float *sin, float *cos);
+// AARCH64-DAG: getNDSWDS SineCosineWithFloat 32 32
+
+#pragma omp declare simd notinbranch
+void SineCosineNoLinear(float in, float *sin, float *cos);
+// AARCH64-DAG: getNDSWDS SineCosineNoLinear 32 64
+
+static float *F;
+void do_something() {
+  SineCosineWithFloat(*F, F, F);
+  SineCosineNoLinear(*F, F, F);
+}
Index: clang/lib/CodeGen/CGOpenMPRuntime.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -11086,7 +11086,7 @@
 /// as defined by `LS(P)` in 3.2.1 of the AAVFABI.
 /// TODO: Add support for references, section 3.2.1, item 1.
 static unsigned getAArch64LS(QualType QT, ParamKindTy Kind, ASTContext &C) {
-  if (getAArch64MTV(QT, Kind) && QT.getCanonicalType()->isPointerType()) {
+  if (!getAArch64MTV(QT, Kind) && QT.getCanonicalType()->isPointerType()) {
 QualType PTy = QT.getCanonicalType()->getPointeeType();
 if (getAArch64PBV(PTy, C))
   return C.getTypeSize(PTy);
@@ -11129,9 +11129,15 @@
  }) &&
  "Invalid size");
 
-  return std::make_tuple(*std::min_element(std::begin(Sizes), std::end(Sizes)),
- *std::max_element(std::begin(Sizes), std::end(Sizes)),
- OutputBecomesInput);
+  const auto Ret =
+  std::make_tuple(*std::min_element(std::begin(Sizes), std::end(Sizes)),
+  *std::max_element(std::begin(Sizes), std::end(Sizes)),
+  OutputBecomesInput);
+#ifndef NDEBUG
+  fprintf(stderr, "getNDSWDS %s %d %d\n", FD->getNameAsString().c_str(),
+  std::get<0>(Ret), std::get<1>(Ret));
+#endif
+  return Ret;
 }
 
 /// Mangle the parameter part of the vector function name according to



[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-04-27 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Hello reviewers,

I know this is not how the fix should be tested (`fprintf` in debug builds...).

This function would benefit from some unit tests in the `unittests` folder of 
clang, but I don't have a way to export it there, as it is private to this 
module.

I would like to lift it to some class (as a static function of 
`CodeGenFunction`, for example), but that would require exposing 
`ParamAttrTy`. Are you OK with that? I'd rather use the `llvm::VFParameter` of 
`llvm/Analysis/VectorUtils.h`, as I suggested in 
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141057.html, but that would 
definitely require a preparatory patch to remove the uses of `ParamAttrTy` in 
favor of `llvm::VFParameter`.

I am open to alternative suggestions, of course!

Kind regards,

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78969/new/

https://reviews.llvm.org/D78969





[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-04-30 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

In D78969#2012591 , @jdoerfert wrote:

> Is the NDS and WDS number never visible in the IR, e.g., as part of the name?


Hang on, thanks for asking the question, maybe I can work something out here! :)

They won't be visible in IR directly, but they are part of an expression that 
is used to compute the number of lanes in the mangled name.
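As a sketch of how that expression behaves for AdvSIMD when no `simdlen()` is given (my reading of the rule quoted in the tests elsewhere in this thread, not the Clang code):

```python
# When no simdlen() is given, NDS determines the <vlen> token of the
# mangled AdvSIMD names: two variants are emitted, one for a 128-bit and
# one for a 64-bit vector of the narrowest element.
def advsimd_vlens(nds_bytes):
    return 16 // nds_bytes, 8 // nds_bytes

print(advsimd_vlens(1))  # (16, 8)
print(advsimd_vlens(2))  # (8, 4)
print(advsimd_vlens(4))  # (4, 2)
```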

I'll come up with an IR check.

Thanks!

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78969/new/

https://reviews.llvm.org/D78969





[PATCH] D78965: [clang][OpenMP] Fix mangling of linear parameters.

2020-04-30 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 261225.
fpetrogalli added a comment.

Address review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78965/new/

https://reviews.llvm.org/D78965

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/declare_simd_aarch64.c
  clang/test/OpenMP/declare_simd_codegen.cpp

Index: clang/test/OpenMP/declare_simd_codegen.cpp
===
--- clang/test/OpenMP/declare_simd_codegen.cpp
+++ clang/test/OpenMP/declare_simd_codegen.cpp
@@ -136,14 +136,14 @@
 // CHECK-DAG: declare {{.+}}@_Z5add_2Pf(
 // CHECK-DAG: define {{.+}}@_Z11constlineari(
 
-// CHECK-DAG: "_ZGVbM4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbM4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_1Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_1Pf"
@@ -180,14 +180,14 @@
 // CHECK-DAG: "_ZGVeM16uus1__ZN2VV3addEii"
 // CHECK-DAG: "_ZGVeN16uus1__ZN2VV3addEii"
 
-// CHECK-DAG: "_ZGVbM4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVbN4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeM16lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeN16lla16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbM4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbN4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeM16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeN16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
 
 // CHECK-DAG: "_ZGVbM4vvl8__ZN2VV4taddERA_iRi"
 // CHECK-DAG: "_ZGVbN4vvl8__ZN2VV4taddERA_iRi"
@@ -293,23 +293,23 @@
 // CHECK-DAG: "_ZGVeM16vvv__Z3bax2VVPdi"
 // CHECK-DAG: "_ZGVeN16vvv__Z3bax2VVPdi"
 
-// CHECK-DAG: "_ZGVbM4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVbN4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeM16ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeN16ua16vl1__Z3fooPffi"
-
-// CHECK-DAG: "_ZGVbM4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbM4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVbN4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeM16ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeN16ua16vl__Z3fooPffi"
+
+// CHECK-DAG: "_ZGVbM4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_2Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_2Pf"
Index: clang/test/OpenMP/declare_simd_aarch64.c
===
--- clang/test/OpenMP/declare_simd_aarch64.c
+++ clang/test/OpenMP/declare_simd_aarch64.c
@@ -130,12 +130,12 @@
 /*/
 #pragma omp declare simd linear(sin) linear(cos)
 void sincos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll_sincos"
+// AARCH64: "_ZGVnN2vl8l8_sincos"
 // AARCH64-NOT: sincos
 
 #pragma omp declare simd linear(sin : 1) linear(cos : 2)
 void SinCos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll2_SinCos"
+// AARCH64: "_ZGVnN2vl8l16_SinCos"
 // AARCH64-NOT: SinCos
 
 // Selection of tests based on the examples provided in chapter 5 of
@@ -158,7 +158,7 @@
 // Listing 6, p. 19
 #pragma omp declare simd linear(x) aligned(x : 16) simdlen(4)
 int foo4(int *x, float y);
-// AARCH64: "_ZGVnM4la16v_foo4" "_ZGVnN4la16v_foo4"
+// AARCH64: "_ZGVnM4l4a

[PATCH] D78965: [clang][OpenMP] Fix mangling of linear parameters.

2020-05-01 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7585ba208e67: [clang][OpenMP] Fix mangling of linear 
parameters. (authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78965/new/

https://reviews.llvm.org/D78965

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/declare_simd_aarch64.c
  clang/test/OpenMP/declare_simd_codegen.cpp

Index: clang/test/OpenMP/declare_simd_codegen.cpp
===
--- clang/test/OpenMP/declare_simd_codegen.cpp
+++ clang/test/OpenMP/declare_simd_codegen.cpp
@@ -136,14 +136,14 @@
 // CHECK-DAG: declare {{.+}}@_Z5add_2Pf(
 // CHECK-DAG: define {{.+}}@_Z11constlineari(
 
-// CHECK-DAG: "_ZGVbM4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_1Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbM4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_1Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_1Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_1Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_1Pf"
@@ -180,14 +180,14 @@
 // CHECK-DAG: "_ZGVeM16uus1__ZN2VV3addEii"
 // CHECK-DAG: "_ZGVeN16uus1__ZN2VV3addEii"
 
-// CHECK-DAG: "_ZGVbM4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVbN4lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVcN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdM8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVdN8lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeM16lla16l4a4__ZN2VV6taddpfEPfRS0_"
-// CHECK-DAG: "_ZGVeN16lla16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbM4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVbN4ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVcN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdM8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVdN8ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeM16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
+// CHECK-DAG: "_ZGVeN16ll4a16l4a4__ZN2VV6taddpfEPfRS0_"
 
 // CHECK-DAG: "_ZGVbM4vvl8__ZN2VV4taddERA_iRi"
 // CHECK-DAG: "_ZGVbN4vvl8__ZN2VV4taddERA_iRi"
@@ -293,23 +293,23 @@
 // CHECK-DAG: "_ZGVeM16vvv__Z3bax2VVPdi"
 // CHECK-DAG: "_ZGVeN16vvv__Z3bax2VVPdi"
 
-// CHECK-DAG: "_ZGVbM4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVbN4ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVcN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdM8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVdN8ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeM16ua16vl1__Z3fooPffi"
-// CHECK-DAG: "_ZGVeN16ua16vl1__Z3fooPffi"
-
-// CHECK-DAG: "_ZGVbM4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVbN4l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVcN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdM8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVdN8l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeM16l8__Z5add_2Pf"
-// CHECK-DAG: "_ZGVeN16l8__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbM4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVbN4ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVcN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdM8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVdN8ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeM16ua16vl__Z3fooPffi"
+// CHECK-DAG: "_ZGVeN16ua16vl__Z3fooPffi"
+
+// CHECK-DAG: "_ZGVbM4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVbN4l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVcN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdM8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVdN8l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeM16l32__Z5add_2Pf"
+// CHECK-DAG: "_ZGVeN16l32__Z5add_2Pf"
 // CHECK-DAG: "_ZGVbM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVcM32v__Z5add_2Pf"
 // CHECK-DAG: "_ZGVdM32v__Z5add_2Pf"
Index: clang/test/OpenMP/declare_simd_aarch64.c
===
--- clang/test/OpenMP/declare_simd_aarch64.c
+++ clang/test/OpenMP/declare_simd_aarch64.c
@@ -130,12 +130,12 @@
 /*/
 #pragma omp declare simd linear(sin) linear(cos)
 void sincos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll_sincos"
+// AARCH64: "_ZGVnN2vl8l8_sincos"
 // AARCH64-NOT: sincos
 
 #pragma omp declare simd linear(sin : 1) linear(cos : 2)
 void SinCos(double in, double *sin, double *cos);
-// AARCH64: "_ZGVnN2vll2_SinCos"
+// AARCH64: "_ZGVnN2vl8l16_SinCos"
 // AARCH64-NOT: SinCos
 
 // Selection of tests based on the examples provided in chapter 5 of
@@ -158,7 +158,7 @@
 // Listing 6, p. 19
 #pragma omp declare simd linear(x) aligned(x : 16) simdlen(4)
 int foo4(int *x

[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-05-01 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 261610.
fpetrogalli added a comment.

I have added (indirect) testing of the values of NDS and WDS.
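The fixed-length SVE acceptance rule that those tests exercise can be sketched as follows (an assumed reading of the rule stated in the test comments):

```python
# X = WDS * N * 8 must be between 128 and 2048 bits and a multiple of
# 128 bits for simdlen(N) to produce a fixed-length SVE variant.
def sve_simdlen_is_valid(wds_bytes, n):
    x = wds_bytes * n * 8
    return 128 <= x <= 2048 and x % 128 == 0

# WDS = 1 (char): simdlen(16) and simdlen(256) are accepted,
# simdlen(8) is too narrow and simdlen(272) too wide.
print(sve_simdlen_is_valid(1, 16))   # True
print(sve_simdlen_is_valid(1, 256))  # True
print(sve_simdlen_is_valid(1, 8))    # False
print(sve_simdlen_is_valid(1, 272))  # False
```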


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78969/new/

https://reviews.llvm.org/D78969

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
  clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c

Index: clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Widest Data Size (WDS), as defined
+// in https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// WDS is used to check the accepted values <N> of `simdlen(<N>)` when
+// targeting fixed-length SVE vector function names. The values of
+// `<N>` that are accepted are such that for X = WDS * <N> * 8,
+// 128-bit <= X <= 2048-bit and X is a multiple of 128-bit.
+
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(16)
+#pragma omp declare simd simdlen(256)
+#pragma omp declare simd simdlen(272)
+char WDS_is_sizeof_char(char in);
+// WDS = 1, simdlen(8) and simdlen(272) are not generated.
+// CHECK-DAG: _ZGVsM16v_WDS_is_sizeof_char
+// CHECK-DAG: _ZGVsM256v_WDS_is_sizeof_char
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_char
+
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(128)
+#pragma omp declare simd simdlen(136)
+char WDS_is_sizeof_short(short in);
+// WDS = 2, simdlen(4) and simdlen(136) are not generated.
+// CHECK-DAG: _ZGVsM8v_WDS_is_sizeof_short
+// CHECK-DAG: _ZGVsM128v_WDS_is_sizeof_short
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_short
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(64)
+#pragma omp declare simd linear(sin) notinbranch simdlen(68)
+void WDS_is_sizeof_float_pointee(float in, float *sin);
+// WDS = 4, simdlen(2) and simdlen(68) are not generated.
+// CHECK-DAG: _ZGVsM4vl4_WDS_is_sizeof_float_pointee
+// CHECK-DAG: _ZGVsM64vl4_WDS_is_sizeof_float_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_float_pointee
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(32)
+#pragma omp declare simd linear(sin) notinbranch simdlen(34)
+void WDS_is_sizeof_double_pointee(float in, double *sin);
+// WDS = 8 because of the linear clause, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM4vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM32vl8_WDS_is_sizeof_double_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double_pointee
+
+#pragma omp declare simd simdlen(2)
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(32)
+#pragma omp declare simd simdlen(34)
+double WDS_is_sizeof_double(double in);
+// WDS = 8, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM4v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM32v_WDS_is_sizeof_double
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double
+
+static char C;
+static short S;
+static float F;
+static double D;
+
+void do_something() {
+  C = WDS_is_sizeof_char(C);
+  C = WDS_is_sizeof_short(S);
+  WDS_is_sizeof_float_pointee(F, &F);
+  WDS_is_sizeof_double_pointee(F, &D);
+  D = WDS_is_sizeof_double(D);
+}
Index: clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
@@ -0,0 +1,82 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Narrowest Data Size (NDS), as defined in
+// https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// NDS is used to compute the <vlen> token in the name of AdvSIMD
+// vector functions when no `simdlen` is specified, with the rule:
+//
+// if NDS(f) = 1, then VLEN = 16, 8;
+// if NDS(f) = 2, then VLEN = 8, 4;
+// if NDS(f) = 4, t

[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-05-01 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 261611.
fpetrogalli added a comment.

Removed single-use variable definition.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78969/new/

https://reviews.llvm.org/D78969

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
  clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c

Index: clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Widest Data Size (WDS), as defined
+// in https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// WDS is used to check the accepted values <N> of `simdlen(<N>)` when
+// targeting fixed-length SVE vector function names. The values of
+// `<N>` that are accepted are such that for X = WDS * <N> * 8,
+// 128-bit <= X <= 2048-bit and X is a multiple of 128-bit.
+
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(16)
+#pragma omp declare simd simdlen(256)
+#pragma omp declare simd simdlen(272)
+char WDS_is_sizeof_char(char in);
+// WDS = 1, simdlen(8) and simdlen(272) are not generated.
+// CHECK-DAG: _ZGVsM16v_WDS_is_sizeof_char
+// CHECK-DAG: _ZGVsM256v_WDS_is_sizeof_char
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_char
+
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(128)
+#pragma omp declare simd simdlen(136)
+char WDS_is_sizeof_short(short in);
+// WDS = 2, simdlen(4) and simdlen(136) are not generated.
+// CHECK-DAG: _ZGVsM8v_WDS_is_sizeof_short
+// CHECK-DAG: _ZGVsM128v_WDS_is_sizeof_short
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_short
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(64)
+#pragma omp declare simd linear(sin) notinbranch simdlen(68)
+void WDS_is_sizeof_float_pointee(float in, float *sin);
+// WDS = 4, simdlen(2) and simdlen(68) are not generated.
+// CHECK-DAG: _ZGVsM4vl4_WDS_is_sizeof_float_pointee
+// CHECK-DAG: _ZGVsM64vl4_WDS_is_sizeof_float_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_float_pointee
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(32)
+#pragma omp declare simd linear(sin) notinbranch simdlen(34)
+void WDS_is_sizeof_double_pointee(float in, double *sin);
+// WDS = 8 because of the linear clause, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM4vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM32vl8_WDS_is_sizeof_double_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double_pointee
+
+#pragma omp declare simd simdlen(2)
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(32)
+#pragma omp declare simd simdlen(34)
+double WDS_is_sizeof_double(double in);
+// WDS = 8, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM4v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM32v_WDS_is_sizeof_double
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double
+
+static char C;
+static short S;
+static float F;
+static double D;
+
+void do_something() {
+  C = WDS_is_sizeof_char(C);
+  C = WDS_is_sizeof_short(S);
+  WDS_is_sizeof_float_pointee(F, &F);
+  WDS_is_sizeof_double_pointee(F, &D);
+  D = WDS_is_sizeof_double(D);
+}
Index: clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
@@ -0,0 +1,82 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Narrowest Data Size (NDS), as defined in
+// https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// NDS is used to compute the <vlen> token in the name of AdvSIMD
+// vector functions when no `simdlen` is specified, with the rule:
+//
+// if NDS(f) = 1, then VLEN = 16, 8;
+// if NDS(f) = 2, then VLEN = 8, 4;
+// if NDS(f) = 4, then VLEN = 4, 2;
+// i

[PATCH] D85977: [release][docs] Update contributions to LLVM 11 for SVE.

2020-08-14 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
Herald added subscribers: llvm-commits, cfe-commits, tschuett.
Herald added projects: clang, LLVM.
fpetrogalli requested review of this revision.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D85977

Files:
  clang/test/CodeGen/aarch64-sve-acle-example.cpp
  llvm/docs/ReleaseNotes.rst

Index: llvm/docs/ReleaseNotes.rst
===
--- llvm/docs/ReleaseNotes.rst
+++ llvm/docs/ReleaseNotes.rst
@@ -66,7 +66,9 @@
   added to describe the mapping between scalar functions and vector
   functions, to enable vectorization of call sites. The information
   provided by the attribute is interfaced via the API provided by the
-  ``VFDatabase`` class.
+  ``VFDatabase`` class. When scanning through the set of vector
+  functions associated with a scalar call, the loop vectorizer now
+  relies on ``VFDatabase``, instead of ``TargetLibraryInfo``.
 
 * `dereferenceable` attributes and metadata on pointers no longer imply
   anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@
   information. This information is used to represent Fortran modules debug
   info at IR level.
 
+* LLVM IR now supports two distinct ``llvm::FixedVectorType`` and
+  ``llvm::ScalableVectorType``, both derived from the base class
+  ``llvm::VectorType``. A number of algorithms dealing with IR vector
+  types have been updated to make sure they work for both scalable and
+  fixed vector types. Where possible, the code has been made generic
+  to cover both cases using the base class. Specifically, places that
+  were using the type ``unsigned`` to count the number of lanes of a
+  vector are now using ``llvm::ElementCount``. In places where
+  ``uint64_t`` was used to denote the size in bits of an IR type we
+  have partially migrated the codebase to using ``llvm::TypeSize``.
+
 Changes to building LLVM
 ------------------------
 
@@ -101,6 +114,55 @@
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Clang supports the following macros that enable the C-intrinsics
+  from the `Arm C language extensions for SVE
+  `_ (version
+  ``00bet5``, see section 2.1 for the list of intrinsics associated to
+  each macro):
+
+
+  =================================  =================
+  Preprocessor macro                 Target feature
+  =================================  =================
+  ``__ARM_FEATURE_SVE``              ``+sve``
+  ``__ARM_FEATURE_SVE_BF16``         ``+sve+bf16``
+  ``__ARM_FEATURE_SVE_MATMUL_FP32``  ``+sve+f32mm``
+  ``__ARM_FEATURE_SVE_MATMUL_FP64``  ``+sve+f64mm``
+  ``__ARM_FEATURE_SVE_MATMUL_INT8``  ``+sve+i8mm``
+  ``__ARM_FEATURE_SVE2``             ``+sve2``
+  ``__ARM_FEATURE_SVE2_AES``         ``+sve2-aes``
+  ``__ARM_FEATURE_SVE2_BITPERM``     ``+sve2-bitperm``
+  ``__ARM_FEATURE_SVE2_SHA3``        ``+sve2-sha3``
+  ``__ARM_FEATURE_SVE2_SM4``         ``+sve2-sm4``
+  =================================  =================
+
+  The macros enable users to write C/C++ `Vector Length Agnostic
+  (VLA)` loops that can be executed on any CPU that implements the
+  underlying instructions supported by the C intrinsics, independently
+  of the hardware vector register size.
+
+  For example, the ``__ARM_FEATURE_SVE`` macro is enabled when
+  targeting AArch64 code generation by setting ``-march=armv8-a+sve``
+  on the command line.
+
+  .. code-block:: c
+ :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+ // Compile with:
+ // `clang++ -march=armv8a+sve ...` (for c++)
+ // `clang -std=c11 -march=armv8a+sve ...` (for c)
+ #include <arm_sve.h>
+
+ void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+   for (unsigned i = 0; i < N; i += svcntd()) {
+ svbool_t Pg = svwhilelt_b64(i, N);
+ svfloat64_t vx = svld1(Pg, &x[i]);
+ svfloat64_t vy = svld1(Pg, &y[i]);
+ svfloat64_t vout = svadd_x(Pg, vx, vy);
+svst1(Pg, &out[i], vout);
+   }
+ }
+
 Changes to the MIPS Target
 --------------------------
 
Index: clang/test/CodeGen/aarch64-sve-acle-example.cpp
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-acle-example.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -x c++ -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=CPP
+// RUN: %clang -x c -std=c11 -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=C
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+// CPP-LABEL: _Z14VLA_add_arraysPdS_S_j:
+// C-LABEL: VLA_add_arrays:
+void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+  for (unsigned i = 0; i < N; i += svcntd()) {
+svbool_t Pg = svwhilelt_b64(i, N);
+  

[PATCH] D85977: [release][docs] Update contributions to LLVM 11 for SVE.

2020-08-14 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 285678.
fpetrogalli added a comment.

Added context to the diff.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85977/new/

https://reviews.llvm.org/D85977

Files:
  clang/test/CodeGen/aarch64-sve-acle-example.cpp
  llvm/docs/ReleaseNotes.rst

Index: llvm/docs/ReleaseNotes.rst
===
--- llvm/docs/ReleaseNotes.rst
+++ llvm/docs/ReleaseNotes.rst
@@ -66,7 +66,9 @@
   added to describe the mapping between scalar functions and vector
   functions, to enable vectorization of call sites. The information
   provided by the attribute is interfaced via the API provided by the
-  ``VFDatabase`` class.
+  ``VFDatabase`` class. When scanning through the set of vector
+  functions associated with a scalar call, the loop vectorizer now
+  relies on ``VFDatabase``, instead of ``TargetLibraryInfo``.
 
 * `dereferenceable` attributes and metadata on pointers no longer imply
   anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@
   information. This information is used to represent Fortran modules debug
   info at IR level.
 
+* LLVM IR now supports two distinct ``llvm::FixedVectorType`` and
+  ``llvm::ScalableVectorType``, both derived from the base class
+  ``llvm::VectorType``. A number of algorithms dealing with IR vector
+  types have been updated to make sure they work for both scalable and
+  fixed vector types. Where possible, the code has been made generic
+  to cover both cases using the base class. Specifically, places that
+  were using the type ``unsigned`` to count the number of lanes of a
+  vector are now using ``llvm::ElementCount``. In places where
+  ``uint64_t`` was used to denote the size in bits of an IR type we
+  have partially migrated the codebase to using ``llvm::TypeSize``.
+
 Changes to building LLVM
 ------------------------
 
@@ -101,6 +114,55 @@
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Clang supports the following macros that enable the C-intrinsics
+  from the `Arm C language extensions for SVE
+  `_ (version
+  ``00bet5``, see section 2.1 for the list of intrinsics associated to
+  each macro):
+
+
+  =================================  =================
+  Preprocessor macro                 Target feature
+  =================================  =================
+  ``__ARM_FEATURE_SVE``              ``+sve``
+  ``__ARM_FEATURE_SVE_BF16``         ``+sve+bf16``
+  ``__ARM_FEATURE_SVE_MATMUL_FP32``  ``+sve+f32mm``
+  ``__ARM_FEATURE_SVE_MATMUL_FP64``  ``+sve+f64mm``
+  ``__ARM_FEATURE_SVE_MATMUL_INT8``  ``+sve+i8mm``
+  ``__ARM_FEATURE_SVE2``             ``+sve2``
+  ``__ARM_FEATURE_SVE2_AES``         ``+sve2-aes``
+  ``__ARM_FEATURE_SVE2_BITPERM``     ``+sve2-bitperm``
+  ``__ARM_FEATURE_SVE2_SHA3``        ``+sve2-sha3``
+  ``__ARM_FEATURE_SVE2_SM4``         ``+sve2-sm4``
+  =================================  =================
+
+  The macros enable users to write C/C++ `Vector Length Agnostic
+  (VLA)` loops that can be executed on any CPU that implements the
+  underlying instructions supported by the C intrinsics, independently
+  of the hardware vector register size.
+
+  For example, the ``__ARM_FEATURE_SVE`` macro is enabled when
+  targeting AArch64 code generation by setting ``-march=armv8-a+sve``
+  on the command line.
+
+  .. code-block:: c
+ :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+ // Compile with:
+ // `clang++ -march=armv8a+sve ...` (for c++)
+ // `clang -std=c11 -march=armv8a+sve ...` (for c)
+ #include <arm_sve.h>
+
+ void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+   for (unsigned i = 0; i < N; i += svcntd()) {
+ svbool_t Pg = svwhilelt_b64(i, N);
+ svfloat64_t vx = svld1(Pg, &x[i]);
+ svfloat64_t vy = svld1(Pg, &y[i]);
+ svfloat64_t vout = svadd_x(Pg, vx, vy);
+svst1(Pg, &out[i], vout);
+   }
+ }
+
 Changes to the MIPS Target
 --------------------------
 
Index: clang/test/CodeGen/aarch64-sve-acle-example.cpp
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-acle-example.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -x c++ -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=CPP
+// RUN: %clang -x c -std=c11 -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=C
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+// CPP-LABEL: _Z14VLA_add_arraysPdS_S_j:
+// C-LABEL: VLA_add_arrays:
+void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+  for (unsigned i = 0; i < N; i += svcntd()) {
+svbool_t Pg = svwhilelt_b64(i, N);
+svfloat64_t vx = svld1(Pg, &x[i]);
+svfl

[PATCH] D85977: [release][docs] Update contributions to LLVM 11 for SVE.

2020-08-14 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 285731.
fpetrogalli added a comment.

Fix indentation of the code example.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85977/new/

https://reviews.llvm.org/D85977

Files:
  clang/test/CodeGen/aarch64-sve-acle-example.cpp
  llvm/docs/ReleaseNotes.rst

Index: llvm/docs/ReleaseNotes.rst
===
--- llvm/docs/ReleaseNotes.rst
+++ llvm/docs/ReleaseNotes.rst
@@ -66,7 +66,9 @@
   added to describe the mapping between scalar functions and vector
   functions, to enable vectorization of call sites. The information
   provided by the attribute is interfaced via the API provided by the
-  ``VFDatabase`` class.
+  ``VFDatabase`` class. When scanning through the set of vector
+  functions associated with a scalar call, the loop vectorizer now
+  relies on ``VFDatabase``, instead of ``TargetLibraryInfo``.
 
 * `dereferenceable` attributes and metadata on pointers no longer imply
   anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@
   information. This information is used to represent Fortran modules debug
   info at IR level.
 
+* LLVM IR now supports two distinct ``llvm::FixedVectorType`` and
+  ``llvm::ScalableVectorType``, both derived from the base class
+  ``llvm::VectorType``. A number of algorithms dealing with IR vector
+  types have been updated to make sure they work for both scalable and
+  fixed vector types. Where possible, the code has been made generic
+  to cover both cases using the base class. Specifically, places that
+  were using the type ``unsigned`` to count the number of lanes of a
+  vector are now using ``llvm::ElementCount``. In places where
+  ``uint64_t`` was used to denote the size in bits of an IR type we
+  have partially migrated the codebase to using ``llvm::TypeSize``.
+
 Changes to building LLVM
 ------------------------
 
@@ -101,6 +114,55 @@
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Clang supports the following macros that enable the C-intrinsics
+  from the `Arm C language extensions for SVE
+  `_ (version
+  ``00bet5``, see section 2.1 for the list of intrinsics associated to
+  each macro):
+
+
+  =================================  =================
+  Preprocessor macro                 Target feature
+  =================================  =================
+  ``__ARM_FEATURE_SVE``              ``+sve``
+  ``__ARM_FEATURE_SVE_BF16``         ``+sve+bf16``
+  ``__ARM_FEATURE_SVE_MATMUL_FP32``  ``+sve+f32mm``
+  ``__ARM_FEATURE_SVE_MATMUL_FP64``  ``+sve+f64mm``
+  ``__ARM_FEATURE_SVE_MATMUL_INT8``  ``+sve+i8mm``
+  ``__ARM_FEATURE_SVE2``             ``+sve2``
+  ``__ARM_FEATURE_SVE2_AES``         ``+sve2-aes``
+  ``__ARM_FEATURE_SVE2_BITPERM``     ``+sve2-bitperm``
+  ``__ARM_FEATURE_SVE2_SHA3``        ``+sve2-sha3``
+  ``__ARM_FEATURE_SVE2_SM4``         ``+sve2-sm4``
+  =================================  =================
+
+  The macros enable users to write C/C++ `Vector Length Agnostic
+  (VLA)` loops that can be executed on any CPU that implements the
+  underlying instructions supported by the C intrinsics, independently
+  of the hardware vector register size.
+
+  For example, the ``__ARM_FEATURE_SVE`` macro is enabled when
+  targeting AArch64 code generation by setting ``-march=armv8-a+sve``
+  on the command line.
+
+  .. code-block:: c
+ :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+ // Compile with:
+ // `clang++ -march=armv8a+sve ...` (for c++)
+ // `clang -std=c11 -march=armv8a+sve ...` (for c)
+ #include <arm_sve.h>
+
+ void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+   for (unsigned i = 0; i < N; i += svcntd()) {
+ svbool_t Pg = svwhilelt_b64(i, N);
+ svfloat64_t vx = svld1(Pg, &x[i]);
+ svfloat64_t vy = svld1(Pg, &y[i]);
+ svfloat64_t vout = svadd_x(Pg, vx, vy);
+ svst1(Pg, &out[i], vout);
+   }
+ }
+
 Changes to the MIPS Target
 --------------------------
 
Index: clang/test/CodeGen/aarch64-sve-acle-example.cpp
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-acle-example.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -x c++ -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=CPP
+// RUN: %clang -x c -std=c11 -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=C
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+// CPP-LABEL: _Z14VLA_add_arraysPdS_S_j:
+// C-LABEL: VLA_add_arrays:
+void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+  for (unsigned i = 0; i < N; i += svcntd()) {
+svbool_t Pg = svwhilelt_b64(i, N);

[PATCH] D86065: [SVE] Make ElementCount members private

2020-08-17 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: llvm/include/llvm/Support/TypeSize.h:56
 
+  friend bool operator>(const ElementCount &LHS, const ElementCount &RHS) {
+assert(LHS.Scalable == RHS.Scalable &&

I think that @ctetreau is right in 
https://reviews.llvm.org/D85794#inline-793909: we should not overload a 
comparison operator on this class, because the set it represents cannot be 
ordered.

Chris suggests writing a static function that can be used as a comparison 
operator, so that we make explicit what kind of comparison we are doing.
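
A minimal sketch of such a named comparison (hypothetical names and a simplified stand-in struct, not the actual LLVM API):

```cpp
#include <cassert>

// Simplified stand-in for llvm::ElementCount: a lane count that is either
// fixed or a runtime multiple of vscale. Mixed fixed/scalable values cannot
// be totally ordered, so instead of overloading operator> we expose an
// explicitly named comparison that asserts both operands have the same
// scalability.
struct ElementCount {
  unsigned Min;   // minimum number of lanes
  bool Scalable;  // true if the real count is a runtime multiple of Min

  static bool isKnownGT(const ElementCount &LHS, const ElementCount &RHS) {
    assert(LHS.Scalable == RHS.Scalable &&
           "cannot order a scalable count against a fixed one");
    return LHS.Min > RHS.Min;
  }
};
```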


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86065/new/

https://reviews.llvm.org/D86065

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D85977: [release][docs] Update contributions to LLVM 11 for SVE.

2020-08-17 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 286057.
fpetrogalli marked 3 inline comments as done.
fpetrogalli added a comment.

Thank you for the review @david-arm.

I have addressed all your comments, and I have also removed the
additional unit test that was checking that the example in the docs
was working. I decided to do so because I don't want to deal with
the intricacies of making sure that the use of `#include <arm_sve.h>`
when running the test on x86 hardware is correctly handled (for
reference, see the failure at
https://reviews.llvm.org/harbormaster/unit/view/160088/). All the code
generation needed by the example in the release notes is unit-tested in
the SVE ACLE tests, so the unit test for the example is not really
necessary.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85977/new/

https://reviews.llvm.org/D85977

Files:
  llvm/docs/ReleaseNotes.rst


Index: llvm/docs/ReleaseNotes.rst
===
--- llvm/docs/ReleaseNotes.rst
+++ llvm/docs/ReleaseNotes.rst
@@ -66,7 +66,9 @@
   added to describe the mapping between scalar functions and vector
   functions, to enable vectorization of call sites. The information
   provided by the attribute is interfaced via the API provided by the
-  ``VFDatabase`` class.
+  ``VFDatabase`` class. When scanning through the set of vector
+  functions associated with a scalar call, the loop vectorizer now
+  relies on ``VFDatabase``, instead of ``TargetLibraryInfo``.
 
 * `dereferenceable` attributes and metadata on pointers no longer imply
   anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@
   information. This information is used to represent Fortran modules debug
   info at IR level.
 
+* LLVM IR now supports two distinct ``llvm::FixedVectorType`` and
+  ``llvm::ScalableVectorType`` vector types, both derived from the
+  base class ``llvm::VectorType``. A number of algorithms dealing with
+  IR vector types have been updated to make sure they work for both
+  scalable and fixed vector types. Where possible, the code has been
+  made generic to cover both cases using the base class. Specifically,
+  places that were using the type ``unsigned`` to count the number of
+  lanes of a vector are now using ``llvm::ElementCount``. In places
+  where ``uint64_t`` was used to denote the size in bits of an IR type
+  we have partially migrated the codebase to using ``llvm::TypeSize``.
+
 Changes to building LLVM
 ------------------------
 
@@ -101,6 +114,55 @@
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Clang adds support for the following macros that enable the
+  C-intrinsics from the `Arm C language extensions for SVE
+  `_ (version
+  ``00bet5``, see section 2.1 for the list of intrinsics associated to
+  each macro):
+
+
+  =================================  =================
+  Preprocessor macro                 Target feature
+  =================================  =================
+  ``__ARM_FEATURE_SVE``              ``+sve``
+  ``__ARM_FEATURE_SVE_BF16``         ``+sve+bf16``
+  ``__ARM_FEATURE_SVE_MATMUL_FP32``  ``+sve+f32mm``
+  ``__ARM_FEATURE_SVE_MATMUL_FP64``  ``+sve+f64mm``
+  ``__ARM_FEATURE_SVE_MATMUL_INT8``  ``+sve+i8mm``
+  ``__ARM_FEATURE_SVE2``             ``+sve2``
+  ``__ARM_FEATURE_SVE2_AES``         ``+sve2-aes``
+  ``__ARM_FEATURE_SVE2_BITPERM``     ``+sve2-bitperm``
+  ``__ARM_FEATURE_SVE2_SHA3``        ``+sve2-sha3``
+  ``__ARM_FEATURE_SVE2_SM4``         ``+sve2-sm4``
+  =================================  =================
+
+  The macros enable users to write C/C++ `Vector Length Agnostic
+  (VLA)` loops that can be executed on any CPU that implements the
+  underlying instructions supported by the C intrinsics, independently
+  of the hardware vector register size.
+
+  For example, the ``__ARM_FEATURE_SVE`` macro is enabled when
+  targeting AArch64 code generation by setting ``-march=armv8-a+sve``
+  on the command line.
+
+  .. code-block:: c
+ :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+ // Compile with:
+ // `clang++ -march=armv8a+sve ...` (for c++)
+ // `clang -std=c11 -march=armv8a+sve ...` (for c)
+ #include <arm_sve.h>
+
+ void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+   for (unsigned i = 0; i < N; i += svcntd()) {
+ svbool_t Pg = svwhilelt_b64(i, N);
+ svfloat64_t vx = svld1(Pg, &x[i]);
+ svfloat64_t vy = svld1(Pg, &y[i]);
+ svfloat64_t vout = svadd_x(Pg, vx, vy);
+ svst1(Pg, &out[i], vout);
+   }
+ }
+
 Changes to the MIPS Target
 --------------------------
 


Index: llvm/docs/ReleaseNotes.rst
===
--- llvm/docs/ReleaseNotes.rst
+++ llvm

[PATCH] D78190: Add Bfloat IR type

2020-05-04 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Hi @stuij,

thank you for working on this!

Admittedly, I don't know much about the Asm parser, but I have left some 
comments anyway.

1. Shouldn't we also test that the parser is happy with the following 
expressions?

  bfloat *
  %... = fadd  %..., %... 
  similar for 

Or is this not needed, or left to be done in a separate patch?

2. Would it make sense to split this patch into 2 separate patches? One that 
defines the enums and interfaces for `bfloat`, and one that does the actual 
parsing/emission in the IR? I suspect there is too much intertwining going on, 
so probably not - in that case, I am happy for everything to go via a single patch.

3. Do you need those changes in the Hexagon and x86 backend? Could they be 
submitted separately, with some testing?

Kind regards,

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190





[PATCH] D78190: Add Bfloat IR type

2020-05-05 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli accepted this revision.
fpetrogalli added a comment.

Hi @stuij,

thank you for adding the vector tests. I can say the patch LGTM now :)

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190





[PATCH] D78969: [clang][OpenMP] Fix getNDSWDS for aarch64.

2020-05-05 Thread Francesco Petrogalli via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG4fa13a3dac1e: [clang][OpenMP] Fix getNDSWDS for aarch64. 
(authored by fpetrogalli).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78969/new/

https://reviews.llvm.org/D78969

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
  clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c

Index: clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_WidestDataSize.c
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +sve  -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Widest Data Size (WDS), as defined
+// in https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// WDS is used to check the accepted values of `simdlen()` when
+// targeting fixed-length SVE vector function names. The values of
+// `<simdlen>` that are accepted are such that for X = WDS * <simdlen> * 8,
+// 128-bit <= X <= 2048-bit and X is a multiple of 128-bit.
+
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(16)
+#pragma omp declare simd simdlen(256)
+#pragma omp declare simd simdlen(272)
+char WDS_is_sizeof_char(char in);
+// WDS = 1, simdlen(8) and simdlen(272) are not generated.
+// CHECK-DAG: _ZGVsM16v_WDS_is_sizeof_char
+// CHECK-DAG: _ZGVsM256v_WDS_is_sizeof_char
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_char
+
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(8)
+#pragma omp declare simd simdlen(128)
+#pragma omp declare simd simdlen(136)
+char WDS_is_sizeof_short(short in);
+// WDS = 2, simdlen(4) and simdlen(136) are not generated.
+// CHECK-DAG: _ZGVsM8v_WDS_is_sizeof_short
+// CHECK-DAG: _ZGVsM128v_WDS_is_sizeof_short
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_short
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(64)
+#pragma omp declare simd linear(sin) notinbranch simdlen(68)
+void WDS_is_sizeof_float_pointee(float in, float *sin);
+// WDS = 4, simdlen(2) and simdlen(68) are not generated.
+// CHECK-DAG: _ZGVsM4vl4_WDS_is_sizeof_float_pointee
+// CHECK-DAG: _ZGVsM64vl4_WDS_is_sizeof_float_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_float_pointee
+
+#pragma omp declare simd linear(sin) notinbranch simdlen(2)
+#pragma omp declare simd linear(sin) notinbranch simdlen(4)
+#pragma omp declare simd linear(sin) notinbranch simdlen(32)
+#pragma omp declare simd linear(sin) notinbranch simdlen(34)
+void WDS_is_sizeof_double_pointee(float in, double *sin);
+// WDS = 8 because of the linear clause, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM4vl8_WDS_is_sizeof_double_pointee
+// CHECK-DAG: _ZGVsM32vl8_WDS_is_sizeof_double_pointee
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double_pointee
+
+#pragma omp declare simd simdlen(2)
+#pragma omp declare simd simdlen(4)
+#pragma omp declare simd simdlen(32)
+#pragma omp declare simd simdlen(34)
+double WDS_is_sizeof_double(double in);
+// WDS = 8, simdlen(34) is not generated.
+// CHECK-DAG: _ZGVsM2v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM4v_WDS_is_sizeof_double
+// CHECK-DAG: _ZGVsM32v_WDS_is_sizeof_double
+// CHECK-NOT: _ZGV{{.*}}_WDS_is_sizeof_double
+
+static char C;
+static short S;
+static float F;
+static double D;
+
+void do_something() {
+  C = WDS_is_sizeof_char(C);
+  C = WDS_is_sizeof_short(S);
+  WDS_is_sizeof_float_pointee(F, &F);
+  WDS_is_sizeof_double_pointee(F, &D);
+  D = WDS_is_sizeof_double(D);
+}
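The simdlen acceptance rule quoted earlier in this test (X = WDS * simdlen * 8 must lie in [128, 2048] bits and be a multiple of 128) can be sketched as a small predicate; this is an illustration only, not the actual clang code:

```cpp
#include <cassert>

// For fixed-length SVE mangling, a user-requested simdlen <N> is accepted
// only when the resulting vector size X = WDS * N * 8 (in bits) lies in
// [128, 2048] and is a multiple of 128. This predicate reproduces which
// simdlen values the test above expects to be mangled.
static bool isAcceptedSimdlen(unsigned WDS, unsigned N) {
  unsigned X = WDS * N * 8;
  return X >= 128 && X <= 2048 && X % 128 == 0;
}
```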
Index: clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
===
--- /dev/null
+++ clang/test/OpenMP/aarch64_vfabi_NarrowestDataSize.c
@@ -0,0 +1,82 @@
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp  -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fopenmp-simd -x c -emit-llvm %s -o - -femit-all-decls | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+// Note: -fopenmp and -fopenmp-simd behavior are expected to be the same.
+
+// This test checks the values of Narrowest Data Size (NDS), as defined in
+// https://github.com/ARM-software/abi-aa/tree/master/vfabia64
+//
+// NDS is used to compute the <vlen> token in the name of AdvSIMD
+// vector functions when no `simdlen` is specified, with the rule:
+//
+// if NDS(f) = 1, then VLEN = 16, 8;
+// if NDS(f) = 2, then VLEN = 8, 4;

[PATCH] D79639: [SveEmitter] Builtins for SVE matrix multiply `mmla`.

2020-05-08 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli created this revision.
fpetrogalli added reviewers: sdesmalen, kmclaughlin, efriedma.
Herald added subscribers: cfe-commits, kristof.beyls, tschuett.
Herald added a reviewer: rengolin.
Herald added a project: clang.
fpetrogalli added a parent revision: D79638: [llvm][SVE] IR intrinsics for 
matrix multiplication instructions..
fpetrogalli updated this revision to Diff 262923.
fpetrogalli added a comment.

I replaced the lines `Signed = !Signed` in the tablegen emitter with `Signed = 
false`.


Guarded by __ARM_FEATURE_SVE_MATMUL_INT8:

- svmmla_u32
- svmmla_s32
- svusmmla_s32

Guarded by __ARM_FEATURE_SVE_MATMUL_FP32:

- svmmla_f32

Guarded by __ARM_FEATURE_SVE_MATMUL_FP64:

- svmmla_f64

Extra change: replace one use of auto with the type returned by the
function (NFC).


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D79639

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -513,6 +513,11 @@
   case 'q':
 ElementBitwidth /= 4;
 break;
+  case 'b':
+Signed = false;
+Float = false;
+ElementBitwidth /= 4;
+break;
   case 'o':
 ElementBitwidth *= 4;
 break;
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s -DSVE_OVERLOADED_FORMS| FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svint32_t test_svmmla_s32(svint32_t x, svint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.smmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _s32, , )(x, y, z);
+}
+
+svuint32_t test_svmmla_u32(svuint32_t x, svuint8_t y, svuint8_t z) {
+  // CHECK-LABEL: test_svmmla_u32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.ummla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _u32, , )(x, y, z);
+}
+
+svint32_t test_svusmmla_s32(svint32_t x, svuint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svusmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.usmmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svusmmla, _s32, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s -DSVE_OVERLOADED_FORMS | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svfloat64_t test_svmmla_f64(svfloat64_t x, svfloat64_t y, svfloat64_t z) {
+  // CHECK-LABEL: test_svmmla_f64
+  // CHECK: %[[RET:.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.mmla.nxv2f64(<vscale x 2 x double> %x, <vscale x 2 x double> %y, <vscale x 2 x double> %z)
+  // CHECK: ret <vscale x 2 x double> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _f64, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intri

[PATCH] D79639: [SveEmitter] Builtins for SVE matrix multiply `mmla`.

2020-05-08 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 262923.
fpetrogalli added a comment.

I replaced the lines `Signed = !Signed` in the tablegen emitter with `Signed = 
false`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79639/new/

https://reviews.llvm.org/D79639

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -513,6 +513,11 @@
   case 'q':
 ElementBitwidth /= 4;
 break;
+  case 'b':
+Signed = false;
+Float = false;
+ElementBitwidth /= 4;
+break;
   case 'o':
 ElementBitwidth *= 4;
 break;
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s -DSVE_OVERLOADED_FORMS| FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svint32_t test_svmmla_s32(svint32_t x, svint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.smmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _s32, , )(x, y, z);
+}
+
+svuint32_t test_svmmla_u32(svuint32_t x, svuint8_t y, svuint8_t z) {
+  // CHECK-LABEL: test_svmmla_u32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.ummla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _u32, , )(x, y, z);
+}
+
+svint32_t test_svusmmla_s32(svint32_t x, svuint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svusmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.usmmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svusmmla, _s32, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s -DSVE_OVERLOADED_FORMS | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svfloat64_t test_svmmla_f64(svfloat64_t x, svfloat64_t y, svfloat64_t z) {
+  // CHECK-LABEL: test_svmmla_f64
+  // CHECK: %[[RET:.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.mmla.nxv2f64(<vscale x 2 x double> %x, <vscale x 2 x double> %y, <vscale x 2 x double> %z)
+  // CHECK: ret <vscale x 2 x double> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _f64, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP32 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// RUN:-emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP32 \
+// RUN:-triple aarch64-none-linux-gnu -target-feature +sve \
+// RUN:-fallow-half-arguments-and-returns -S -O1 -Werror -Wall \
+// R

[PATCH] D79639: [SveEmitter] Builtins for SVE matrix multiply `mmla`.

2020-05-11 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Adding some text here as phabricator refuses to submit with an empty comment...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79639/new/

https://reviews.llvm.org/D79639



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79639: [SveEmitter] Builtins for SVE matrix multiply `mmla`.

2020-05-11 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli updated this revision to Diff 263239.
fpetrogalli marked 4 inline comments as done.
fpetrogalli added a comment.

Thank you for the review @sdesmalen, I have addressed all your comments.

Francesco


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79639/new/

https://reviews.llvm.org/D79639

Files:
  clang/include/clang/Basic/arm_sve.td
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
  clang/utils/TableGen/SveEmitter.cpp

Index: clang/utils/TableGen/SveEmitter.cpp
===
--- clang/utils/TableGen/SveEmitter.cpp
+++ clang/utils/TableGen/SveEmitter.cpp
@@ -513,6 +513,11 @@
   case 'q':
 ElementBitwidth /= 4;
 break;
+  case 'b':
+Signed = false;
+Float = false;
+ElementBitwidth /= 4;
+break;
   case 'o':
 ElementBitwidth *= 4;
 break;
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_mmla.c
@@ -0,0 +1,32 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_INT8 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svint32_t test_svmmla_s32(svint32_t x, svint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.smmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _s32, , )(x, y, z);
+}
+
+svuint32_t test_svmmla_u32(svuint32_t x, svuint8_t y, svuint8_t z) {
+  // CHECK-LABEL: test_svmmla_u32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.ummla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _u32, , )(x, y, z);
+}
+
+svint32_t test_svusmmla_s32(svint32_t x, svuint8_t y, svint8_t z) {
+  // CHECK-LABEL: test_svusmmla_s32
+  // CHECK: %[[RET:.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.usmmla.nxv4i32(<vscale x 4 x i32> %x, <vscale x 16 x i8> %y, <vscale x 16 x i8> %z)
+  // CHECK: ret <vscale x 4 x i32> %[[RET]]
+  return SVE_ACLE_FUNC(svusmmla, _s32, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp64.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP64 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svfloat64_t test_svmmla_f64(svfloat64_t x, svfloat64_t y, svfloat64_t z) {
+  // CHECK-LABEL: test_svmmla_f64
+  // CHECK: %[[RET:.*]] = call <vscale x 2 x double> @llvm.aarch64.sve.mmla.nxv2f64(<vscale x 2 x double> %x, <vscale x 2 x double> %y, <vscale x 2 x double> %z)
+  // CHECK: ret <vscale x 2 x double> %[[RET]]
+  return SVE_ACLE_FUNC(svmmla, _f64, , )(x, y, z);
+}
Index: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_matmul_fp32.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP32 -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -D__ARM_FEATURE_SVE -D__ARM_FEATURE_SVE_MATMUL_FP32 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sve -fallow-half-arguments-and-returns -S -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1, A2_UNUSED, A3, A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1, A2, A3, A4) A1##A2##A3##A4
+#endif
+
+svfloat32_t test_svmmla_f32(svfloat32_t x, svfloat32_t y, svfloat32_

[PATCH] D79708: [clang][BFloat] add NEON emitter for bfloat

2020-05-13 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added a comment.

Hi @stuij ,

thank you for working on this.

Would it make sense to add a test that includes the header file you have 
created?

Regards,

Francesco




Comment at: clang/include/clang/Basic/arm_neon_incl.td:293
+
+  string CartesianProductWith = "";
 }

What is this for?



Comment at: clang/utils/TableGen/NeonEmitter.cpp:628
 S += "x" + utostr(getNumElements());
+
   if (NumVectors > 1)

Remove me.



Comment at: clang/utils/TableGen/NeonEmitter.cpp:2198
 
+static void emitNeonTypeDefs(const std::string& types, raw_ostream &OS) {
+  std::string TypedefTypes(types);

Is this related to the changes for bfloat, or is it just a refactoring that is 
nice to have? If the latter, please consider submitting it as a separate 
patch. If it is both a refactoring and BF16 related, it is currently not 
possible to see which changes are BF16 specific, so please submit the 
refactoring first.



Comment at: clang/utils/TableGen/NeonEmitter.cpp:2617
+
+  OS << "#endif";
 }

Missing `\n`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79708/new/

https://reviews.llvm.org/D79708





[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-13 Thread Francesco Petrogalli via Phabricator via cfe-commits
fpetrogalli added inline comments.



Comment at: clang/test/CodeGen/aarch64-bf16-getset-intrinsics.c:119-120
+// CHECK-LABEL: test_vduph_laneq_bf16
+// CHECK64: %vgetq_lane = extractelement <8 x bfloat> %v, i32 7
+// CHECK32: %vget_lane = extractelement <8 x bfloat> %v, i32 7

This seems to be the only place where you need to differentiate between check32 
and check64, and I am not 100% sure the extra `q` in the name of the variable 
is relevant in terms of codegen testing.

Maybe you can just test both aarch32 and aarch64 with the same `CHECK` prefix?
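Since only the name of the extracted temporary differs between the two targets, a regex that ignores that name would let one default `CHECK` prefix cover both RUN lines. A hypothetical sketch of what the shared check could look like (the test's real cc1 invocations and flags would be kept unchanged):

```c
// Hypothetical shared check: match the extractelement regardless of the
// %vget_lane / %vgetq_lane temporary name emitted per target.
// CHECK-LABEL: test_vduph_laneq_bf16
// CHECK: {{%.*}} = extractelement <8 x bfloat> %v, i32 7
bfloat16_t test_vduph_laneq_bf16(bfloat16x8_t v) {
  return vduph_laneq_bf16(v, 7);
}
```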


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710




