from:"Brendan Dahl via cfe-commits"

[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)

2023-11-29 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

@efriedma-quic I missed your comment. I don't have commit access; can you merge 
this for me?

Thanks!

https://github.com/llvm/llvm-project/pull/66716
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)

2023-09-18 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/66716

Previously, annotations were only emitted for function definitions. With this 
change, annotations are also emitted for declarations. In addition, emission of 
function annotations is now deferred until the end (in `EmitGlobalAnnotations`) 
so that the most up-to-date declaration, which carries any inherited 
annotations, is used.
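
As a rough illustration of the deferral described above, here is a minimal standalone C++ sketch. It does not use Clang's classes: `Decl`, `Module`, `note`, and `emitAll` are hypothetical names invented for this example, loosely standing in for a function declaration, `CodeGenModule`, the places where `DeferredAnnotations` is updated, and `EmitGlobalAnnotations`.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one (re)declaration of a function. A later
// redeclaration is assumed to list inherited annotations plus its own.
struct Decl {
  std::string name;
  std::vector<std::string> annotations;
};

// Hypothetical model of the bookkeeping; not Clang's actual API.
class Module {
  std::map<std::string, const Decl *> deferred; // name -> latest declaration
  std::vector<std::string> emitted;             // "name:annotation" records

public:
  // Record a declaration or definition. A later call for the same name
  // overwrites the earlier entry, so only the newest declaration survives.
  void note(const Decl &d) {
    if (!d.annotations.empty())
      deferred[d.name] = &d;
  }

  // Emit everything once, at the end, from the newest declarations.
  const std::vector<std::string> &emitAll() {
    for (const auto &entry : deferred)
      for (const auto &a : entry.second->annotations)
        emitted.push_back(entry.first + ":" + a);
    deferred.clear();
    return emitted;
  }
};
```

Because emission happens only in `emitAll`, a redeclaration seen after the first use still contributes its annotations, which is the behaviour the description is after.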

From 846deb6e2055a8e458530c9e27bbd512a68deb5c Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 12 Sep 2023 12:53:24 -0700
Subject: [PATCH] [clang][CodeGen] Emit annotations for function declarations.

Previously, annotations were only emitted for function definitions. With
this change annotations are also emitted for declarations. Also, emitting
function annotations is now deferred until the end so that the most
up to date declaration is used which will have any inherited annotations.
---
 clang/lib/CodeGen/CodeGenModule.cpp   | 23 +--
 clang/lib/CodeGen/CodeGenModule.h |  4 
 .../test/CodeGen/annotations-decl-use-decl.c  | 16 +
 .../CodeGen/annotations-decl-use-define.c | 16 +
 clang/test/CodeGen/annotations-declaration.c  | 17 ++
 clang/test/CodeGen/annotations-global.c   |  8 +++
 .../CodeGenCXX/attr-annotate-constructor.cpp  | 10 
 .../CodeGenCXX/attr-annotate-destructor.cpp   | 10 
 clang/test/CodeGenCXX/attr-annotate.cpp   |  6 ++---
 9 files changed, 101 insertions(+), 9 deletions(-)
 create mode 100644 clang/test/CodeGen/annotations-decl-use-decl.c
 create mode 100644 clang/test/CodeGen/annotations-decl-use-define.c
 create mode 100644 clang/test/CodeGen/annotations-declaration.c
 create mode 100644 clang/test/CodeGenCXX/attr-annotate-constructor.cpp
 create mode 100644 clang/test/CodeGenCXX/attr-annotate-destructor.cpp

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 8b0c9340775cbe9..5108e6c91bfb30c 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -697,6 +697,7 @@ void CodeGenModule::checkAliases() {
 void CodeGenModule::clear() {
   DeferredDeclsToEmit.clear();
   EmittedDeferredDecls.clear();
+  DeferredAnnotations.clear();
   if (OpenMPRuntime)
     OpenMPRuntime->clear();
 }
@@ -3093,6 +3094,13 @@ void CodeGenModule::EmitVTablesOpportunistically() {
 }
 
 void CodeGenModule::EmitGlobalAnnotations() {
+  for (const auto& [MangledName, VD] : DeferredAnnotations) {
+    llvm::GlobalValue *GV = GetGlobalValue(MangledName);
+    if (GV)
+      AddGlobalAnnotations(VD, GV);
+  }
+  DeferredAnnotations.clear();
+
   if (Annotations.empty())
     return;
 
@@ -3597,6 +3605,14 @@ void CodeGenModule::EmitGlobal(GlobalDecl GD) {
 
   // Ignore declarations, they will be emitted on their first use.
   if (const auto *FD = dyn_cast<FunctionDecl>(Global)) {
+    // Update deferred annotations with the latest declaration if the
+    // function was already used or defined.
+    if (FD->hasAttr<AnnotateAttr>()) {
+      StringRef MangledName = getMangledName(GD);
+      if (GetGlobalValue(MangledName))
+        DeferredAnnotations[MangledName] = FD;
+    }
+
     // Forward declarations are emitted lazily on first use.
     if (!FD->doesThisDeclarationHaveABody()) {
       if (!FD->doesDeclarationForceExternallyVisibleDefinition())
@@ -4370,6 +4386,11 @@ llvm::Constant *CodeGenModule::GetOrCreateLLVMFunction(
       llvm::Function::Create(FTy, llvm::Function::ExternalLinkage,
                              Entry ? StringRef() : MangledName, &getModule());
 
+  // Store the declaration associated with this function so it is potentially
+  // updated by further declarations or definitions and emitted at the end.
+  if (D && D->hasAttr<AnnotateAttr>())
+    DeferredAnnotations[MangledName] = cast<ValueDecl>(D);
+
   // If we already created a function with the same mangled name (but different
   // type) before, take its name and add it to the list of functions to be
   // replaced with F at the end of CodeGen.
@@ -5664,8 +5685,6 @@ void CodeGenModule::EmitGlobalFunctionDefinition(GlobalDecl GD,
     AddGlobalCtor(Fn, CA->getPriority());
   if (const DestructorAttr *DA = D->getAttr<DestructorAttr>())
     AddGlobalDtor(Fn, DA->getPriority(), true);
-  if (D->hasAttr<AnnotateAttr>())
-    AddGlobalAnnotations(D, Fn);
   if (getLangOpts().OpenMP && D->hasAttr<OMPDeclareTargetDeclAttr>())
     getOpenMPRuntime().emitDeclareTargetFunction(D, GV);
 }
diff --git a/clang/lib/CodeGen/CodeGenModule.h 
b/clang/lib/CodeGen/CodeGenModule.h
index 073b471c6e3cc11..8b0d68afbd0ecd2 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -431,6 +431,10 @@ class CodeGenModule : public CodeGenTypeCache {
   /// Global annotations.
   std::vector<llvm::Constant*> Annotations;
 
+  // Store deferred function annotations so they can be emitted at the end with
+  // most up to date ValueDecl that will have all the inherited annotations.
+  llvm::DenseMap<StringRef, const ValueDecl *> DeferredAnnotations;
+
   /// Map used to get unique annotation strings.
   llvm::St

[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)

2023-09-18 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

This is relanding the patch from [here](https://reviews.llvm.org/D156172). It 
fixes the [backout 
failure](https://reviews.llvm.org/rG88b7e06dcf9723d0869b0c6bee030b4140e4366d) 
and adds a test for it.

https://github.com/llvm/llvm-project/pull/66716

[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)

2023-09-19 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

@efriedma-quic could you re-review? The only changes were 
https://github.com/llvm/llvm-project/pull/66716/files#diff-e724febedab9c1a2832bf2056d208ff02ddcb2e6f90b5a653afc9b19ac78a5d7R3098-R3100

https://github.com/llvm/llvm-project/pull/66716

[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)

2023-10-17 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

@AaronBallman or @efriedma-quic ping: are you able to add reviewers?

https://github.com/llvm/llvm-project/pull/66716

[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)

2024-09-10 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/108116

Building with no optimizations resulted in failures since the lane constant 
wasn't a constant in LLVM IR.
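
To make the motivation concrete, here is a small portable C++ analogy, not the code from `wasm_simd128.h`. Inside an inline function the lane index is an ordinary parameter, so at `-O0` nothing guarantees it reaches the builtin as a literal. A macro pastes the caller's constant straight into the call. In this sketch `Vec8`, `extract_lane_impl`, `replace_lane_impl`, `EXTRACT_LANE`, and `REPLACE_LANE` are all hypothetical names, and a template parameter plays the role of the "must be an integer constant" requirement.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for an f16x8 vector; lanes are widened to float here.
using Vec8 = std::array<float, 8>;

// The lane is a template parameter, so it has to be a constant expression
// at the call site, independent of the optimization level.
template <int Lane>
float extract_lane_impl(const Vec8 &v) {
  static_assert(Lane >= 0 && Lane < 8, "lane index must be in [0, 8)");
  return v[static_cast<std::size_t>(Lane)];
}

template <int Lane>
Vec8 replace_lane_impl(Vec8 v, float x) {
  static_assert(Lane >= 0 && Lane < 8, "lane index must be in [0, 8)");
  v[static_cast<std::size_t>(Lane)] = x;
  return v;
}

// Macro wrappers: the lane argument is substituted textually, so a literal
// such as 7 stays a literal in the expanded call.
#define EXTRACT_LANE(v, i) (extract_lane_impl<(i)>(v))
#define REPLACE_LANE(v, i, x) (replace_lane_impl<(i)>((v), (x)))
```

Passing a runtime variable as the lane, for example `EXTRACT_LANE(v, n)` with a non-constant `int n`, fails to compile in this sketch, which mirrors the intent of requiring a constant integer for the builtins.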

From 3b813cd5b0555e6b654f575140e4db9a57ed699a Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 10 Sep 2024 21:52:55 +
Subject: [PATCH] [WebAssembly] Change F16x8 extract lane to require constant
 integer.

Building with no optimizations resulted in failures since the lane
constant wasn't a constant in LLVM IR.
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  4 ++--
 clang/lib/Headers/wasm_simd128.h  | 19 ---
 clang/test/CodeGen/builtins-wasm.c| 12 ++--
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2e80eef2c8b9bc..ad73f031922a0b 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -209,8 +209,8 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hIi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hIif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 67d12f6f2cf419..947bb9fe23029e 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }
 
-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {
-  return __builtin_wasm_extract_lane_f16x8((__f16x8)__a, __i);
-}
+#ifdef __wasm_fp16__
 
-static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_replace_lane(v128_t __a,
-                                                                 int __i,
-                                                                 float __b)
-    __REQUIRE_CONSTANT(__i) {
-  return (v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)__a, __i, __b);
-}
+#define wasm_f16x8_extract_lane(__a, __i) \
+  (__builtin_wasm_extract_lane_f16x8((__f16x8)(__a), __i))
+
+#define wasm_f16x8_replace_lane(__a, __i, __b) \
+  ((v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)(__a), __i, __b))
+
+#endif
 
 static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_abs(v128_t __a) {
   return (v128_t)__builtin_wasm_abs_f16x8((__f16x8)__a);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index 3010b8954f1c2e..8943a92faad044 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -834,16 +834,16 @@ f16x8 splat_f16x8(float a) {
   return __builtin_wasm_splat_f16x8(a);
 }
 
-float extract_lane_f16x8(f16x8 a, int i) {
-  // WEBASSEMBLY:  %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 %i)
+float extract_lane_f16x8(f16x8 a) {
+  // WEBASSEMBLY:  %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 7)
   // WEBASSEMBLY-NEXT: ret float %0
-  return __builtin_wasm_extract_lane_f16x8(a, i);
+  return __builtin_wasm_extract_lane_f16x8(a, 7);
 }
 
-f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
-  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+f16x8 replace_lane_f16x8(f16x8 a, float v) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 7, float %v)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
-  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+  return __builtin_wasm_replace_lane_f16x8(a, 7, v);
 }
 
 f16x8 min_f16x8(f16x8 a, f16x8 b) {


[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)

2024-09-11 Thread Brendan Dahl via cfe-commits



@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }
 
-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {

brendandahl wrote:

It does require a constant in C code, but in a no-opt build it is not a 
constant in LLVM IR. 

https://github.com/llvm/llvm-project/pull/108116

[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)

2024-06-11 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/95151

Implemented with intrinsics and builtins.

Specified at:
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
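
For readers unfamiliar with these operations, the following is a lane-wise reference model in plain C++. It assumes the usual relaxed-SIMD definitions, `madd(a, b, c) = a * b + c` and `nmadd(a, b, c) = -(a * b) + c`, and it uses `float` lanes instead of real half-precision values. `Vec8`, `madd`, and `nmadd` are hypothetical names for this sketch, not the builtins added by the patch.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for an f16x8 vector; lanes are widened to float here.
using Vec8 = std::array<float, 8>;

// Lane-wise multiply-add: r[i] = a[i] * b[i] + c[i].
// Whether the hardware fuses the multiply and add is left open, which is
// why the real instructions are "relaxed".
Vec8 madd(const Vec8 &a, const Vec8 &b, const Vec8 &c) {
  Vec8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] * b[i] + c[i];
  return r;
}

// Lane-wise negated multiply-add: r[i] = -(a[i] * b[i]) + c[i].
Vec8 nmadd(const Vec8 &a, const Vec8 &b, const Vec8 &c) {
  Vec8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = -(a[i] * b[i]) + c[i];
  return r;
}
```

With small integer inputs the fused and unfused results coincide, so the model is exact for simple checks; for general inputs the last bit may differ between implementations.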

From fd5ea6036e97e504e3286d218fe6b966e5bead82 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 11 Jun 2024 17:12:09 +
Subject: [PATCH] [WebAssembly] Implement f16x8 madd and nmadd instructions.

Implemented with intrinsics and builtins.

Specified at:
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  2 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  4 +++
 clang/test/CodeGen/builtins-wasm.c| 14 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 27 ++-
 llvm/test/MC/WebAssembly/simd-encodings.s |  6 +
 5 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 4e48ff48b60f5..2a45f8a6582a2 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -170,6 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, 
"V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", 
"relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", 
"relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", 
"relaxed-simd")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", 
"half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", 
"half-precision")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, 
"V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", 
"relaxed-simd")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 06e201fa71e6f..511e1fd4016d7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21149,6 +21149,8 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_shuffle);
 return Builder.CreateCall(Callee, Ops);
   }
+  case WebAssembly::BI__builtin_wasm_relaxed_madd_f16x8:
+  case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f16x8:
   case WebAssembly::BI__builtin_wasm_relaxed_madd_f32x4:
   case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f32x4:
   case WebAssembly::BI__builtin_wasm_relaxed_madd_f64x2:
@@ -21158,10 +21160,12 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Value *C = EmitScalarExpr(E->getArg(2));
 unsigned IntNo;
 switch (BuiltinID) {
+case WebAssembly::BI__builtin_wasm_relaxed_madd_f16x8:
 case WebAssembly::BI__builtin_wasm_relaxed_madd_f32x4:
 case WebAssembly::BI__builtin_wasm_relaxed_madd_f64x2:
   IntNo = Intrinsic::wasm_relaxed_madd;
   break;
+case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f16x8:
 case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f32x4:
 case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f64x2:
   IntNo = Intrinsic::wasm_relaxed_nmadd;
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index d6ee4f68700dc..75861b1b4bd6d 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -690,6 +690,20 @@ f64x2 nmadd_f64x2(f64x2 a, f64x2 b, f64x2 c) {
   // WEBASSEMBLY-NEXT: ret
 }
 
+f16x8 madd_f16x8(f16x8 a, f16x8 b, f16x8 c) {
+  return __builtin_wasm_relaxed_madd_f16x8(a, b, c);
+  // WEBASSEMBLY: call <8 x half> @llvm.wasm.relaxed.madd.v8f16(
+  // WEBASSEMBLY-SAME: <8 x half> %a, <8 x half> %b, <8 x half> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
+
+f16x8 nmadd_f16x8(f16x8 a, f16x8 b, f16x8 c) {
+  return __builtin_wasm_relaxed_nmadd_f16x8(a, b, c);
+  // WEBASSEMBLY: call <8 x half> @llvm.wasm.relaxed.nmadd.v8f16(
+  // WEBASSEMBLY-SAME: <8 x half> %a, <8 x half> %b, <8 x half> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
+
 i8x16 laneselect_i8x16(i8x16 a, i8x16 b, i8x16 c) {
   return __builtin_wasm_relaxed_laneselect_i8x16(a, b, c);
   // WEBASSEMBLY: call <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td 
b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
index 3c97befcea1a4..3888175efd115 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1480,23 +1480,24 @@ defm "" : RelaxedConvert simdopA, bits<32> simdopS> {
+multiclass SIMDMADD simdopA, bits<32> simdopS, 
list reqs> {
   defm MADD_#vec :
-RELAXED_I<(outs V128:$dst), (ins V128:$a, V128:$b, V128:$c), (outs), (ins),
-  [(set (vec.vt V128:$dst), (int_wasm_relaxed_madd
-(vec.vt V128

[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)

2024-06-11 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

Note: I've [opened an 
issue](https://github.com/WebAssembly/half-precision/issues/5) about the 
`relaxed_` prefix and whether it should be included in the instruction name.

https://github.com/llvm/llvm-project/pull/95151

[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)

2024-06-11 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/95151

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-28 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/93360

From c33801afebb6720bc4b51fb4064b59529c40d298 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Thu, 23 May 2024 23:38:51 +
Subject: [PATCH 1/2] [WebAssembly] Implement all f16x8 binary instructions.

This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LLVM instructions
min/max - use the minimum/maximum intrinsics, and also have builtins
pmin/pmax - use the wasm.pmin/pmax intrinsics, and also have builtins

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
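
The commit message above separates min/max from pmin/pmax. The scalar sketch below shows why they need different lowerings, assuming the standard definitions: `llvm.minimum` propagates NaN and treats -0.0 as less than +0.0, while WebAssembly's pseudo-minimum is defined as `b < a ? b : a`. The function names `minimum` and `pmin` are hypothetical helpers for this example and operate on `float` instead of half-precision lanes.

```cpp
#include <cmath>
#include <limits>

// NaN-propagating minimum, modelled on llvm.minimum semantics.
float minimum(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == 0.0f && b == 0.0f)
    return std::signbit(a) ? a : b; // -0.0 is treated as less than +0.0
  return a < b ? a : b;
}

// Pseudo-minimum: a single comparison, so the result depends on operand
// order when a NaN or a pair of signed zeros is involved.
float pmin(float a, float b) { return b < a ? b : a; }
```

For example, `minimum(1.0f, NaN)` is NaN, whereas `pmin(1.0f, NaN)` returns `1.0f` because the comparison `NaN < 1.0f` is false. The max and pmax pair differs in the same way.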
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  4 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  4 ++
 clang/test/CodeGen/builtins-wasm.c| 24 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  5 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 37 +++---
 .../CodeGen/WebAssembly/half-precision.ll | 68 +++
 llvm/test/MC/WebAssembly/simd-encodings.s | 24 +++
 7 files changed, 157 insertions(+), 9 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index fd8c1b480d6da..4e48ff48b60f5 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,6 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", 
"nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0549afa12e430..f8be7182b5267 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -20779,6 +20779,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_min_f32:
   case WebAssembly::BI__builtin_wasm_min_f64:
+  case WebAssembly::BI__builtin_wasm_min_f16x8:
   case WebAssembly::BI__builtin_wasm_min_f32x4:
   case WebAssembly::BI__builtin_wasm_min_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20789,6 +20790,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_max_f32:
   case WebAssembly::BI__builtin_wasm_max_f64:
+  case WebAssembly::BI__builtin_wasm_max_f16x8:
   case WebAssembly::BI__builtin_wasm_max_f32x4:
   case WebAssembly::BI__builtin_wasm_max_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20797,6 +20799,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::maximum, ConvertType(E->getType()));
 return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmin_f16x8:
   case WebAssembly::BI__builtin_wasm_pmin_f32x4:
   case WebAssembly::BI__builtin_wasm_pmin_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20805,6 +20808,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::wasm_pmin, ConvertType(E->getType()));
 return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmax_f16x8:
   case WebAssembly::BI__builtin_wasm_pmax_f32x4:
   case WebAssembly::BI__builtin_wasm_pmax_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index 93a6ab06081c9..d6ee4f68700dc 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -825,6 +825,30 @@ float extract_lane_f16x8(f16x8 a, int i) {
   // WEBASSEMBLY-NEXT: ret float %0
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
+
+f16x8 min_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> 
%a, <8 x half> %b)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_min_f16x8(a, b);
+}
+
+f16x8 max_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.maximum.v8f16(<8 x half> 
%a, <8 x half> %b)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_max_f16x8(a, b);
+}
+
+f16x8 pmin_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY:  %0 = tail call

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-28 Thread Brendan Dahl via cfe-commits



@@ -152,6 +153,18 @@ def F64x2 : Vec {
   let prefix = "f64x2";
 }
 
+def F16x8 : Vec {
+ let vt = v8f16;
+ let int_vt = v8i16;
+ let lane_vt = f32;
+ let lane_rc = F32;
+ let lane_bits = 16;
+ let lane_idx = LaneIdx8;
+ let lane_load = int_wasm_loadf16_f32;
+ let splat = PatFrag<(ops node:$x), (v8f16 (splat_vector (f16 $x)))>;
+ let prefix = "f16x8";
+}
+
 defvar AllVecs = [I8x16, I16x8, I32x4, I64x2, F32x4, F64x2];

brendandahl wrote:

I hope to include F16x8 here when we better support it and the regular patterns 
work for it. I've added a comment for now, but can change the name if wanted.

https://github.com/llvm/llvm-project/pull/93360

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-28 Thread Brendan Dahl via cfe-commits



@@ -1199,6 +1213,7 @@ def : Pat<(v2f64 (froundeven (v2f64 V128:$src))), 
(NEAREST_F64x2 V128:$src)>;
 multiclass SIMDBinaryFP 
baseInst> {
   defm "" : SIMDBinary;
   defm "" : SIMDBinary;
+  defm "" : SIMDBinary;

brendandahl wrote:

I ended up adding `HalfPrecisionBinary`. I was hoping there was some way to 
pass a multiclass id as a parameter, so that I could pass in `SIMD_I` or 
`HALF_PRECISION_I` as an argument, but I couldn't figure out a way to make that 
work.

https://github.com/llvm/llvm-project/pull/93360

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-28 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/93360

[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)

2024-07-24 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/99388

[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)

2024-07-17 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/99388

Use a builtin and intrinsic until half types are better supported for 
instruction selection.

From a6d65f276fba7487fdecf2e31edef457f74fbafe Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 17 Jul 2024 20:10:20 +
Subject: [PATCH] [WebAssembly] Implement f16x8.replace_lane instruction.

Use a builtin and intrinsic until half types are better supported for
instruction selection.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def   |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp |  7 +++
 clang/test/CodeGen/builtins-wasm.c  |  6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td   |  4 
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 13 +
 llvm/test/CodeGen/WebAssembly/half-precision.ll |  8 
 llvm/test/MC/WebAssembly/simd-encodings.s   |  3 +++
 7 files changed, 42 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2a45f8a6582a2..df304a71e475e 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -201,6 +201,7 @@ TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", 
"half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", 
"half-precision")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", 
"half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 67027f8aa93f3..402b7a7b20e61 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21386,6 +21386,13 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8);
 return Builder.CreateCall(Callee, {Vector, Index});
   }
+  case WebAssembly::BI__builtin_wasm_replace_lane_f16x8: {
+Value *Vector = EmitScalarExpr(E->getArg(0));
+Value *Index = EmitScalarExpr(E->getArg(1));
+Value *Val = EmitScalarExpr(E->getArg(2));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_replace_lane_f16x8);
+return Builder.CreateCall(Callee, {Vector, Index, Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index 75861b1b4bd6d..f494aeada0157 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -840,6 +840,12 @@ float extract_lane_f16x8(f16x8 a, int i) {
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
 
+f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 
x half> %a, i32 %i, float %v)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+}
+
 f16x8 min_f16x8(f16x8 a, f16x8 b) {
   // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> 
%a, <8 x half> %b)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td 
b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 47aab196a6d4f..4d2df1c44ebce 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -363,6 +363,10 @@ def int_wasm_extract_lane_f16x8:
   DefaultAttrsIntrinsic<[llvm_float_ty],
 [llvm_v8f16_ty, llvm_i32_ty],
 [IntrNoMem, IntrSpeculatable]>;
+def int_wasm_replace_lane_f16x8:
+  DefaultAttrsIntrinsic<[llvm_v8f16_ty],
+[llvm_v8f16_ty, llvm_i32_ty, llvm_float_ty],
+[IntrNoMem, IntrSpeculatable]>;
 
 
 
//===--===//
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td 
b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
index 2ee430c88169d..f11fe12c6ecb8 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -702,6 +702,19 @@ defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 
+// For now use an intrinsic for f16x8.replace_lane instead of ReplaceLane above
+// since LLVM IR generated with half type arguments is not well supported and creates
+// conversions from f16->f32.
+defm REPLACE_LANE_F16x8 :
+  HALF_PRECISION_I<(outs V128:$dst), (ins V128:$vec, vec_i8imm_op:$idx, F32:$x),
+   (outs), (ins vec_i8imm_

[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)

2024-07-22 Thread Brendan Dahl via cfe-commits



@@ -702,6 +702,19 @@ defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 
+// For now use an intrinsic for f16x8.replace_lane instead of ReplaceLane above
+// since LLVM IR generated with half type arguments is not well supported and creates

brendandahl wrote:

Yeah, I'll update.

https://github.com/llvm/llvm-project/pull/99388

[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)

2024-07-22 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/99388

From 8320b1f7f45f42363547cefb748627cfe1bb7af6 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 17 Jul 2024 20:10:20 +
Subject: [PATCH] [WebAssembly] Implement f16x8.replace_lane instruction.

Use a builtin and intrinsic until half types are better supported for
instruction selection.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def   |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp |  7 +++
 clang/test/CodeGen/builtins-wasm.c  |  6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td   |  4 
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 13 +
 llvm/test/CodeGen/WebAssembly/half-precision.ll |  8 
 llvm/test/MC/WebAssembly/simd-encodings.s   |  3 +++
 7 files changed, 42 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2a45f8a6582a2..df304a71e475e 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -201,6 +201,7 @@ TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 67027f8aa93f3..402b7a7b20e61 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21386,6 +21386,13 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8);
 return Builder.CreateCall(Callee, {Vector, Index});
   }
+  case WebAssembly::BI__builtin_wasm_replace_lane_f16x8: {
+Value *Vector = EmitScalarExpr(E->getArg(0));
+Value *Index = EmitScalarExpr(E->getArg(1));
+Value *Val = EmitScalarExpr(E->getArg(2));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_replace_lane_f16x8);
+return Builder.CreateCall(Callee, {Vector, Index, Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 75861b1b4bd6d..f494aeada0157 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -840,6 +840,12 @@ float extract_lane_f16x8(f16x8 a, int i) {
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
 
+f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+}
+
 f16x8 min_f16x8(f16x8 a, f16x8 b) {
   // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> %a, <8 x half> %b)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 47aab196a6d4f..4d2df1c44ebce 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -363,6 +363,10 @@ def int_wasm_extract_lane_f16x8:
   DefaultAttrsIntrinsic<[llvm_float_ty],
 [llvm_v8f16_ty, llvm_i32_ty],
 [IntrNoMem, IntrSpeculatable]>;
+def int_wasm_replace_lane_f16x8:
+  DefaultAttrsIntrinsic<[llvm_v8f16_ty],
+[llvm_v8f16_ty, llvm_i32_ty, llvm_float_ty],
+[IntrNoMem, IntrSpeculatable]>;
 
 
 
//===--===//
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
index 2ee430c88169d..8eaf107b2cc40 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -702,6 +702,19 @@ defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 
+// For now use an intrinsic for f16x8.replace_lane instead of ReplaceLane above
+// since LLVM IR generated with half type arguments is not well supported and
+// creates conversions from f16->f32.
+defm REPLACE_LANE_F16x8 :
+  HALF_PRECISION_I<(outs V128:$dst), (ins V128:$vec, vec_i8imm_op:$idx, F32:$x),
+   (outs), (ins vec_i8imm_op:$idx),
+   [(set (v8f16 V128:$dst), (int_wasm_replace_lane_f16x8
+

[clang] [llvm] [WebAssembly] Add half-precision feature (PR #90248)

2024-04-26 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/90248

This currently only defines a constant, but in the future it will be used to
gate builtins for experimenting with and prototyping the half-precision
proposal (https://github.com/WebAssembly/half-precision).
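
A minimal sketch of the feature handling in the patch below, written as portable C++ so it can be run outside of Clang. `FeatureState` and `applyFeatures` are hypothetical names made up for this illustration and are not Clang APIs; the only behavior taken from the patch is that `+half-precision` also raises the SIMD level to at least SIMD128, while `-half-precision` only clears the flag.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the SIMD levels tracked by the target info.
enum SIMDEnum { NoSIMD, SIMD128, RelaxedSIMD };

// Hypothetical container for the two pieces of state this sketch models.
struct FeatureState {
  SIMDEnum SIMDLevel = NoSIMD;
  bool HasHalfPrecision = false;
};

// Applies "+name" / "-name" feature strings in order, so later entries win.
FeatureState applyFeatures(const std::vector<std::string> &Features) {
  FeatureState S;
  for (const std::string &Feature : Features) {
    if (Feature == "+simd128") {
      S.SIMDLevel = std::max(S.SIMDLevel, SIMD128);
    } else if (Feature == "+half-precision") {
      // Enabling half precision implies at least 128-bit SIMD.
      S.SIMDLevel = std::max(S.SIMDLevel, SIMD128);
      S.HasHalfPrecision = true;
    } else if (Feature == "-half-precision") {
      // Disabling half precision leaves the SIMD level untouched.
      S.HasHalfPrecision = false;
    }
  }
  return S;
}
```

Processing the strings in order means a later `-half-precision` overrides an earlier `+half-precision`, but the SIMD level that was raised along the way stays raised in this model.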

>From 098342189d16b653a189889de43fe5a3d38592c8 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Fri, 26 Apr 2024 18:30:48 +
Subject: [PATCH] [WebAssembly] Add half-precision feature

This currently only defines a constant, but in the future will be used
to gate builtins for experimenting and prototyping half-precision proposal
(https://github.com/WebAssembly/half-precision).
---
 clang/include/clang/Driver/Options.td   |  2 ++
 clang/lib/Basic/Targets/WebAssembly.cpp | 11 +++
 clang/lib/Basic/Targets/WebAssembly.h   |  1 +
 llvm/lib/Target/WebAssembly/WebAssembly.td  |  3 +++
 llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td |  4 
 llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h  |  2 ++
 6 files changed, 23 insertions(+)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 922bda721dc780..0a3c4494443cad 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4876,6 +4876,8 @@ def msimd128 : Flag<["-"], "msimd128">, Group;
 def mno_simd128 : Flag<["-"], "mno-simd128">, Group;
 def mrelaxed_simd : Flag<["-"], "mrelaxed-simd">, Group;
 def mno_relaxed_simd : Flag<["-"], "mno-relaxed-simd">, Group;
+def mhalf_precision : Flag<["-"], "mhalf-precision">, Group;
+def mno_half_precision : Flag<["-"], "mno-half-precision">, Group;
 def mnontrapping_fptoint : Flag<["-"], "mnontrapping-fptoint">, Group;
 def mno_nontrapping_fptoint : Flag<["-"], "mno-nontrapping-fptoint">, Group;
 def msign_ext : Flag<["-"], "msign-ext">, Group;
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index d473fd19086460..3d76411f890a86 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -47,6 +47,7 @@ bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const {
   return llvm::StringSwitch(Feature)
   .Case("simd128", SIMDLevel >= SIMD128)
   .Case("relaxed-simd", SIMDLevel >= RelaxedSIMD)
+  .Case("half-precision", HasHalfPrecision)
   .Case("nontrapping-fptoint", HasNontrappingFPToInt)
   .Case("sign-ext", HasSignExt)
   .Case("exception-handling", HasExceptionHandling)
@@ -156,6 +157,7 @@ bool WebAssemblyTargetInfo::initFeatureMap(
 Features["reference-types"] = true;
 Features["sign-ext"] = true;
 Features["tail-call"] = true;
+Features["half-precision"] = true;
 setSIMDLevel(Features, SIMD128, true);
   } else if (CPU == "generic") {
 Features["mutable-globals"] = true;
@@ -216,6 +218,15 @@ bool WebAssemblyTargetInfo::handleTargetFeatures(
   HasBulkMemory = false;
   continue;
 }
+if (Feature == "+half-precision") {
+  SIMDLevel = std::max(SIMDLevel, SIMD128);
+  HasHalfPrecision = true;
+  continue;
+}
+if (Feature == "-half-precision") {
+  HasHalfPrecision = false;
+  continue;
+}
 if (Feature == "+atomics") {
   HasAtomics = true;
   continue;
diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h
index 5568aa28eaefa7..e4c18879182ed7 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -64,6 +64,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
   bool HasReferenceTypes = false;
   bool HasExtendedConst = false;
   bool HasMultiMemory = false;
+  bool HasHalfPrecision = false;
 
   std::string ABI;
 
diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td
index d538197450b65b..f00974531209d2 100644
--- a/llvm/lib/Target/WebAssembly/WebAssembly.td
+++ b/llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -28,6 +28,9 @@ def FeatureSIMD128 : SubtargetFeature<"simd128", "SIMDLevel", "SIMD128",
 def FeatureRelaxedSIMD : SubtargetFeature<"relaxed-simd", "SIMDLevel", "RelaxedSIMD",
   "Enable relaxed-simd instructions">;
 
+def FeatureHalfPrecision : SubtargetFeature<"half-precision", "HasHalfPrecision", "true",
+"Enable half precision instructions">;
+
 def FeatureAtomics : SubtargetFeature<"atomics", "HasAtomics", "true",
   "Enable Atomics">;
 
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
index 59ea9247bd86f5..7b57f8ce90e066 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
@@ -30,6 +30,10 @@ def HasRelaxedSIMD :
 Predicate<"Subtarget->hasRelaxedSIMD()">,
 Assembl

[clang] [llvm] [WebAssembly] Add half-precision feature (PR #90248)

2024-04-26 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/90248

>From 85e5e1660ad1e6fda8ecf8984aab0cba96130b4f Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Fri, 26 Apr 2024 18:30:48 +
Subject: [PATCH] [WebAssembly] Add half-precision feature

This currently only defines a constant, but in the future will be used
to gate builtins for experimenting and prototyping half-precision proposal
(https://github.com/WebAssembly/half-precision).
---
 clang/include/clang/Driver/Options.td   |  2 ++
 clang/lib/Basic/Targets/WebAssembly.cpp | 11 +++
 clang/lib/Basic/Targets/WebAssembly.h   |  1 +
 clang/test/Driver/wasm-features.c   |  6 ++
 llvm/lib/Target/WebAssembly/WebAssembly.td  |  3 +++
 llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td |  4 
 llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h  |  2 ++
 7 files changed, 29 insertions(+)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 922bda721dc780..0a3c4494443cad 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4876,6 +4876,8 @@ def msimd128 : Flag<["-"], "msimd128">, Group;
 def mno_simd128 : Flag<["-"], "mno-simd128">, Group;
 def mrelaxed_simd : Flag<["-"], "mrelaxed-simd">, Group;
 def mno_relaxed_simd : Flag<["-"], "mno-relaxed-simd">, Group;
+def mhalf_precision : Flag<["-"], "mhalf-precision">, Group;
+def mno_half_precision : Flag<["-"], "mno-half-precision">, Group;
 def mnontrapping_fptoint : Flag<["-"], "mnontrapping-fptoint">, Group;
 def mno_nontrapping_fptoint : Flag<["-"], "mno-nontrapping-fptoint">, Group;
 def msign_ext : Flag<["-"], "msign-ext">, Group;
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index d473fd19086460..3d76411f890a86 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -47,6 +47,7 @@ bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const {
   return llvm::StringSwitch(Feature)
   .Case("simd128", SIMDLevel >= SIMD128)
   .Case("relaxed-simd", SIMDLevel >= RelaxedSIMD)
+  .Case("half-precision", HasHalfPrecision)
   .Case("nontrapping-fptoint", HasNontrappingFPToInt)
   .Case("sign-ext", HasSignExt)
   .Case("exception-handling", HasExceptionHandling)
@@ -156,6 +157,7 @@ bool WebAssemblyTargetInfo::initFeatureMap(
 Features["reference-types"] = true;
 Features["sign-ext"] = true;
 Features["tail-call"] = true;
+Features["half-precision"] = true;
 setSIMDLevel(Features, SIMD128, true);
   } else if (CPU == "generic") {
 Features["mutable-globals"] = true;
@@ -216,6 +218,15 @@ bool WebAssemblyTargetInfo::handleTargetFeatures(
   HasBulkMemory = false;
   continue;
 }
+if (Feature == "+half-precision") {
+  SIMDLevel = std::max(SIMDLevel, SIMD128);
+  HasHalfPrecision = true;
+  continue;
+}
+if (Feature == "-half-precision") {
+  HasHalfPrecision = false;
+  continue;
+}
 if (Feature == "+atomics") {
   HasAtomics = true;
   continue;
diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h
index 5568aa28eaefa7..e4c18879182ed7 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -64,6 +64,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
   bool HasReferenceTypes = false;
   bool HasExtendedConst = false;
   bool HasMultiMemory = false;
+  bool HasHalfPrecision = false;
 
   std::string ABI;
 
diff --git a/clang/test/Driver/wasm-features.c b/clang/test/Driver/wasm-features.c
index 5dae5dbc89b905..1f7fb213498265 100644
--- a/clang/test/Driver/wasm-features.c
+++ b/clang/test/Driver/wasm-features.c
@@ -77,6 +77,12 @@
 // RELAXED-SIMD: "-target-feature" "+relaxed-simd"
 // NO-RELAXED-SIMD: "-target-feature" "-relaxed-simd"
 
+// RUN: %clang --target=wasm32-unknown-unknown -### %s -mhalf-precision 2>&1 | FileCheck %s -check-prefix=HALF-PRECISION
+// RUN: %clang --target=wasm32-unknown-unknown -### %s -mno-half-precision 2>&1 | FileCheck %s -check-prefix=NO-HALF-PRECISION
+
+// HALF-PRECISION: "-target-feature" "+half-precision"
+// NO-HALF-PRECISION: "-target-feature" "-half-precision"
+
 // RUN: %clang --target=wasm32-unknown-unknown -### %s -mexception-handling 2>&1 | FileCheck %s -check-prefix=EXCEPTION-HANDLING
 // RUN: %clang --target=wasm32-unknown-unknown -### %s -mno-exception-handling 2>&1 | FileCheck %s -check-prefix=NO-EXCEPTION-HANDLING
 
diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td
index d538197450b65b..f00974531209d2 100644
--- a/llvm/lib/Target/WebAssembly/WebAssembly.td
+++ b/llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -28,6 +28,9 @@ def FeatureSIMD128 : SubtargetFeature<"simd128", "SIMDLevel", "SIMD128",
 def Feature

[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)

2024-08-28 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/106465

Getting this to work required a few additional changes:
 - Add builtins for any instructions that currently can't be expressed in plain C.
 - Add support for the saturating version of fp_to__I16x8. Other vector
   sizes supported this already.
 - Support bitcast of f16x8 to v128. This is needed to return a __f16x8 as v128_t.
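
For readers unfamiliar with the saturating conversion mentioned in the list above, here is a rough reference model in portable C++. It assumes the usual WebAssembly `trunc_sat` behavior (NaN becomes 0, out-of-range values clamp, everything else truncates toward zero). The function names are hypothetical, and lanes are held as `float` for simplicity since every f16 value is exactly representable as a float; this is not the code the patch generates.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Saturating float -> signed 16-bit conversion for a single lane.
std::int16_t truncSatS16(float v) {
  if (std::isnan(v)) return 0;  // NaN maps to zero
  if (v <= -32768.0f) return std::numeric_limits<std::int16_t>::min();
  if (v >= 32767.0f) return std::numeric_limits<std::int16_t>::max();
  return static_cast<std::int16_t>(std::trunc(v));  // round toward zero
}

// Saturating float -> unsigned 16-bit conversion for a single lane.
std::uint16_t truncSatU16(float v) {
  if (std::isnan(v) || v <= 0.0f) return 0;  // NaN and negatives map to zero
  if (v >= 65535.0f) return std::numeric_limits<std::uint16_t>::max();
  return static_cast<std::uint16_t>(std::trunc(v));
}

// Applies the signed conversion to all eight lanes of a vector.
std::array<std::int16_t, 8> truncSatS16x8(const std::array<float, 8> &lanes) {
  std::array<std::int16_t, 8> out{};
  for (std::size_t i = 0; i < lanes.size(); ++i)
    out[i] = truncSatS16(lanes[i]);
  return out;
}
```

The range checks come before the cast because converting an out-of-range float to an integer type is undefined behavior in C++.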

>From 4df403d1d3e32a591b6994acea8f7daa9df78c7b Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 28 Aug 2024 22:56:09 +
Subject: [PATCH] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16
 instructions.

Getting this to work required a few additional changes:
 - Add builtins for any instructions that can't be done with plain C
   currently.
 - Add support for the saturating version of fp_to__I16x8. Other
   vector sizes supported this already.
 - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t.
---
 .../clang/Basic/BuiltinsWebAssembly.def   |   9 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  12 ++
 clang/lib/Headers/wasm_simd128.h  | 147 ++
 .../intrinsic-header-tests/wasm_simd128.c | 138 +++-
 .../WebAssembly/WebAssemblyISelLowering.cpp   |   9 +-
 .../WebAssembly/WebAssemblyInstrSIMD.td   |  28 ++--
 .../CodeGen/WebAssembly/half-precision.ll |  18 +++
 7 files changed, 348 insertions(+), 13 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 034d32c6291b3d..2e80eef2c8b9bc 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -124,6 +124,7 @@ TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128")
 
+TARGET_BUILTIN(__builtin_wasm_abs_f16x8, "V8hV8h", "nc", "fp16")
 TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_abs_f64x2, "V2dV2d", "nc", "simd128")
 
@@ -140,6 +141,10 @@ TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
 TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
 TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
+TARGET_BUILTIN(__builtin_wasm_ceil_f16x8, "V8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_floor_f16x8, "V8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_trunc_f16x8, "V8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_nearest_f16x8, "V8hV8h", "nc", "fp16")
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_trunc_f32x4, "V4fV4f", "nc", "simd128")
@@ -151,9 +156,13 @@ TARGET_BUILTIN(__builtin_wasm_nearest_f64x2, "V2dV2d", "nc", "simd128")
 
 TARGET_BUILTIN(__builtin_wasm_dot_s_i32x4_i16x8, "V4iV8sV8s", "nc", "simd128")
 
+TARGET_BUILTIN(__builtin_wasm_sqrt_f16x8, "V8hV8h", "nc", "fp16")
 TARGET_BUILTIN(__builtin_wasm_sqrt_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_sqrt_f64x2, "V2dV2d", "nc", "simd128")
 
+TARGET_BUILTIN(__builtin_wasm_trunc_saturate_s_i16x8_f16x8, "V8sV8h", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_trunc_saturate_u_i16x8_f16x8, "V8sV8h", "nc", "simd128")
+
 TARGET_BUILTIN(__builtin_wasm_trunc_saturate_s_i32x4_f32x4, "V4iV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_trunc_saturate_u_i32x4_f32x4, "V4iV4f", "nc", "simd128")
 
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2a733e4d834cfa..bb5367c29b1c3a 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21208,6 +21208,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i32_f64:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i64_f32:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i64_f64:
+  case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i16x8_f16x8:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i32x4_f32x4: {
 Value *Src = EmitScalarExpr(E->getArg(0));
 llvm::Type *ResT = ConvertType(E->getType());
@@ -21219,6 +21220,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i32_f64:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i64_f32:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i64_f64:
+  case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i16x8_f16x8:
   case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i32x4_f32x4: {
 Value *Src = EmitScalarExpr(E->getArg(0));
 llvm::Type *ResT = ConvertType(E->getType());
@@ -21266,6 +21268,10 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::was

[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)

2024-08-28 Thread Brendan Dahl via cfe-commits



@@ -165,8 +165,9 @@ def F16x8 : Vec {
  let prefix = "f16x8";
 }
 
-// TODO: Include F16x8 here when half precision is better supported.
-defvar AllVecs = [I8x16, I16x8, I32x4, I64x2, F32x4, F64x2];
+// TODO: Remove StdVecs when the F16x8 works every where StdVecs is used.

brendandahl wrote:

It's not obvious from this patch, but `AllVecs` is now used in only one place,
for bitcast (which means bitcast now works for f16x8 vectors too).

Alternatively, I can leave `AllVecs` alone and just concatenate F16x8 at the
point where bitcast is supported.

https://github.com/llvm/llvm-project/pull/106465

[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)

2024-08-29 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

> Would it make sense to put these declarations behind `#ifdef __wasm_fp16__` 
> so that they aren't declared if fp16 support isn't enabled?

I could do that, if that's preferred. I followed what the relaxed instructions
did and used the target attribute `__target__("fp16")`.

https://github.com/llvm/llvm-project/pull/106465

[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)

2024-08-30 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/106465

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-02 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/90906

Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and converts it to an f32. It is
specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
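
To make the load semantics concrete, here is a small host-side reference model in portable C++. It is only a sketch: `halfBitsToFloat` and `loadF16AsF32` are hypothetical helper names, linear memory is modeled as a byte vector, and the model assumes the f16 value is an IEEE 754 binary16 stored little-endian, as WebAssembly memory accesses are. It does not use the builtin added by this PR.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Widens an IEEE 754 binary16 bit pattern to a float. Every f16 value is
// exactly representable as an f32, so no rounding is involved.
float halfBitsToFloat(std::uint16_t h) {
  const std::uint32_t sign = (h >> 15) & 1u;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  const std::uint32_t frac = h & 0x3FFu;
  float magnitude;
  if (exp == 0) {
    // Zero or subnormal: value = frac * 2^-24.
    magnitude = std::ldexp(static_cast<float>(frac), -24);
  } else if (exp == 0x1F) {
    // All-ones exponent: infinity when frac is zero, NaN otherwise.
    magnitude = frac ? std::numeric_limits<float>::quiet_NaN()
                     : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + frac) * 2^(exp - 25) == (1 + frac/1024) * 2^(exp - 15).
    magnitude = std::ldexp(static_cast<float>(frac | 0x400u),
                           static_cast<int>(exp) - 25);
  }
  return sign ? -magnitude : magnitude;
}

// Models the load: reads two bytes at `addr` from a byte-vector "memory"
// and widens them. An out-of-bounds access throws, standing in for a trap.
float loadF16AsF32(const std::vector<std::uint8_t> &memory, std::size_t addr) {
  if (memory.size() < 2 || addr > memory.size() - 2)
    throw std::out_of_range("f16 load out of bounds");
  const std::uint16_t bits =
      static_cast<std::uint16_t>(memory[addr] | (memory[addr + 1] << 8));
  return halfBitsToFloat(bits);
}
```

For example, the bytes `0x00 0x3C` form the bit pattern `0x3C00`, which is 1.0 in binary16, so loading them at address 0 yields `1.0f` in this model.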

>From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 1 May 2024 21:53:39 +
Subject: [PATCH] [WebAssembly] Implement prototype f32.load_f16 instruction.

Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def|  3 +++
 clang/lib/CodeGen/CGBuiltin.cpp  |  5 +
 clang/test/CodeGen/builtins-wasm.c   |  9 +++--
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 
 .../MCTargetDesc/WebAssemblyMCTargetDesc.h   |  1 +
 .../Target/WebAssembly/WebAssemblyISelLowering.cpp   |  8 
 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td |  5 +
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td  |  7 +++
 llvm/test/CodeGen/WebAssembly/half-precision.ll  | 12 
 llvm/test/MC/WebAssembly/simd-encodings.s|  5 -
 10 files changed, 64 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 7e950914ad946d..8b0a1d4579d84c 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc",
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
+// Half-Precision (fp16)
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", "half-precision")
+
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
 // in which case the argument spec (second argument) is unused.
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a370734e00d3e1..e9d465bd2a6b01 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32);
 return Builder.CreateCall(Callee, {LHS, RHS, Acc});
   }
+  case WebAssembly::BI__builtin_wasm_loadf16_f32: {
+Value *Addr = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32);
+return Builder.CreateCall(Callee, {Addr});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 9a323da9a8e846..a845d5429039d4 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32
-// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64
+// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY3

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-02 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/90906

>From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 1 May 2024 21:53:39 +
Subject: [PATCH 1/2] [WebAssembly] Implement prototype f32.load_f16
 instruction.

Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def|  3 +++
 clang/lib/CodeGen/CGBuiltin.cpp  |  5 +
 clang/test/CodeGen/builtins-wasm.c   |  9 +++--
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 
 .../MCTargetDesc/WebAssemblyMCTargetDesc.h   |  1 +
 .../Target/WebAssembly/WebAssemblyISelLowering.cpp   |  8 
 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td |  5 +
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td  |  7 +++
 llvm/test/CodeGen/WebAssembly/half-precision.ll  | 12 
 llvm/test/MC/WebAssembly/simd-encodings.s|  5 -
 10 files changed, 64 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 7e950914ad946d..8b0a1d4579d84c 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc",
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
+// Half-Precision (fp16)
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", "half-precision")
+
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
 // in which case the argument spec (second argument) is unused.
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a370734e00d3e1..e9d465bd2a6b01 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32);
 return Builder.CreateCall(Callee, {LHS, RHS, Acc});
   }
+  case WebAssembly::BI__builtin_wasm_loadf16_f32: {
+Value *Addr = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32);
+return Builder.CreateCall(Callee, {Addr});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 9a323da9a8e846..a845d5429039d4 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32
-// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64
+// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32
+// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBAS

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-02 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

/cc @tlively @dschuff
(I guess I can't assign reviewers since I don't have commit access.) 

https://github.com/llvm/llvm-project/pull/90906

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-03 Thread Brendan Dahl via cfe-commits



@@ -321,6 +321,18 @@ def int_wasm_relaxed_dot_bf16x8_add_f32:
 [llvm_v8i16_ty, llvm_v8i16_ty, llvm_v4f32_ty],
 [IntrNoMem, IntrSpeculatable]>;
 
+//===--===//
+// Half-precision intrinsics (experimental)
+//===--===//
+
+// TODO: Replace these intrinsic with normal ISel patterns once the XXX
+// instructions are merged to the proposal.
+def int_wasm_loadf16_f32:
+  Intrinsic<[llvm_float_ty],
+[llvm_ptr_ty],
+[IntrReadMem, IntrArgMemOnly],
+ "", [SDNPMemOperand]>;

brendandahl wrote:

It looks like we have empty/missing names for nearly all of the intrinsics 
except for `int_wasm_ref_is_null_extern` and `int_wasm_ref_is_null_func`. Do 
you know what the name is used for? Maybe it's for when the name can't be 
automatically translated to an llvm name?

https://github.com/llvm/llvm-project/pull/90906

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-03 Thread Brendan Dahl via cfe-commits



@@ -38,6 +38,13 @@ multiclass RELAXED_I;
 }
 
+multiclass HALF_PRECISION_I pattern_r, string asmstr_r = "",
+string asmstr_s = "", bits<32> simdop = -1> {
+  defm "" : ABSTRACT_SIMD_I;
+}
+

brendandahl wrote:

This is for my next PRs. I'll remove it.

https://github.com/llvm/llvm-project/pull/90906

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-03 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

> Overall this looks good, and I think it makes sense to model this as short* 
> for now. I think it will be interesting to see if that ends up causing 
> issues. Out of curiosity does this work if you try `_fp16`?

I was trying `_Float16`, which wasn't working since it requires the target to 
support it. `__fp16` does work though, so I'll change it.
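
As a reference for what the builtin is expected to compute, here is a portable sketch of the load semantics: read a 16-bit IEEE half from memory and widen it to an f32. It assumes `float` is IEEE 754 binary32 and reads the two bytes little-endian, the way wasm linear memory is laid out. The helper names `f16_bits_to_f32` and `load_f16_as_f32` are made up for illustration; this is not the lowering used in the PR.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen an IEEE 754 binary16 bit pattern to float. The conversion is exact:
   every f16 value is representable as an f32. */
static float f16_bits_to_f32(uint16_t h) {
  uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t man = h & 0x3FFu;
  uint32_t bits;
  if (exp == 0x1Fu) {            /* infinity or NaN */
    bits = sign | 0x7F800000u | (man << 13);
  } else if (exp != 0) {         /* normal: rebias exponent from 15 to 127 */
    bits = sign | ((exp + 112u) << 23) | (man << 13);
  } else if (man == 0) {         /* signed zero */
    bits = sign;
  } else {                       /* subnormal: normalize into an f32 normal */
    uint32_t e = 113u;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      --e;
    }
    bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
  }
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* Read two little-endian bytes at addr and widen them to float. */
static float load_f16_as_f32(const void *addr) {
  const unsigned char *p = addr;
  assert(p != NULL);
  return f16_bits_to_f32((uint16_t)(p[0] | (p[1] << 8)));
}
```

Going through raw bytes rather than a `__fp16 *` keeps the sketch independent of whether the host compiler supports a half type at all.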

https://github.com/llvm/llvm-project/pull/90906

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-06 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/90906

>From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 1 May 2024 21:53:39 +
Subject: [PATCH 1/3] [WebAssembly] Implement prototype f32.load_f16
 instruction.

Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect
and will be changed to 0xFC30 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def|  3 +++
 clang/lib/CodeGen/CGBuiltin.cpp  |  5 +
 clang/test/CodeGen/builtins-wasm.c   |  9 +++--
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 
 .../MCTargetDesc/WebAssemblyMCTargetDesc.h   |  1 +
 .../Target/WebAssembly/WebAssemblyISelLowering.cpp   |  8 
 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td |  5 +
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td  |  7 +++
 llvm/test/CodeGen/WebAssembly/half-precision.ll  | 12 
 llvm/test/MC/WebAssembly/simd-encodings.s|  5 -
 10 files changed, 64 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 7e950914ad946d..8b0a1d4579d84c 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc",
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
+// Half-Precision (fp16)
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", "half-precision")
+
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
 // in which case the argument spec (second argument) is unused.
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a370734e00d3e1..e9d465bd2a6b01 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32);
 return Builder.CreateCall(Callee, {LHS, RHS, Acc});
   }
+  case WebAssembly::BI__builtin_wasm_loadf16_f32: {
+Value *Addr = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32);
+return Builder.CreateCall(Callee, {Addr});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 9a323da9a8e846..a845d5429039d4 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32
-// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64
+// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32
+// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBAS

[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)

2024-05-06 Thread Brendan Dahl via cfe-commits



@@ -666,3 +666,29 @@ define {i32,i32,i32,i32} @aggregate_return() {
 define {i64,i32,i16,i8} @aggregate_return_without_merge() {
   ret {i64,i32,i16,i8} zeroinitializer
 }
+
+;===

brendandahl wrote:

I didn't add many tests here, since it seems pretty well covered for other load 
patterns.

https://github.com/llvm/llvm-project/pull/90906

[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-08 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/91545

Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value to memory as an f16. Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is 
incorrect and will be changed to 0xFC31 soon.
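
As a reference for the expected behavior, here is a portable sketch of the store semantics: narrow an f32 to an IEEE half and write the two bytes to memory. It assumes `float` is IEEE 754 binary32, little-endian byte order as in wasm linear memory, and round-to-nearest with ties to even, which is what I would expect from the proposal but have not checked against it. The helper names `f32_to_f16_bits` and `store_f32_as_f16` are made up for illustration; this is not the lowering in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Narrow a float to an IEEE 754 binary16 bit pattern, rounding to nearest
   with ties to even (assumed rounding mode). */
static uint16_t f32_to_f16_bits(float f) {
  uint32_t x;
  memcpy(&x, &f, sizeof x);
  uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
  uint32_t exp = (x >> 23) & 0xFFu;
  uint32_t man = x & 0x7FFFFFu;

  if (exp == 0xFFu) /* infinity, or NaN (kept as a quiet NaN) */
    return (uint16_t)(sign | 0x7C00u | (man ? (0x200u | (man >> 13)) : 0u));

  int e = (int)exp - 127 + 15; /* rebias exponent from 127 to 15 */
  if (e >= 0x1F)               /* too large for f16: overflow to infinity */
    return (uint16_t)(sign | 0x7C00u);

  if (e <= 0) {                /* result is an f16 subnormal or zero */
    if (e < -10)               /* below half the smallest subnormal */
      return sign;
    man |= 0x800000u;          /* make the implicit leading 1 explicit */
    uint32_t shift = (uint32_t)(14 - e); /* 14..24 */
    uint32_t half = man >> shift;
    uint32_t rem = man & ((1u << shift) - 1u);
    uint32_t tie = 1u << (shift - 1u);
    if (rem > tie || (rem == tie && (half & 1u)))
      ++half;                  /* may round up into the smallest normal */
    return (uint16_t)(sign | half);
  }

  uint32_t half = ((uint32_t)e << 10) | (man >> 13);
  uint32_t rem = man & 0x1FFFu;
  if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
    ++half;                    /* a carry may bump the exponent, up to infinity */
  return (uint16_t)(sign | half);
}

/* Narrow val and write it as two little-endian bytes at addr. */
static void store_f32_as_f16(float val, void *addr) {
  unsigned char *p = addr;
  assert(p != NULL);
  uint16_t h = f32_to_f16_bits(val);
  p[0] = (unsigned char)(h & 0xFFu);
  p[1] = (unsigned char)(h >> 8);
}
```

Unlike the load direction, this conversion is lossy, so the rounding mode is the part most worth checking against the spec.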

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH] [WebAssembly] Implement prototype f32.store_f16 instruction.

Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value to memory as an f16.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect
and will be changed to 0xFC31 soon.
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp   |  6 +
 clang/test/CodeGen/builtins-wasm.c|  6 +
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td |  5 
 .../MCTargetDesc/WebAssemblyMCTargetDesc.h|  1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  8 ++
 .../WebAssembly/WebAssemblyInstrMemory.td |  4 +++
 .../CodeGen/WebAssembly/half-precision.ll |  9 +++
 llvm/test/CodeGen/WebAssembly/offset.ll   | 27 +++
 llvm/test/MC/WebAssembly/simd-encodings.s |  3 +++
 10 files changed, 70 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index cf54f8f4422f8..41fadd10e9432 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index e8a6bd050e17e..abb644d8eb506 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32);
 return Builder.CreateCall(Callee, {Addr});
   }
+  case WebAssembly::BI__builtin_wasm_storef16_f32: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Value *Addr = EmitScalarExpr(E->getArg(1));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
+return Builder.CreateCall(Callee, {Val, Addr});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index ab1c6cd494ae5..bcb15969de1c5 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) {
   // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}})
 }
 
+void store_f16_f32(float val, __fp16 *addr) {
+  return __builtin_wasm_storef16_f32(val, addr);
+  // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}})
+  // WEBASSEMBLY-NEXT: ret
+}
+
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index f8142a8ca9e93..572d334ac9552 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -332,6 +332,11 @@ def int_wasm_loadf16_f32:
 [llvm_ptr_ty],
 [IntrReadMem, IntrArgMemOnly],
  "", [SDNPMemOperand]>;
+def int_wasm_storef16_f32:
+  Intrinsic<[],
+[llvm_float_ty, llvm_ptr_ty],
+[IntrWriteMem, IntrArgMemOnly],
+ "", [SDNPMemOperand]>;
 
 
 
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
index d3b496ae59179..d4e9fb057c44d 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
@@ -207,6 +207,7 @@ inline unsigned GetDefaultP2

[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-09 Thread Brendan Dahl via cfe-commits



@@ -192,6 +192,7 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision")

brendandahl wrote:

What does `pure` mean in this context? The docs in clang/Basic/Builtins.def 
don't have any info on this.
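
For context, as far as I can tell the letters in the builtin attribute string map as `n` = nothrow, `c` = const and `U` = pure, and pure/const correspond to the GCC/Clang function attributes of the same names. Here is a small sketch of what `pure` promises, using the `__attribute__((pure))` extension; the functions `sum` and `fill` are hypothetical examples, not anything from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* pure: the result depends only on the arguments and on memory that is read;
   there are no side effects, so the compiler may merge repeated calls when
   no write happens in between. */
static __attribute__((pure)) int sum(const int *v, size_t n) {
  assert(v != NULL);
  int total = 0;
  for (size_t i = 0; i < n; ++i)
    total += v[i];
  return total;
}

/* Not pure: it writes through its pointer argument, so a call can never be
   dropped or merged just because its (void) result is unused. */
static void fill(int *v, size_t n, int x) {
  assert(v != NULL);
  for (size_t i = 0; i < n; ++i)
    v[i] = x;
}
```

A load fits the `sum` pattern, since it only reads memory. A store writes memory like `fill`, so by this definition it would not qualify as pure.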

https://github.com/llvm/llvm-project/pull/91545

[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-09 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/91545

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH 1/2] [WebAssembly] Implement prototype f32.store_f16
 instruction.

Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value to memory as an f16.
Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect
and will be changed to 0xFC31 soon.
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp   |  6 +
 clang/test/CodeGen/builtins-wasm.c|  6 +
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td |  5 
 .../MCTargetDesc/WebAssemblyMCTargetDesc.h|  1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  8 ++
 .../WebAssembly/WebAssemblyInstrMemory.td |  4 +++
 .../CodeGen/WebAssembly/half-precision.ll |  9 +++
 llvm/test/CodeGen/WebAssembly/offset.ll   | 27 +++
 llvm/test/MC/WebAssembly/simd-encodings.s |  3 +++
 10 files changed, 70 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index cf54f8f4422f..41fadd10e943 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index e8a6bd050e17..abb644d8eb50 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32);
 return Builder.CreateCall(Callee, {Addr});
   }
+  case WebAssembly::BI__builtin_wasm_storef16_f32: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Value *Addr = EmitScalarExpr(E->getArg(1));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
+return Builder.CreateCall(Callee, {Val, Addr});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index ab1c6cd494ae..bcb15969de1c 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) {
   // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}})
 }
 
+void store_f16_f32(float val, __fp16 *addr) {
+  return __builtin_wasm_storef16_f32(val, addr);
+  // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}})
+  // WEBASSEMBLY-NEXT: ret
+}
+
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index f8142a8ca9e9..572d334ac955 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -332,6 +332,11 @@ def int_wasm_loadf16_f32:
 [llvm_ptr_ty],
 [IntrReadMem, IntrArgMemOnly],
  "", [SDNPMemOperand]>;
+def int_wasm_storef16_f32:
+  Intrinsic<[],
+[llvm_float_ty, llvm_ptr_ty],
+[IntrWriteMem, IntrArgMemOnly],
+ "", [SDNPMemOperand]>;
 
 
 
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
index d3b496ae5917..d4e9fb057c44 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
@@ -207,6 +207,7 @@ inline unsigned GetDefaultP2AlignAny(unsigned Opc) {
   WASM_LOAD_STORE(LOAD_LANE_I16x8)
   WASM_LOAD_STORE(STORE_LANE_I16x8)
   WASM_LOAD_STORE(LOAD_F16_F32)
+  WASM_LOAD_STORE(STORE_F16_F32)
   return 1;
   WASM_LOAD_STORE(LOAD_I32)
   WASM_LOAD_STORE(LOAD_F32)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index ed52fe53bc60..527bb

[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-09 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/91545

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH 1/3] [WebAssembly] Implement prototype f32.store_f16
 instruction.


[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-09 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/91545

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH 1/4] [WebAssembly] Implement prototype f32.store_f16
 instruction.


[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)

2024-05-09 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/91545

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH 1/5] [WebAssembly] Implement prototype f32.store_f16
 instruction.


[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/93228

Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect 
and will be changed to 0x120 soon.

>From 002e33294cae26796ca79a66dbd275f3e26807d2 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 21 May 2024 21:15:14 +
Subject: [PATCH] [WebAssembly] Implement prototype f16x8.splat instruction.

Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect 
and will be changed to 0x120 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def |  1 +
 clang/lib/Basic/Targets/WebAssembly.h |  3 +++
 clang/lib/CodeGen/CGBuiltin.cpp   |  5 +
 clang/test/CodeGen/builtins-wasm.c|  6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td |  4 
 .../Utils/WebAssemblyTypeUtilities.cpp|  1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  3 +++
 .../Target/WebAssembly/WebAssemblyInstrSIMD.td| 15 +++
 .../Target/WebAssembly/WebAssemblyRegisterInfo.td |  5 +++--
 llvm/test/CodeGen/WebAssembly/half-precision.ll   | 12 ++--
 llvm/test/MC/WebAssembly/simd-encodings.s |  3 +++
 11 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 8645cff1e8679..dbe79aa39190d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -193,6 +193,7 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/Basic/Targets/WebAssembly.h 
b/clang/lib/Basic/Targets/WebAssembly.h
index 4db97867df607..63f4e72a9c2de 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -90,6 +90,9 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public 
TargetInfo {
 
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override {
+return false;
+  }
 
 protected:
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index ba94bf89e4751..91083c1cfae96 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21230,6 +21230,11 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
 return Builder.CreateCall(Callee, {Val, Addr});
   }
+  case WebAssembly::BI__builtin_wasm_splat_f16x8: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
+return Builder.CreateCall(Callee, {Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index bcb15969de1c5..76c6305d422a2 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -11,6 +11,7 @@ typedef unsigned char u8x16 __attribute((vector_size(16)));
 typedef unsigned short u16x8 __attribute((vector_size(16)));
 typedef unsigned int u32x4 __attribute((vector_size(16)));
 typedef unsigned long long u64x2 __attribute((vector_size(16)));
+typedef __fp16 f16x8 __attribute((vector_size(16)));
 typedef float f32x4 __attribute((vector_size(16)));
 typedef double f64x2 __attribute((vector_size(16)));
 
@@ -813,6 +814,11 @@ void store_f16_f32(float val, __fp16 *addr) {
   // WEBASSEMBLY-NEXT: ret
 }
 
+f16x8 splat_f16x8(float a) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.splat.f16x8(float %a)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_splat_f16x8(a);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td 
b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
i

[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

cc @aheejin @dschuff 

As mentioned in the meeting, it looks like it will be a lot more work to get 
half values working with normal patterns, so for now I'll stick to just 
built-ins and intrinsics.

https://github.com/llvm/llvm-project/pull/93228

[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/93228

>From 28cc678038feefffceba8cbe24349e1885b24c75 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 21 May 2024 21:15:14 +
Subject: [PATCH] [WebAssembly] Implement prototype f16x8.splat instruction.

Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect 
and will be changed to 0x120 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def |  1 +
 clang/lib/Basic/Targets/WebAssembly.h |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp   |  5 +
 clang/test/CodeGen/builtins-wasm.c|  6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td |  4 
 .../Utils/WebAssemblyTypeUtilities.cpp|  1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  3 +++
 .../Target/WebAssembly/WebAssemblyInstrSIMD.td| 15 +++
 .../Target/WebAssembly/WebAssemblyRegisterInfo.td |  5 +++--
 llvm/test/CodeGen/WebAssembly/half-precision.ll   | 12 ++--
 llvm/test/MC/WebAssembly/simd-encodings.s |  3 +++
 11 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 8645cff1e8679..dbe79aa39190d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -193,6 +193,7 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/Basic/Targets/WebAssembly.h 
b/clang/lib/Basic/Targets/WebAssembly.h
index 4db97867df607..46416d516b42f 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -90,6 +90,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public 
TargetInfo {
 
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override { return false; }
 
 protected:
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index ba94bf89e4751..91083c1cfae96 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21230,6 +21230,11 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
 return Builder.CreateCall(Callee, {Val, Addr});
   }
+  case WebAssembly::BI__builtin_wasm_splat_f16x8: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
+return Builder.CreateCall(Callee, {Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index bcb15969de1c5..76c6305d422a2 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -11,6 +11,7 @@ typedef unsigned char u8x16 __attribute((vector_size(16)));
 typedef unsigned short u16x8 __attribute((vector_size(16)));
 typedef unsigned int u32x4 __attribute((vector_size(16)));
 typedef unsigned long long u64x2 __attribute((vector_size(16)));
+typedef __fp16 f16x8 __attribute((vector_size(16)));
 typedef float f32x4 __attribute((vector_size(16)));
 typedef double f64x2 __attribute((vector_size(16)));
 
@@ -813,6 +814,11 @@ void store_f16_f32(float val, __fp16 *addr) {
   // WEBASSEMBLY-NEXT: ret
 }
 
+f16x8 splat_f16x8(float a) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.splat.f16x8(float %a)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_splat_f16x8(a);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td 
b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 572d334ac9552..c950b33182689 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -337,6 +337,10 @@ def int_wasm_storef16_f32:
 [llvm_float_ty, llvm_ptr_ty],
 [IntrWriteMem, IntrArgMemOnly],
  "", [SDNPMemOperand]>;
+def int_wasm_

[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits



@@ -90,6 +90,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public 
TargetInfo {
 
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override { return false; }

brendandahl wrote:

Yeah, this is what causes clang to start outputting `half` types. I could 
conditionally enable this with `return !HasHalfPrecision;` instead. Though 
in a quick test with scalar `__fp16` in C, the `half` type seems to work 
correctly regardless of this setting.

https://github.com/llvm/llvm-project/pull/93228

[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/93228

>From 28cc678038feefffceba8cbe24349e1885b24c75 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 21 May 2024 21:15:14 +
Subject: [PATCH 1/2] [WebAssembly] Implement prototype f16x8.splat
 instruction.

Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect 
and will be changed to 0x120 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def |  1 +
 clang/lib/Basic/Targets/WebAssembly.h |  1 +
 clang/lib/CodeGen/CGBuiltin.cpp   |  5 +
 clang/test/CodeGen/builtins-wasm.c|  6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td |  4 
 .../Utils/WebAssemblyTypeUtilities.cpp|  1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  3 +++
 .../Target/WebAssembly/WebAssemblyInstrSIMD.td| 15 +++
 .../Target/WebAssembly/WebAssemblyRegisterInfo.td |  5 +++--
 llvm/test/CodeGen/WebAssembly/half-precision.ll   | 12 ++--
 llvm/test/MC/WebAssembly/simd-encodings.s |  3 +++
 11 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 8645cff1e8679..dbe79aa39190d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -193,6 +193,7 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/Basic/Targets/WebAssembly.h 
b/clang/lib/Basic/Targets/WebAssembly.h
index 4db97867df607..46416d516b42f 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -90,6 +90,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public 
TargetInfo {
 
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override { return false; }
 
 protected:
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index ba94bf89e4751..91083c1cfae96 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21230,6 +21230,11 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
 return Builder.CreateCall(Callee, {Val, Addr});
   }
+  case WebAssembly::BI__builtin_wasm_splat_f16x8: {
+Value *Val = EmitScalarExpr(E->getArg(0));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
+return Builder.CreateCall(Callee, {Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index bcb15969de1c5..76c6305d422a2 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -11,6 +11,7 @@ typedef unsigned char u8x16 __attribute((vector_size(16)));
 typedef unsigned short u16x8 __attribute((vector_size(16)));
 typedef unsigned int u32x4 __attribute((vector_size(16)));
 typedef unsigned long long u64x2 __attribute((vector_size(16)));
+typedef __fp16 f16x8 __attribute((vector_size(16)));
 typedef float f32x4 __attribute((vector_size(16)));
 typedef double f64x2 __attribute((vector_size(16)));
 
@@ -813,6 +814,11 @@ void store_f16_f32(float val, __fp16 *addr) {
   // WEBASSEMBLY-NEXT: ret
 }
 
+f16x8 splat_f16x8(float a) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.splat.f16x8(float %a)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_splat_f16x8(a);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td 
b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 572d334ac9552..c950b33182689 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -337,6 +337,10 @@ def int_wasm_storef16_f32:
 [llvm_float_ty, llvm_ptr_ty],
 [IntrWriteMem, IntrArgMemOnly],
  "", [SDNPMemOperand]>;
+def int_

[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)

2024-05-23 Thread Brendan Dahl via cfe-commits


brendandahl wrote:

> LGTM % `!HasHalfPrecision` thing
> 
> By the way I guess you can try getting commit access soon? I think it is 
> still "Send an email to Chris" though...

Done, can I get a squash and merge?

I'll look into getting commit access.

https://github.com/llvm/llvm-project/pull/93228

[clang] [llvm] [WebAssembly] Implement prototype f16x8.extract_lane instruction. (PR #93272)

2024-05-23 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/93272

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is 
incorrect and will be changed to 0x121 soon.

>From ee046630b80786b920b5e7d0742c27443d3ea2b0 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Thu, 23 May 2024 21:04:31 +
Subject: [PATCH] [WebAssembly] Implement prototype f16x8.extract_lane
 instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is 
incorrect and will be changed to 0x121 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def| 1 +
 clang/lib/CodeGen/CGBuiltin.cpp  | 6 ++
 clang/test/CodeGen/builtins-wasm.c   | 6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 4 
 .../WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h   | 2 ++
 llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp| 4 +++-
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td  | 9 +
 llvm/test/CodeGen/WebAssembly/half-precision.ll  | 8 
 llvm/test/MC/WebAssembly/simd-encodings.s| 3 +++
 9 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index dbe79aa39190d..fd8c1b480d6da 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -194,6 +194,7 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", 
"half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 91083c1cfae96..0549afa12e430 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21235,6 +21235,12 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
 return Builder.CreateCall(Callee, {Val});
   }
+  case WebAssembly::BI__builtin_wasm_extract_lane_f16x8: {
+Value *Vector = EmitScalarExpr(E->getArg(0));
+Value *Index = EmitScalarExpr(E->getArg(1));
+Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8);
+return Builder.CreateCall(Callee, {Vector, Index});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
 assert(E->getArg(0)->getType()->isArrayType());
 Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index 76c6305d422a2..93a6ab06081c9 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -819,6 +819,12 @@ f16x8 splat_f16x8(float a) {
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
   return __builtin_wasm_splat_f16x8(a);
 }
+
+float extract_lane_f16x8(f16x8 a, int i) {
+  // WEBASSEMBLY:  %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x 
half> %a, i32 %i)
+  // WEBASSEMBLY-NEXT: ret float %0
+  return __builtin_wasm_extract_lane_f16x8(a, i);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td 
b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index c950b33182689..237f268784bb0 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -341,6 +341,10 @@ def int_wasm_splat_f16x8:
   DefaultAttrsIntrinsic<[llvm_v8f16_ty],
 [llvm_float_ty],
 [IntrNoMem, IntrSpeculatable]>;
+def int_wasm_extract_lane_f16x8:
+  DefaultAttrsIntrinsic<[llvm_float_ty],
+[llvm_v8f16_ty, llvm_i32_ty],
+[IntrNoMem, IntrSpeculatable]>;
 
 
 
//===--===//
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h 
b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
index d4e9fb057c44d..34502170a5c71 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
@@ -345,6

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-24 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/93360

This reuses most of the code that was created for f32x4 and f64x2 binary 
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
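
To illustrate why min/max and pmin/pmax are separate operations, here is a rough per-lane sketch of the semantics as I understand them from the WebAssembly SIMD definitions. It uses `float` lanes rather than half so that it compiles on any host, and the `lane_*` names are hypothetical helpers, not anything added by this patch.

```c
#include <math.h>

/* min/max (same behavior as llvm.minimum/llvm.maximum): a NaN in either
   operand propagates, and -0.0 is ordered below +0.0. */
static float lane_min(float a, float b) {
  if (isnan(a)) return a;
  if (isnan(b)) return b;
  if (a == 0.0f && b == 0.0f)
    return signbit(a) ? a : b; /* prefer -0.0 when either zero is negative */
  return a < b ? a : b;
}

static float lane_max(float a, float b) {
  if (isnan(a)) return a;
  if (isnan(b)) return b;
  if (a == 0.0f && b == 0.0f)
    return signbit(a) ? b : a; /* prefer +0.0 when either zero is positive */
  return a < b ? b : a;
}

/* pmin/pmax ("pseudo" min/max): defined purely by one comparison, so the
   first operand is returned whenever the comparison is false, including
   when either operand is NaN or when both are zeros of different sign. */
static float lane_pmin(float a, float b) { return b < a ? b : a; }
static float lane_pmax(float a, float b) { return a < b ? b : a; }
```

The practical difference shows up with NaN and signed zeros: `lane_min(1.0f, NAN)` yields NaN while `lane_pmin(1.0f, NAN)` yields 1.0f, which is why both variants get their own builtins here.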

>From c33801afebb6720bc4b51fb4064b59529c40d298 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Thu, 23 May 2024 23:38:51 +
Subject: [PATCH] [WebAssembly] Implement all f16x8 binary instructions.

This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  4 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  4 ++
 clang/test/CodeGen/builtins-wasm.c| 24 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  5 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 37 +++---
 .../CodeGen/WebAssembly/half-precision.ll | 68 +++
 llvm/test/MC/WebAssembly/simd-encodings.s | 24 +++
 7 files changed, 157 insertions(+), 9 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index fd8c1b480d6da..4e48ff48b60f5 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,6 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", 
"nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0549afa12e430..f8be7182b5267 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -20779,6 +20779,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_min_f32:
   case WebAssembly::BI__builtin_wasm_min_f64:
+  case WebAssembly::BI__builtin_wasm_min_f16x8:
   case WebAssembly::BI__builtin_wasm_min_f32x4:
   case WebAssembly::BI__builtin_wasm_min_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20789,6 +20790,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_max_f32:
   case WebAssembly::BI__builtin_wasm_max_f64:
+  case WebAssembly::BI__builtin_wasm_max_f16x8:
   case WebAssembly::BI__builtin_wasm_max_f32x4:
   case WebAssembly::BI__builtin_wasm_max_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20797,6 +20799,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::maximum, ConvertType(E->getType()));
 return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmin_f16x8:
   case WebAssembly::BI__builtin_wasm_pmin_f32x4:
   case WebAssembly::BI__builtin_wasm_pmin_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20805,6 +20808,7 @@ Value 
*CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
 CGM.getIntrinsic(Intrinsic::wasm_pmin, ConvertType(E->getType()));
 return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmax_f16x8:
   case WebAssembly::BI__builtin_wasm_pmax_f32x4:
   case WebAssembly::BI__builtin_wasm_pmax_f64x2: {
 Value *LHS = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGen/builtins-wasm.c 
b/clang/test/CodeGen/builtins-wasm.c
index 93a6ab06081c9..d6ee4f68700dc 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -825,6 +825,30 @@ float extract_lane_f16x8(f16x8 a, int i) {
   // WEBASSEMBLY-NEXT: ret float %0
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
+
+f16x8 min_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY:  %

[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)

2024-05-24 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl edited 
https://github.com/llvm/llvm-project/pull/93360

[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)

2024-08-20 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl created 
https://github.com/llvm/llvm-project/pull/105434

This better aligns with how the feature is being referred to and what runtimes 
(V8) are calling it.

>From c4d120d4ec01f2af4e6ad748543ed195aa8f6721 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 20 Aug 2024 21:55:47 +
Subject: [PATCH] [WebAssembly] Change half-precision feature name to fp16.

This better aligns with how the feature is being referred to and
what runtimes (V8) are calling it.
---
 .../clang/Basic/BuiltinsWebAssembly.def   | 22 +--
 clang/include/clang/Driver/Options.td |  4 ++--
 clang/lib/Basic/Targets/WebAssembly.cpp   | 16 +++---
 clang/lib/Basic/Targets/WebAssembly.h |  4 ++--
 clang/test/CodeGen/builtins-wasm.c|  4 ++--
 clang/test/Driver/wasm-features.c |  8 +++
 .../test/Preprocessor/wasm-target-features.c  | 16 +++---
 llvm/lib/Target/WebAssembly/WebAssembly.td|  8 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  4 ++--
 .../WebAssembly/WebAssemblyInstrInfo.td   |  6 ++---
 .../WebAssembly/WebAssemblyInstrMemory.td |  4 ++--
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 12 +-
 .../Target/WebAssembly/WebAssemblySubtarget.h |  4 ++--
 .../CodeGen/WebAssembly/half-precision.ll |  4 ++--
 llvm/test/CodeGen/WebAssembly/offset.ll   |  2 +-
 .../WebAssembly/target-features-cpus.ll   |  6 ++---
 llvm/test/MC/WebAssembly/simd-encodings.s |  2 +-
 17 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def 
b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index df304a71e475ec..034d32c6291b3d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,10 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", 
"nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
@@ -170,8 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, 
"V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", 
"relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", 
"relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", 
"relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", 
"half-precision")
-TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", 
"half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", 
"fp16")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, 
"V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", 
"relaxed-simd")
@@ -197,11 +197,11 @@ 
TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16S
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, 
"V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
 // Half-Precision (fp16)
-TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", 
"half-precision")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", 
"half-precision")
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third 
argument,
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Drive

[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)

2024-08-20 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/105434

>From fe8fc8201cd3ed5c2909ef512c55e70a30e14a5e Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 20 Aug 2024 21:55:47 +
Subject: [PATCH] [WebAssembly] Change half-precision feature name to fp16.

This better aligns with how the feature is being referred to and
what runtimes (V8) are calling it.
---
 .../clang/Basic/BuiltinsWebAssembly.def   | 22 +--
 clang/include/clang/Driver/Options.td |  4 ++--
 clang/lib/Basic/Targets/WebAssembly.cpp   | 16 +++---
 clang/lib/Basic/Targets/WebAssembly.h |  6 ++---
 clang/test/CodeGen/builtins-wasm.c|  4 ++--
 clang/test/Driver/wasm-features.c |  8 +++
 .../test/Preprocessor/wasm-target-features.c  | 16 +++---
 llvm/lib/Target/WebAssembly/WebAssembly.td|  8 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  4 ++--
 .../WebAssembly/WebAssemblyInstrInfo.td   |  6 ++---
 .../WebAssembly/WebAssemblyInstrMemory.td |  4 ++--
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 12 +-
 .../Target/WebAssembly/WebAssemblySubtarget.h |  4 ++--
 .../CodeGen/WebAssembly/half-precision.ll |  4 ++--
 llvm/test/CodeGen/WebAssembly/offset.ll   |  2 +-
 .../WebAssembly/target-features-cpus.ll   |  6 ++---
 llvm/test/MC/WebAssembly/simd-encodings.s |  2 +-
 17 files changed, 63 insertions(+), 65 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index df304a71e475ec..034d32c6291b3d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,10 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
@@ -170,8 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, "V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", "relaxed-simd")
@@ -197,11 +197,11 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16S
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
 // Half-Precision (fp16)
-TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c204062b4f7353..89239789b3d492 100644
--- a/clang/include/clang/Driver/Options.td

[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)

2024-08-21 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/105434

>From e992578b7269c365e619fe201e7cc703149c7067 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 20 Aug 2024 21:55:47 +
Subject: [PATCH] [WebAssembly] Change half-precision feature name to fp16.

This better aligns with how the feature is being referred to and
what runtimes (V8) are calling it.
---
 .../clang/Basic/BuiltinsWebAssembly.def   | 22 +--
 clang/include/clang/Driver/Options.td |  4 ++--
 clang/lib/Basic/Targets/WebAssembly.cpp   | 16 +++---
 clang/lib/Basic/Targets/WebAssembly.h |  6 ++---
 clang/test/CodeGen/builtins-wasm.c|  4 ++--
 clang/test/Driver/wasm-features.c |  8 +++
 .../test/Preprocessor/wasm-target-features.c  | 16 +++---
 llvm/lib/Target/WebAssembly/WebAssembly.td|  8 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp   |  4 ++--
 .../WebAssembly/WebAssemblyInstrInfo.td   |  6 ++---
 .../WebAssembly/WebAssemblyInstrMemory.td |  4 ++--
 .../WebAssembly/WebAssemblyInstrSIMD.td   | 12 +-
 .../Target/WebAssembly/WebAssemblySubtarget.h |  4 ++--
 .../CodeGen/WebAssembly/half-precision.ll |  4 ++--
 llvm/test/CodeGen/WebAssembly/offset.ll   |  2 +-
 .../WebAssembly/target-features-cpus.ll   |  6 ++---
 llvm/test/MC/WebAssembly/simd-encodings.s |  2 +-
 17 files changed, 63 insertions(+), 65 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index df304a71e475ec..034d32c6291b3d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,10 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
@@ -170,8 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, "V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", "relaxed-simd")
@@ -197,11 +197,11 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16S
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
 // Half-Precision (fp16)
-TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c204062b4f7353..89239789b3d492 100644
--- a/clang/include/clang/Driver/Options.td

[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)

2024-08-22 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/105434
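
For code that has to follow this rename, a minimal sketch of compile-time feature detection is given here. It assumes the feature is exposed to the preprocessor as `__wasm_fp16__`, the macro that the wasm_simd128.h change in PR #108116 tests with `#ifdef`. The helper name `wasm_fp16_enabled` is hypothetical and is not part of clang or its headers.

```c
/* Hypothetical helper (not part of clang): reports whether this translation
 * unit was compiled with the WebAssembly fp16 target feature enabled.
 * Assumes the feature is surfaced as the predefined macro __wasm_fp16__. */
static int wasm_fp16_enabled(void) {
#if defined(__wasm_fp16__)
  return 1; /* f16x8 builtins and intrinsics may be used */
#else
  return 0; /* fall back to f32 code paths */
#endif
}
```

On a build without the feature, or on a non-WebAssembly host, the helper simply returns 0, so callers can select an f32 fallback without touching any fp16 builtin.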

[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)

2024-09-11 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl updated 
https://github.com/llvm/llvm-project/pull/108116

>From 3b813cd5b0555e6b654f575140e4db9a57ed699a Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Tue, 10 Sep 2024 21:52:55 +
Subject: [PATCH 1/2] [WebAssembly] Change F16x8 extract lane to require
 constant integer.

Building with no optimizations resulted in failures since the lane
constant wasn't a constant in LLVM IR.
---
 .../clang/Basic/BuiltinsWebAssembly.def   |  4 ++--
 clang/lib/Headers/wasm_simd128.h  | 19 ---
 clang/test/CodeGen/builtins-wasm.c| 12 ++--
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2e80eef2c8b9bc..ad73f031922a0b 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -209,8 +209,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hIi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hIif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 67d12f6f2cf419..947bb9fe23029e 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }
 
-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {
-  return __builtin_wasm_extract_lane_f16x8((__f16x8)__a, __i);
-}
+#ifdef __wasm_fp16__
 
-static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_replace_lane(v128_t __a,
-                                                                 int __i,
-                                                                 float __b)
-    __REQUIRE_CONSTANT(__i) {
-  return (v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)__a, __i, __b);
-}
+#define wasm_f16x8_extract_lane(__a, __i)                                      \
+  (__builtin_wasm_extract_lane_f16x8((__f16x8)(__a), __i))
+
+#define wasm_f16x8_replace_lane(__a, __i, __b)                                 \
+  ((v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)(__a), __i, __b))
+
+#endif
 
 static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_abs(v128_t __a) {
   return (v128_t)__builtin_wasm_abs_f16x8((__f16x8)__a);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 3010b8954f1c2e..8943a92faad044 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -834,16 +834,16 @@ f16x8 splat_f16x8(float a) {
   return __builtin_wasm_splat_f16x8(a);
 }
 
-float extract_lane_f16x8(f16x8 a, int i) {
-  // WEBASSEMBLY:  %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 %i)
+float extract_lane_f16x8(f16x8 a) {
+  // WEBASSEMBLY:  %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 7)
   // WEBASSEMBLY-NEXT: ret float %0
-  return __builtin_wasm_extract_lane_f16x8(a, i);
+  return __builtin_wasm_extract_lane_f16x8(a, 7);
 }
 
-f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
-  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+f16x8 replace_lane_f16x8(f16x8 a, float v) {
+  // WEBASSEMBLY:  %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 7, float %v)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
-  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+  return __builtin_wasm_replace_lane_f16x8(a, 7, v);
 }
 
 f16x8 min_f16x8(f16x8 a, f16x8 b) {

>From ab30566f242a88a238d4bfb0e5eee229ddf0eb54 Mon Sep 17 00:00:00 2001
From: Brendan Dahl 
Date: Wed, 11 Sep 2024 22:32:02 +
Subject: [PATCH 2/2] add todo

---
 clang/lib/Headers/wasm_simd128.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 947bb9fe23029e..14e36e85da8efa 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1889,6 +1889,8 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
 }
 
 #ifdef __wasm_fp16__
+// TODO Replace the following macros with regular C functions and use normal
+// target-independent vector code like the other repl

[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)

2024-09-11 Thread Brendan Dahl via cfe-commits


https://github.com/brendandahl closed 
https://github.com/llvm/llvm-project/pull/108116
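
The reasoning behind PR #108116 can be illustrated with a small host-side sketch. The patch adds the `I` modifier to the lane operand of the builtins, so the lane must be an integer constant expression at the call site. A wrapper function cannot guarantee that at -O0, because its parameter only becomes a constant after inlining, whereas a macro pastes the literal straight into the builtin call. Everything named below (`f16x8_model`, `REQUIRE_CONST_LANE`, `MODEL_EXTRACT_LANE`, `last_lane`) is hypothetical and only models the idea in portable C; the real header uses `wasm_f16x8_extract_lane` and `__builtin_wasm_extract_lane_f16x8`.

```c
/* Hypothetical host-side model of an 8-lane vector; the real type is v128_t. */
typedef struct {
  float lanes[8];
} f16x8_model;

/* Compile-time check standing in for the builtin's 'I' (constant integer)
 * requirement: a bit-field width must be an integer constant expression, so
 * this fails to compile if `lane` is a runtime value or outside [0, 8). */
#define REQUIRE_CONST_LANE(lane)                                               \
  ((void)sizeof(struct { int in_range : ((lane) >= 0 && (lane) < 8) ? 1 : -1; }))

/* Macro wrapper: the literal lane index is substituted textually, so the
 * check sees a constant regardless of the optimization level. */
#define MODEL_EXTRACT_LANE(v, lane)                                            \
  (REQUIRE_CONST_LANE(lane), (v).lanes[(lane)])

/* Usage mirrors the updated test in the patch, which passes the literal 7. */
static float last_lane(f16x8_model v) {
  return MODEL_EXTRACT_LANE(v, 7);
}
```

Wrapping the same check in a function taking `int lane` would not compile in this model, which loosely corresponds to why the header switches from inline functions to macros for the two lane accessors.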
