[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)
brendandahl wrote: @efriedma-quic missed your comment. I don't have commit access. Can you merge for me? Thanks! https://github.com/llvm/llvm-project/pull/66716 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/66716

Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations.

From 846deb6e2055a8e458530c9e27bbd512a68deb5c Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 12 Sep 2023 12:53:24 -0700
Subject: [PATCH] [clang][CodeGen] Emit annotations for function declarations.

Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations.
---
 clang/lib/CodeGen/CodeGenModule.cpp | 23 +--
 clang/lib/CodeGen/CodeGenModule.h | 4
 .../test/CodeGen/annotations-decl-use-decl.c | 16 +
 .../CodeGen/annotations-decl-use-define.c | 16 +
 clang/test/CodeGen/annotations-declaration.c | 17 ++
 clang/test/CodeGen/annotations-global.c | 8 +++
 .../CodeGenCXX/attr-annotate-constructor.cpp | 10
 .../CodeGenCXX/attr-annotate-destructor.cpp | 10
 clang/test/CodeGenCXX/attr-annotate.cpp | 6 ++---
 9 files changed, 101 insertions(+), 9 deletions(-)
 create mode 100644 clang/test/CodeGen/annotations-decl-use-decl.c
 create mode 100644 clang/test/CodeGen/annotations-decl-use-define.c
 create mode 100644 clang/test/CodeGen/annotations-declaration.c
 create mode 100644 clang/test/CodeGenCXX/attr-annotate-constructor.cpp
 create mode 100644 clang/test/CodeGenCXX/attr-annotate-destructor.cpp

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 8b0c9340775cbe9..5108e6c91bfb30c 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -697,6 +697,7 @@ void CodeGenModule::checkAliases() {
 void CodeGenModule::clear() {
   DeferredDeclsToEmit.clear();
   EmittedDeferredDecls.clear();
+  DeferredAnnotations.clear();
   if (OpenMPRuntime)
     OpenMPRuntime->clear();
 }
@@ -3093,6 +3094,13 @@ void CodeGenModule::EmitVTablesOpportunistically() {
 }

 void CodeGenModule::EmitGlobalAnnotations() {
+  for (const auto& [MangledName, VD] : DeferredAnnotations) {
+    llvm::GlobalValue *GV = GetGlobalValue(MangledName);
+    if (GV)
+      AddGlobalAnnotations(VD, GV);
+  }
+  DeferredAnnotations.clear();
+
   if (Annotations.empty())
     return;

@@ -3597,6 +3605,14 @@ void CodeGenModule::EmitGlobal(GlobalDecl GD) {

   // Ignore declarations, they will be emitted on their first use.
   if (const auto *FD = dyn_cast<FunctionDecl>(Global)) {
+    // Update deferred annotations with the latest declaration if the function
+    // function was already used or defined.
+    if (FD->hasAttr<AnnotateAttr>()) {
+      StringRef MangledName = getMangledName(GD);
+      if (GetGlobalValue(MangledName))
+        DeferredAnnotations[MangledName] = FD;
+    }
+
     // Forward declarations are emitted lazily on first use.
     if (!FD->doesThisDeclarationHaveABody()) {
       if (!FD->doesDeclarationForceExternallyVisibleDefinition())
@@ -4370,6 +4386,11 @@ llvm::Constant *CodeGenModule::GetOrCreateLLVMFunction(
       llvm::Function::Create(FTy, llvm::Function::ExternalLinkage,
                              Entry ? StringRef() : MangledName, &getModule());

+  // Store the declaration associated with this function so it is potentially
+  // updated by further declarations or definitions and emitted at the end.
+  if (D && D->hasAttr<AnnotateAttr>())
+    DeferredAnnotations[MangledName] = cast<ValueDecl>(D);
+
   // If we already created a function with the same mangled name (but different
   // type) before, take its name and add it to the list of functions to be
   // replaced with F at the end of CodeGen.
@@ -5664,8 +5685,6 @@ void CodeGenModule::EmitGlobalFunctionDefinition(GlobalDecl GD,
     AddGlobalCtor(Fn, CA->getPriority());
   if (const DestructorAttr *DA = D->getAttr<DestructorAttr>())
     AddGlobalDtor(Fn, DA->getPriority(), true);
-  if (D->hasAttr<AnnotateAttr>())
-    AddGlobalAnnotations(D, Fn);
   if (getLangOpts().OpenMP && D->hasAttr<OMPDeclareTargetDeclAttr>())
     getOpenMPRuntime().emitDeclareTargetFunction(D, GV);
 }
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 073b471c6e3cc11..8b0d68afbd0ecd2 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -431,6 +431,10 @@ class CodeGenModule : public CodeGenTypeCache {
   /// Global annotations.
   std::vector<llvm::Constant*> Annotations;

+  // Store deferred function annotations so they can be emitted at the end with
+  // most up to date ValueDecl that will have all the inherited annotations.
+  llvm::DenseMap<StringRef, const ValueDecl *> DeferredAnnotations;
+
   /// Map used to get unique annotation strings.
   llvm::St
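For readers who want to see the case the patch description is talking about, here is a minimal C sketch loosely modeled on the name of the `annotations-decl-use-define.c` test. It is not the content of that test: the function name `foo`, the annotation string `"bar"`, the `ANNOTATE` macro and `call_foo_through_pointer` are all hypothetical, and the attribute is guarded so the file still builds with compilers other than Clang.

```c
/* Hypothetical example; names are illustrative and not taken from the patch. */
#if defined(__clang__)
#define ANNOTATE(s) __attribute__((annotate(s)))
#else
#define ANNOTATE(s) /* annotate is a Clang extension; expands to nothing */
#endif

/* 1. Plain declaration: no annotation is attached yet. */
int foo(void);

/* 2. Use before the definition: the compiler has to refer to foo here,
      while only the unannotated declaration has been seen. */
int (*foo_ptr)(void) = foo;

/* 3. The later definition carries the annotation. */
ANNOTATE("bar") int foo(void) { return 42; }

/* Helper so the example has something observable to run. */
int call_foo_through_pointer(void) { return foo_ptr(); }
```

With Clang, inspecting the output of `-S -emit-llvm` for an `llvm.global.annotations` entry is one way to check whether the annotation on `foo` was emitted. Per the PR description, deferring emission to the end is what lets the most recent declaration supply the annotation.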
[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)
brendandahl wrote:

This is relanding the patch from [here](https://reviews.llvm.org/D156172). It fixes the [backout failure](https://reviews.llvm.org/rG88b7e06dcf9723d0869b0c6bee030b4140e4366d) and adds a test for it.

https://github.com/llvm/llvm-project/pull/66716
[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)
brendandahl wrote:

@efriedma-quic could you re-review? The only changes were https://github.com/llvm/llvm-project/pull/66716/files#diff-e724febedab9c1a2832bf2056d208ff02ddcb2e6f90b5a653afc9b19ac78a5d7R3098-R3100

https://github.com/llvm/llvm-project/pull/66716
[clang] [clang][CodeGen] Emit annotations for function declarations. (PR #66716)
brendandahl wrote:

@AaronBallman or @efriedma-quic ping: are you able to add reviewers?

https://github.com/llvm/llvm-project/pull/66716
[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/108116

Building with no optimizations resulted in failures since the lane constant wasn't a constant in LLVM IR.

From 3b813cd5b0555e6b654f575140e4db9a57ed699a Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 10 Sep 2024 21:52:55 +
Subject: [PATCH] [WebAssembly] Change F16x8 extract lane to require constant integer.

Building with no optimizations resulted in failures since the lane constant wasn't a constant in LLVM IR.
---
 .../clang/Basic/BuiltinsWebAssembly.def | 4 ++--
 clang/lib/Headers/wasm_simd128.h | 19 ---
 clang/test/CodeGen/builtins-wasm.c| 12 ++--
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2e80eef2c8b9bc..ad73f031922a0b 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -209,8 +209,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hIi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hIif", "nc", "fp16")

 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 67d12f6f2cf419..947bb9fe23029e 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }

-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {
-  return __builtin_wasm_extract_lane_f16x8((__f16x8)__a, __i);
-}
+#ifdef __wasm_fp16__

-static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_replace_lane(v128_t __a,
-                                                                 int __i,
-                                                                 float __b)
-    __REQUIRE_CONSTANT(__i) {
-  return (v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)__a, __i, __b);
-}
+#define wasm_f16x8_extract_lane(__a, __i) \
+  (__builtin_wasm_extract_lane_f16x8((__f16x8)(__a), __i))
+
+#define wasm_f16x8_replace_lane(__a, __i, __b) \
+  ((v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)(__a), __i, __b))
+
+#endif

 static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_abs(v128_t __a) {
   return (v128_t)__builtin_wasm_abs_f16x8((__f16x8)__a);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 3010b8954f1c2e..8943a92faad044 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -834,16 +834,16 @@ f16x8 splat_f16x8(float a) {
   return __builtin_wasm_splat_f16x8(a);
 }

-float extract_lane_f16x8(f16x8 a, int i) {
-  // WEBASSEMBLY: %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 %i)
+float extract_lane_f16x8(f16x8 a) {
+  // WEBASSEMBLY: %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 7)
   // WEBASSEMBLY-NEXT: ret float %0
-  return __builtin_wasm_extract_lane_f16x8(a, i);
+  return __builtin_wasm_extract_lane_f16x8(a, 7);
 }

-f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
-  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+f16x8 replace_lane_f16x8(f16x8 a, float v) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 7, float %v)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
-  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+  return __builtin_wasm_replace_lane_f16x8(a, 7, v);
 }

 f16x8 min_f16x8(f16x8 a, f16x8 b) {
[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)
@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }

-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {

brendandahl wrote:

It does require a constant in C code, but in a no-opt build it is not a constant in LLVM IR.

https://github.com/llvm/llvm-project/pull/108116
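The difference between the two forms can be sketched in portable C. This is only an illustration of why a macro keeps the lane index a literal at the point of use while a function parameter does not. It is not how `wasm_simd128.h` or the builtin is implemented: `vec8`, `LANE_IS_CONSTANT`, `EXTRACT_LANE`, `REPLACE_LANE` and the helper functions are hypothetical names, and the bit-field trick merely stands in for the constant-integer requirement that Clang enforces through the `I` prefix in the builtin's type string.

```c
/* Hypothetical stand-in for a 128-bit vector with eight lanes. */
typedef struct { float lane[8]; } vec8;

/* Function form: inside the body, i is an ordinary runtime parameter,
   so nothing here forces it to be a compile-time constant. */
float extract_lane_fn(vec8 v, int i) { return v.lane[i & 7]; }

/* A bit-field width must be an integer constant expression, so this
   rejects a non-constant or out-of-range index at compile time. */
#define LANE_IS_CONSTANT(i) \
  ((void)sizeof(struct { int in_range : ((i) >= 0 && (i) < 8) ? 1 : -1; }))

/* Helper used by REPLACE_LANE; takes the vector by value and returns a copy. */
vec8 replace_lane_at(vec8 v, int i, float x) {
  v.lane[i] = x;
  return v;
}

/* Macro forms: the index is substituted textually at the call site,
   where it is still a literal constant. */
#define EXTRACT_LANE(v, i) (LANE_IS_CONSTANT(i), (v).lane[(i)])
#define REPLACE_LANE(v, i, x) \
  (LANE_IS_CONSTANT(i), replace_lane_at((v), (i), (x)))

float demo_extract(void) {
  vec8 v = {{0, 1, 2, 3, 4, 5, 6, 7}};
  return EXTRACT_LANE(v, 7);
}

float demo_replace(void) {
  vec8 v = {{0, 1, 2, 3, 4, 5, 6, 7}};
  v = REPLACE_LANE(v, 2, 9.5f);
  return EXTRACT_LANE(v, 2);
}
```

Passing a variable as the index to `EXTRACT_LANE` fails to compile in this sketch, which mirrors the intent of the updated tests above, where the lane argument became the literal `7`.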
[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/95151

Implemented with intrinsics and builtins.

Specified at: https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md

From fd5ea6036e97e504e3286d218fe6b966e5bead82 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 11 Jun 2024 17:12:09 +
Subject: [PATCH] [WebAssembly] Implement f16x8 madd and nmadd instructions.

Implemented with intrinsics and builtins.

Specified at: https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
---
 .../clang/Basic/BuiltinsWebAssembly.def | 2 ++
 clang/lib/CodeGen/CGBuiltin.cpp | 4 +++
 clang/test/CodeGen/builtins-wasm.c| 14 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td | 27 ++-
 llvm/test/MC/WebAssembly/simd-encodings.s | 6 +
 5 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 4e48ff48b60f5..2a45f8a6582a2 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -170,6 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")

 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, "V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", "relaxed-simd")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 06e201fa71e6f..511e1fd4016d7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21149,6 +21149,8 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_shuffle);
     return Builder.CreateCall(Callee, Ops);
   }
+  case WebAssembly::BI__builtin_wasm_relaxed_madd_f16x8:
+  case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f16x8:
   case WebAssembly::BI__builtin_wasm_relaxed_madd_f32x4:
   case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f32x4:
   case WebAssembly::BI__builtin_wasm_relaxed_madd_f64x2:
@@ -21158,10 +21160,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *C = EmitScalarExpr(E->getArg(2));
     unsigned IntNo;
     switch (BuiltinID) {
+    case WebAssembly::BI__builtin_wasm_relaxed_madd_f16x8:
     case WebAssembly::BI__builtin_wasm_relaxed_madd_f32x4:
     case WebAssembly::BI__builtin_wasm_relaxed_madd_f64x2:
       IntNo = Intrinsic::wasm_relaxed_madd;
       break;
+    case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f16x8:
     case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f32x4:
     case WebAssembly::BI__builtin_wasm_relaxed_nmadd_f64x2:
       IntNo = Intrinsic::wasm_relaxed_nmadd;
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index d6ee4f68700dc..75861b1b4bd6d 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -690,6 +690,20 @@ f64x2 nmadd_f64x2(f64x2 a, f64x2 b, f64x2 c) {
   // WEBASSEMBLY-NEXT: ret
 }

+f16x8 madd_f16x8(f16x8 a, f16x8 b, f16x8 c) {
+  return __builtin_wasm_relaxed_madd_f16x8(a, b, c);
+  // WEBASSEMBLY: call <8 x half> @llvm.wasm.relaxed.madd.v8f16(
+  // WEBASSEMBLY-SAME: <8 x half> %a, <8 x half> %b, <8 x half> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
+
+f16x8 nmadd_f16x8(f16x8 a, f16x8 b, f16x8 c) {
+  return __builtin_wasm_relaxed_nmadd_f16x8(a, b, c);
+  // WEBASSEMBLY: call <8 x half> @llvm.wasm.relaxed.nmadd.v8f16(
+  // WEBASSEMBLY-SAME: <8 x half> %a, <8 x half> %b, <8 x half> %c)
+  // WEBASSEMBLY-NEXT: ret
+}
+
 i8x16 laneselect_i8x16(i8x16 a, i8x16 b, i8x16 c) {
   return __builtin_wasm_relaxed_laneselect_i8x16(a, b, c);
   // WEBASSEMBLY: call <16 x i8> @llvm.wasm.relaxed.laneselect.v16i8(
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
index 3c97befcea1a4..3888175efd115 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -1480,23 +1480,24 @@ defm "" : RelaxedConvert simdopA, bits<32> simdopS> {
+multiclass SIMDMADD simdopA, bits<32> simdopS, list reqs> {
   defm MADD_#vec :
-    RELAXED_I<(outs V128:$dst), (ins V128:$a, V128:$b, V128:$c), (outs), (ins),
-              [(set (vec.vt V128:$dst), (int_wasm_relaxed_madd
-                (vec.vt V128
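As a rough reference for what the two instructions compute, here is a lane-wise sketch in plain C. It assumes the f16x8 forms follow the relaxed-simd definitions, `madd(a, b, c) = a * b + c` and `nmadd(a, b, c) = -(a * b) + c`. It uses `float` as a stand-in for half precision and ignores whether the operation is fused, which the relaxed semantics leave to the implementation. All function names are hypothetical.

```c
#define LANES 8

/* Lane-wise a * b + c; float stands in for half precision here. */
void madd_lanes(const float *a, const float *b, const float *c, float *out) {
  for (int i = 0; i < LANES; ++i)
    out[i] = a[i] * b[i] + c[i];
}

/* Lane-wise -(a * b) + c. */
void nmadd_lanes(const float *a, const float *b, const float *c, float *out) {
  for (int i = 0; i < LANES; ++i)
    out[i] = -(a[i] * b[i]) + c[i];
}

/* Small drivers with exactly representable inputs, so fused and unfused
   evaluation give the same result. */
float madd_demo(int lane) {
  const float a[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
  const float b[LANES] = {2, 2, 2, 2, 2, 2, 2, 2};
  const float c[LANES] = {0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f};
  float out[LANES];
  madd_lanes(a, b, c, out);
  return out[lane];
}

float nmadd_demo(int lane) {
  const float a[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
  const float b[LANES] = {2, 2, 2, 2, 2, 2, 2, 2};
  const float c[LANES] = {0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f};
  float out[LANES];
  nmadd_lanes(a, b, c, out);
  return out[lane];
}
```

In the patch itself both builtins lower to the existing `wasm_relaxed_madd` and `wasm_relaxed_nmadd` intrinsics, as the `CGBuiltin.cpp` hunk shows; the sketch only models the arithmetic.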
[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)
brendandahl wrote:

Note: I've [opened an issue](https://github.com/WebAssembly/half-precision/issues/5) about the `relaxed_` prefix and whether it should be included in the instruction name.

https://github.com/llvm/llvm-project/pull/95151
[clang] [llvm] [WebAssembly] Implement f16x8 madd and nmadd instructions. (PR #95151)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/95151
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/93360

From c33801afebb6720bc4b51fb4064b59529c40d298 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Thu, 23 May 2024 23:38:51 +
Subject: [PATCH 1/2] [WebAssembly] Implement all f16x8 binary instructions.

This reuses most of the code that was created for f32x4 and f64x2 binary instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LLVM IR instructions
min/max - use the minimum/maximum intrinsics, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics, and also have builtins

Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
---
 .../clang/Basic/BuiltinsWebAssembly.def | 4 ++
 clang/lib/CodeGen/CGBuiltin.cpp | 4 ++
 clang/test/CodeGen/builtins-wasm.c| 24 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp | 5 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td | 37 +++---
 .../CodeGen/WebAssembly/half-precision.ll | 68 +++
 llvm/test/MC/WebAssembly/simd-encodings.s | 24 +++
 7 files changed, 157 insertions(+), 9 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index fd8c1b480d6da..4e48ff48b60f5 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,6 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")

 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0549afa12e430..f8be7182b5267 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -20779,6 +20779,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_min_f32:
   case WebAssembly::BI__builtin_wasm_min_f64:
+  case WebAssembly::BI__builtin_wasm_min_f16x8:
   case WebAssembly::BI__builtin_wasm_min_f32x4:
   case WebAssembly::BI__builtin_wasm_min_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20789,6 +20790,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_max_f32:
   case WebAssembly::BI__builtin_wasm_max_f64:
+  case WebAssembly::BI__builtin_wasm_max_f16x8:
   case WebAssembly::BI__builtin_wasm_max_f32x4:
   case WebAssembly::BI__builtin_wasm_max_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20797,6 +20799,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::maximum, ConvertType(E->getType()));
     return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmin_f16x8:
   case WebAssembly::BI__builtin_wasm_pmin_f32x4:
   case WebAssembly::BI__builtin_wasm_pmin_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20805,6 +20808,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::wasm_pmin, ConvertType(E->getType()));
     return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmax_f16x8:
   case WebAssembly::BI__builtin_wasm_pmax_f32x4:
   case WebAssembly::BI__builtin_wasm_pmax_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 93a6ab06081c9..d6ee4f68700dc 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -825,6 +825,30 @@ float extract_lane_f16x8(f16x8 a, int i) {
   // WEBASSEMBLY-NEXT: ret float %0
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
+
+f16x8 min_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> %a, <8 x half> %b)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_min_f16x8(a, b);
+}
+
+f16x8 max_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.maximum.v8f16(<8 x half> %a, <8 x half> %b)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_max_f16x8(a, b);
+}
+
+f16x8 pmin_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY: %0 = tail call
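The commit message separates min/max from pmin/pmax, and the split matters mostly for NaN and signed-zero inputs. A scalar sketch of the per-lane behavior, as I understand the WebAssembly SIMD definitions and the documented semantics of `llvm.minimum`/`llvm.maximum`, is shown below. It uses `float` as a stand-in for half precision, and the function names are hypothetical.

```c
#include <math.h>

/* NaN-propagating minimum: any NaN input yields NaN, and -0.0 is
   treated as smaller than +0.0. */
float min_lane(float a, float b) {
  if (isnan(a) || isnan(b)) return NAN;
  if (a == 0.0f && b == 0.0f) return signbit(a) ? a : b;
  return a < b ? a : b;
}

/* NaN-propagating maximum: any NaN input yields NaN, and +0.0 is
   treated as larger than -0.0. */
float max_lane(float a, float b) {
  if (isnan(a) || isnan(b)) return NAN;
  if (a == 0.0f && b == 0.0f) return signbit(a) ? b : a;
  return a > b ? a : b;
}

/* Pseudo-minimum: b < a ? b : a. If either input is NaN the comparison
   is false, so the first operand is returned unchanged. */
float pmin_lane(float a, float b) { return b < a ? b : a; }

/* Pseudo-maximum: a < b ? b : a, with the same first-operand fallback. */
float pmax_lane(float a, float b) { return a < b ? b : a; }
```

For example, `min_lane(1.0f, NAN)` is NaN while `pmin_lane(1.0f, NAN)` is `1.0f`, which is why the two families cannot share one lowering.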
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
@@ -152,6 +153,18 @@ def F64x2 : Vec {
   let prefix = "f64x2";
 }

+def F16x8 : Vec {
+  let vt = v8f16;
+  let int_vt = v8i16;
+  let lane_vt = f32;
+  let lane_rc = F32;
+  let lane_bits = 16;
+  let lane_idx = LaneIdx8;
+  let lane_load = int_wasm_loadf16_f32;
+  let splat = PatFrag<(ops node:$x), (v8f16 (splat_vector (f16 $x)))>;
+  let prefix = "f16x8";
+}
+
 defvar AllVecs = [I8x16, I16x8, I32x4, I64x2, F32x4, F64x2];

brendandahl wrote:

I hope to include F16x8 here when we better support it and the regular patterns work for it. I've added a comment for now, but can change the name if wanted.

https://github.com/llvm/llvm-project/pull/93360
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
@@ -1199,6 +1213,7 @@ def : Pat<(v2f64 (froundeven (v2f64 V128:$src))), (NEAREST_F64x2 V128:$src)>;
 multiclass SIMDBinaryFP baseInst> {
   defm "" : SIMDBinary;
   defm "" : SIMDBinary;
+  defm "" : SIMDBinary;

brendandahl wrote:

I ended up adding `HalfPrecisionBinary`. I was hoping there was some way to pass a multiclass id as a parameter, so that I could pass in `SIMD_I` or `HALF_PRECISION_I` as an argument, but I couldn't figure out a way to make that work.

https://github.com/llvm/llvm-project/pull/93360
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/93360
[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/99388
[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/99388

Use a builtin and intrinsic until half types are better supported for instruction selection.

From a6d65f276fba7487fdecf2e31edef457f74fbafe Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Wed, 17 Jul 2024 20:10:20 +
Subject: [PATCH] [WebAssembly] Implement f16x8.replace_lane instruction.

Use a builtin and intrinsic until half types are better supported for instruction selection.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def | 1 +
 clang/lib/CodeGen/CGBuiltin.cpp | 7 +++
 clang/test/CodeGen/builtins-wasm.c | 6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 4
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 13 +
 llvm/test/CodeGen/WebAssembly/half-precision.ll | 8
 llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++
 7 files changed, 42 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2a45f8a6582a2..df304a71e475e 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -201,6 +201,7 @@ TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")

 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 67027f8aa93f3..402b7a7b20e61 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21386,6 +21386,13 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8);
     return Builder.CreateCall(Callee, {Vector, Index});
   }
+  case WebAssembly::BI__builtin_wasm_replace_lane_f16x8: {
+    Value *Vector = EmitScalarExpr(E->getArg(0));
+    Value *Index = EmitScalarExpr(E->getArg(1));
+    Value *Val = EmitScalarExpr(E->getArg(2));
+    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_replace_lane_f16x8);
+    return Builder.CreateCall(Callee, {Vector, Index, Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
     assert(E->getArg(0)->getType()->isArrayType());
     Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 75861b1b4bd6d..f494aeada0157 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -840,6 +840,12 @@ float extract_lane_f16x8(f16x8 a, int i) {
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }

+f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+}
+
 f16x8 min_f16x8(f16x8 a, f16x8 b) {
   // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> %a, <8 x half> %b)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 47aab196a6d4f..4d2df1c44ebce 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -363,6 +363,10 @@ def int_wasm_extract_lane_f16x8:
   DefaultAttrsIntrinsic<[llvm_float_ty],
                         [llvm_v8f16_ty, llvm_i32_ty],
                         [IntrNoMem, IntrSpeculatable]>;
+def int_wasm_replace_lane_f16x8:
+  DefaultAttrsIntrinsic<[llvm_v8f16_ty],
+                        [llvm_v8f16_ty, llvm_i32_ty, llvm_float_ty],
+                        [IntrNoMem, IntrSpeculatable]>;

 //===--===//
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
index 2ee430c88169d..f11fe12c6ecb8 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
@@ -702,6 +702,19 @@ defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 defm "" : ReplaceLane;

+// For now use an instrinsic for f16x8.replace_lane instead of ReplaceLane above
+// since LL generated with half type arguments is not well supported and creates
+// conversions from f16->f32.
+defm REPLACE_LANE_F16x8 :
+  HALF_PRECISION_I<(outs V128:$dst), (ins V128:$vec, vec_i8imm_op:$idx, F32:$x),
+                   (outs), (ins vec_i8imm_
[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)
@@ -702,6 +702,19 @@ defm "" : ReplaceLane;
 defm "" : ReplaceLane;
 defm "" : ReplaceLane;

+// For now use an instrinsic for f16x8.replace_lane instead of ReplaceLane above
+// since LL generated with half type arguments is not well supported and creates

brendandahl wrote:

Yeah, I'll update.

https://github.com/llvm/llvm-project/pull/99388
[clang] [llvm] [WebAssembly] Implement f16x8.replace_lane instruction. (PR #99388)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/99388 >From 8320b1f7f45f42363547cefb748627cfe1bb7af6 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 17 Jul 2024 20:10:20 + Subject: [PATCH] [WebAssembly] Implement f16x8.replace_lane instruction. Use a builtin and intrinsic until half types are better supported for instruction selection. --- clang/include/clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 7 +++ clang/test/CodeGen/builtins-wasm.c | 6 ++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 4 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 13 + llvm/test/CodeGen/WebAssembly/half-precision.ll | 8 llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 7 files changed, 42 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 2a45f8a6582a2..df304a71e475e 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -201,6 +201,7 @@ TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision") TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision") TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision") +TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 67027f8aa93f3..402b7a7b20e61 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21386,6 +21386,13 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8); return Builder.CreateCall(Callee, {Vector, Index}); } + case 
WebAssembly::BI__builtin_wasm_replace_lane_f16x8: { +Value *Vector = EmitScalarExpr(E->getArg(0)); +Value *Index = EmitScalarExpr(E->getArg(1)); +Value *Val = EmitScalarExpr(E->getArg(2)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_replace_lane_f16x8); +return Builder.CreateCall(Callee, {Vector, Index, Val}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index 75861b1b4bd6d..f494aeada0157 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -840,6 +840,12 @@ float extract_lane_f16x8(f16x8 a, int i) { return __builtin_wasm_extract_lane_f16x8(a, i); } +f16x8 replace_lane_f16x8(f16x8 a, int i, float v) { + // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v) + // WEBASSEMBLY-NEXT: ret <8 x half> %0 + return __builtin_wasm_replace_lane_f16x8(a, i, v); +} + f16x8 min_f16x8(f16x8 a, f16x8 b) { // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.minimum.v8f16(<8 x half> %a, <8 x half> %b) // WEBASSEMBLY-NEXT: ret <8 x half> %0 diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index 47aab196a6d4f..4d2df1c44ebce 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -363,6 +363,10 @@ def int_wasm_extract_lane_f16x8: DefaultAttrsIntrinsic<[llvm_float_ty], [llvm_v8f16_ty, llvm_i32_ty], [IntrNoMem, IntrSpeculatable]>; +def int_wasm_replace_lane_f16x8: + DefaultAttrsIntrinsic<[llvm_v8f16_ty], +[llvm_v8f16_ty, llvm_i32_ty, llvm_float_ty], +[IntrNoMem, IntrSpeculatable]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td index 2ee430c88169d..8eaf107b2cc40 100644 --- 
a/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td +++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td @@ -702,6 +702,19 @@ defm "" : ReplaceLane; defm "" : ReplaceLane; defm "" : ReplaceLane; +// For now use an instrinsic for f16x8.replace_lane instead of ReplaceLane above +// since LLVM IR generated with half type arguments is not well supported and +// creates conversions from f16->f32. +defm REPLACE_LANE_F16x8 : + HALF_PRECISION_I<(outs V128:$dst), (ins V128:$vec, vec_i8imm_op:$idx, F32:$x), + (outs), (ins vec_i8imm_op:$idx), + [(set (v8f16 V128:$dst), (int_wasm_replace_lane_f16x8 +
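Editorial sketch, not part of the patch: the builtin added above has the shape `__builtin_wasm_replace_lane_f16x8(a, i, v)`, taking an f16x8 vector, a lane index, and an f32 value, and returning the updated vector. A host-side model of that behavior might look like the following. The names `F16x8`, `floatToHalfBits`, and `replaceLaneF16x8` are hypothetical, and the sketch assumes the f32 value is narrowed with round-to-nearest, ties-to-even, which the thread does not state.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical host-side model: eight IEEE binary16 lanes stored as raw bits.
using F16x8 = std::array<std::uint16_t, 8>;

// Hypothetical helper: narrow an f32 to binary16 bits, assuming
// round-to-nearest, ties-to-even.
std::uint16_t floatToHalfBits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  const std::uint16_t sign =
      static_cast<std::uint16_t>((bits >> 16) & 0x8000u);
  const std::uint32_t mag = bits & 0x7FFFFFFFu;

  if (mag > 0x7F800000u)   // NaN: return a quiet NaN
    return static_cast<std::uint16_t>(sign | 0x7E00u);
  if (mag >= 0x47800000u)  // magnitude >= 2^16, or infinity: result is inf
    return static_cast<std::uint16_t>(sign | 0x7C00u);

  if (mag >= 0x38800000u) {  // magnitude >= 2^-14: normal half (or inf)
    std::uint32_t v = mag - 0x38000000u;  // rebias exponent from 127 to 15
    v += 0xFFFu + ((v >> 13) & 1u);       // round to nearest, ties to even
    return static_cast<std::uint16_t>(sign | (v >> 13));
  }

  const std::uint32_t exp = mag >> 23;
  if (exp < 102)  // magnitude below 2^-25: rounds to signed zero
    return sign;
  // Subnormal half: shift the 24-bit significand into units of 2^-24.
  const std::uint32_t mant = (mag & 0x7FFFFFu) | 0x800000u;
  const std::uint32_t shift = 126 - exp;  // in the range 14..24
  std::uint32_t q = mant >> shift;
  const std::uint32_t rem = mant & ((1u << shift) - 1u);
  const std::uint32_t half = 1u << (shift - 1);
  if (rem > half || (rem == half && (q & 1u)))
    ++q;  // a carry into bit 10 yields the smallest normal half
  return static_cast<std::uint16_t>(sign | q);
}

// Model of f16x8.replace_lane: copy the vector, overwrite one lane.
F16x8 replaceLaneF16x8(F16x8 vec, int lane, float value) {
  assert(lane >= 0 && lane < 8 && "lane index must select one of 8 lanes");
  vec[static_cast<std::size_t>(lane)] = floatToHalfBits(value);
  return vec;
}
```

In the patch itself the lane index is an immediate operand (`vec_i8imm_op:$idx`), so a real call site would pass a constant lane; the runtime `int` here is only a convenience of the model.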
[clang] [llvm] [WebAssembly] Add half-precision feature (PR #90248)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/90248 This currently only defines a constant, but in the future will be used to gate builtins for experimenting and prototyping half-precision proposal (https://github.com/WebAssembly/half-precision). >From 098342189d16b653a189889de43fe5a3d38592c8 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Fri, 26 Apr 2024 18:30:48 + Subject: [PATCH] [WebAssembly] Add half-precision feature This currently only defines a constant, but in the future will be used to gate builtins for experimenting and prototyping half-precision proposal (https://github.com/WebAssembly/half-precision). --- clang/include/clang/Driver/Options.td | 2 ++ clang/lib/Basic/Targets/WebAssembly.cpp | 11 +++ clang/lib/Basic/Targets/WebAssembly.h | 1 + llvm/lib/Target/WebAssembly/WebAssembly.td | 3 +++ llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td | 4 llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h | 2 ++ 6 files changed, 23 insertions(+) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 922bda721dc780..0a3c4494443cad 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4876,6 +4876,8 @@ def msimd128 : Flag<["-"], "msimd128">, Group; def mno_simd128 : Flag<["-"], "mno-simd128">, Group; def mrelaxed_simd : Flag<["-"], "mrelaxed-simd">, Group; def mno_relaxed_simd : Flag<["-"], "mno-relaxed-simd">, Group; +def mhalf_precision : Flag<["-"], "mhalf-precision">, Group; +def mno_half_precision : Flag<["-"], "mno-half-precision">, Group; def mnontrapping_fptoint : Flag<["-"], "mnontrapping-fptoint">, Group; def mno_nontrapping_fptoint : Flag<["-"], "mno-nontrapping-fptoint">, Group; def msign_ext : Flag<["-"], "msign-ext">, Group; diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp index d473fd19086460..3d76411f890a86 100644 --- a/clang/lib/Basic/Targets/WebAssembly.cpp +++ 
b/clang/lib/Basic/Targets/WebAssembly.cpp @@ -47,6 +47,7 @@ bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const { return llvm::StringSwitch(Feature) .Case("simd128", SIMDLevel >= SIMD128) .Case("relaxed-simd", SIMDLevel >= RelaxedSIMD) + .Case("half-precision", HasHalfPrecision) .Case("nontrapping-fptoint", HasNontrappingFPToInt) .Case("sign-ext", HasSignExt) .Case("exception-handling", HasExceptionHandling) @@ -156,6 +157,7 @@ bool WebAssemblyTargetInfo::initFeatureMap( Features["reference-types"] = true; Features["sign-ext"] = true; Features["tail-call"] = true; +Features["half-precision"] = true; setSIMDLevel(Features, SIMD128, true); } else if (CPU == "generic") { Features["mutable-globals"] = true; @@ -216,6 +218,15 @@ bool WebAssemblyTargetInfo::handleTargetFeatures( HasBulkMemory = false; continue; } +if (Feature == "+half-precision") { + SIMDLevel = std::max(SIMDLevel, SIMD128); + HasHalfPrecision = true; + continue; +} +if (Feature == "-half-precision") { + HasHalfPrecision = false; + continue; +} if (Feature == "+atomics") { HasAtomics = true; continue; diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h index 5568aa28eaefa7..e4c18879182ed7 100644 --- a/clang/lib/Basic/Targets/WebAssembly.h +++ b/clang/lib/Basic/Targets/WebAssembly.h @@ -64,6 +64,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo { bool HasReferenceTypes = false; bool HasExtendedConst = false; bool HasMultiMemory = false; + bool HasHalfPrecision = false; std::string ABI; diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td index d538197450b65b..f00974531209d2 100644 --- a/llvm/lib/Target/WebAssembly/WebAssembly.td +++ b/llvm/lib/Target/WebAssembly/WebAssembly.td @@ -28,6 +28,9 @@ def FeatureSIMD128 : SubtargetFeature<"simd128", "SIMDLevel", "SIMD128", def FeatureRelaxedSIMD : SubtargetFeature<"relaxed-simd", "SIMDLevel", "RelaxedSIMD", "Enable relaxed-simd 
instructions">; +def FeatureHalfPrecision : SubtargetFeature<"half-precision", "HasHalfPrecision", "true", +"Enable half precision instructions">; + def FeatureAtomics : SubtargetFeature<"atomics", "HasAtomics", "true", "Enable Atomics">; diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td index 59ea9247bd86f5..7b57f8ce90e066 100644 --- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td +++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td @@ -30,6 +30,10 @@ def HasRelaxedSIMD : Predicate<"Subtarget->hasRelaxedSIMD()">, Assembl
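Editorial sketch, not part of the patch: the `handleTargetFeatures` hunk above makes `+half-precision` raise the SIMD level to at least SIMD128, while `-half-precision` only clears the flag. A minimal standalone model of those two cases could look like this. `FeatureState` and `applyFeatures` are hypothetical names; only the two `half-precision` branches mirror the diff, and the `+simd128` case is included purely for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant WebAssemblyTargetInfo state.
enum SIMDEnum { NoSIMD, SIMD128, RelaxedSIMD };

struct FeatureState {
  SIMDEnum SIMDLevel = NoSIMD;
  bool HasHalfPrecision = false;
};

// Apply "+feature" / "-feature" strings in order, as the driver passes them.
FeatureState applyFeatures(const std::vector<std::string> &Features) {
  FeatureState S;
  for (const std::string &Feature : Features) {
    if (Feature == "+simd128") {  // illustrative, not taken from this hunk
      S.SIMDLevel = std::max(S.SIMDLevel, SIMD128);
      continue;
    }
    if (Feature == "+half-precision") {
      // Enabling half-precision implies at least SIMD128.
      S.SIMDLevel = std::max(S.SIMDLevel, SIMD128);
      S.HasHalfPrecision = true;
      continue;
    }
    if (Feature == "-half-precision") {
      // Disabling clears the flag but does not lower the SIMD level.
      S.HasHalfPrecision = false;
      continue;
    }
  }
  return S;
}
```

Under this model, passing `+half-precision` followed by `-half-precision` leaves `HasHalfPrecision` false but keeps `SIMDLevel` at `SIMD128`, because the disabling branch never touches the SIMD level.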
[clang] [llvm] [WebAssembly] Add half-precision feature (PR #90248)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/90248 >From 85e5e1660ad1e6fda8ecf8984aab0cba96130b4f Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Fri, 26 Apr 2024 18:30:48 + Subject: [PATCH] [WebAssembly] Add half-precision feature This currently only defines a constant, but in the future will be used to gate builtins for experimenting and prototyping half-precision proposal (https://github.com/WebAssembly/half-precision). --- clang/include/clang/Driver/Options.td | 2 ++ clang/lib/Basic/Targets/WebAssembly.cpp | 11 +++ clang/lib/Basic/Targets/WebAssembly.h | 1 + clang/test/Driver/wasm-features.c | 6 ++ llvm/lib/Target/WebAssembly/WebAssembly.td | 3 +++ llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td | 4 llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h | 2 ++ 7 files changed, 29 insertions(+) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 922bda721dc780..0a3c4494443cad 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4876,6 +4876,8 @@ def msimd128 : Flag<["-"], "msimd128">, Group; def mno_simd128 : Flag<["-"], "mno-simd128">, Group; def mrelaxed_simd : Flag<["-"], "mrelaxed-simd">, Group; def mno_relaxed_simd : Flag<["-"], "mno-relaxed-simd">, Group; +def mhalf_precision : Flag<["-"], "mhalf-precision">, Group; +def mno_half_precision : Flag<["-"], "mno-half-precision">, Group; def mnontrapping_fptoint : Flag<["-"], "mnontrapping-fptoint">, Group; def mno_nontrapping_fptoint : Flag<["-"], "mno-nontrapping-fptoint">, Group; def msign_ext : Flag<["-"], "msign-ext">, Group; diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp index d473fd19086460..3d76411f890a86 100644 --- a/clang/lib/Basic/Targets/WebAssembly.cpp +++ b/clang/lib/Basic/Targets/WebAssembly.cpp @@ -47,6 +47,7 @@ bool WebAssemblyTargetInfo::hasFeature(StringRef Feature) const { return llvm::StringSwitch(Feature) 
.Case("simd128", SIMDLevel >= SIMD128) .Case("relaxed-simd", SIMDLevel >= RelaxedSIMD) + .Case("half-precision", HasHalfPrecision) .Case("nontrapping-fptoint", HasNontrappingFPToInt) .Case("sign-ext", HasSignExt) .Case("exception-handling", HasExceptionHandling) @@ -156,6 +157,7 @@ bool WebAssemblyTargetInfo::initFeatureMap( Features["reference-types"] = true; Features["sign-ext"] = true; Features["tail-call"] = true; +Features["half-precision"] = true; setSIMDLevel(Features, SIMD128, true); } else if (CPU == "generic") { Features["mutable-globals"] = true; @@ -216,6 +218,15 @@ bool WebAssemblyTargetInfo::handleTargetFeatures( HasBulkMemory = false; continue; } +if (Feature == "+half-precision") { + SIMDLevel = std::max(SIMDLevel, SIMD128); + HasHalfPrecision = true; + continue; +} +if (Feature == "-half-precision") { + HasHalfPrecision = false; + continue; +} if (Feature == "+atomics") { HasAtomics = true; continue; diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h index 5568aa28eaefa7..e4c18879182ed7 100644 --- a/clang/lib/Basic/Targets/WebAssembly.h +++ b/clang/lib/Basic/Targets/WebAssembly.h @@ -64,6 +64,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo { bool HasReferenceTypes = false; bool HasExtendedConst = false; bool HasMultiMemory = false; + bool HasHalfPrecision = false; std::string ABI; diff --git a/clang/test/Driver/wasm-features.c b/clang/test/Driver/wasm-features.c index 5dae5dbc89b905..1f7fb213498265 100644 --- a/clang/test/Driver/wasm-features.c +++ b/clang/test/Driver/wasm-features.c @@ -77,6 +77,12 @@ // RELAXED-SIMD: "-target-feature" "+relaxed-simd" // NO-RELAXED-SIMD: "-target-feature" "-relaxed-simd" +// RUN: %clang --target=wasm32-unknown-unknown -### %s -mhalf-precision 2>&1 | FileCheck %s -check-prefix=HALF-PRECISION +// RUN: %clang --target=wasm32-unknown-unknown -### %s -mno-half-precision 2>&1 | FileCheck %s -check-prefix=NO-HALF-PRECISION + +// HALF-PRECISION: 
"-target-feature" "+half-precision" +// NO-HALF-PRECISION: "-target-feature" "-half-precision" + // RUN: %clang --target=wasm32-unknown-unknown -### %s -mexception-handling 2>&1 | FileCheck %s -check-prefix=EXCEPTION-HANDLING // RUN: %clang --target=wasm32-unknown-unknown -### %s -mno-exception-handling 2>&1 | FileCheck %s -check-prefix=NO-EXCEPTION-HANDLING diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td index d538197450b65b..f00974531209d2 100644 --- a/llvm/lib/Target/WebAssembly/WebAssembly.td +++ b/llvm/lib/Target/WebAssembly/WebAssembly.td @@ -28,6 +28,9 @@ def FeatureSIMD128 : SubtargetFeature<"simd128", "SIMDLevel", "SIMD128", def Feature
[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/106465 Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to__I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t. >From 4df403d1d3e32a591b6994acea8f7daa9df78c7b Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 28 Aug 2024 22:56:09 + Subject: [PATCH] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions. Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to__I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t. --- .../clang/Basic/BuiltinsWebAssembly.def | 9 ++ clang/lib/CodeGen/CGBuiltin.cpp | 12 ++ clang/lib/Headers/wasm_simd128.h | 147 ++ .../intrinsic-header-tests/wasm_simd128.c | 138 +++- .../WebAssembly/WebAssemblyISelLowering.cpp | 9 +- .../WebAssembly/WebAssemblyInstrSIMD.td | 28 ++-- .../CodeGen/WebAssembly/half-precision.ll | 18 +++ 7 files changed, 348 insertions(+), 13 deletions(-) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 034d32c6291b3d..2e80eef2c8b9bc 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -124,6 +124,7 @@ TARGET_BUILTIN(__builtin_wasm_bitmask_i16x8, "UiV8s", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_bitmask_i32x4, "UiV4i", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_bitmask_i64x2, "UiV2LLi", "nc", "simd128") +TARGET_BUILTIN(__builtin_wasm_abs_f16x8, "V8hV8h", "nc", "fp16") TARGET_BUILTIN(__builtin_wasm_abs_f32x4, "V4fV4f", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_abs_f64x2, 
"V2dV2d", "nc", "simd128") @@ -140,6 +141,10 @@ TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16") TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16") TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16") +TARGET_BUILTIN(__builtin_wasm_ceil_f16x8, "V8hV8h", "nc", "fp16") +TARGET_BUILTIN(__builtin_wasm_floor_f16x8, "V8hV8h", "nc", "fp16") +TARGET_BUILTIN(__builtin_wasm_trunc_f16x8, "V8hV8h", "nc", "fp16") +TARGET_BUILTIN(__builtin_wasm_nearest_f16x8, "V8hV8h", "nc", "fp16") TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_trunc_f32x4, "V4fV4f", "nc", "simd128") @@ -151,9 +156,13 @@ TARGET_BUILTIN(__builtin_wasm_nearest_f64x2, "V2dV2d", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_dot_s_i32x4_i16x8, "V4iV8sV8s", "nc", "simd128") +TARGET_BUILTIN(__builtin_wasm_sqrt_f16x8, "V8hV8h", "nc", "fp16") TARGET_BUILTIN(__builtin_wasm_sqrt_f32x4, "V4fV4f", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_sqrt_f64x2, "V2dV2d", "nc", "simd128") +TARGET_BUILTIN(__builtin_wasm_trunc_saturate_s_i16x8_f16x8, "V8sV8h", "nc", "simd128") +TARGET_BUILTIN(__builtin_wasm_trunc_saturate_u_i16x8_f16x8, "V8sV8h", "nc", "simd128") + TARGET_BUILTIN(__builtin_wasm_trunc_saturate_s_i32x4_f32x4, "V4iV4f", "nc", "simd128") TARGET_BUILTIN(__builtin_wasm_trunc_saturate_u_i32x4_f32x4, "V4iV4f", "nc", "simd128") diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 2a733e4d834cfa..bb5367c29b1c3a 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21208,6 +21208,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i32_f64: case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i64_f32: case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i64_f64: + case 
WebAssembly::BI__builtin_wasm_trunc_saturate_s_i16x8_f16x8: case WebAssembly::BI__builtin_wasm_trunc_saturate_s_i32x4_f32x4: { Value *Src = EmitScalarExpr(E->getArg(0)); llvm::Type *ResT = ConvertType(E->getType()); @@ -21219,6 +21220,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i32_f64: case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i64_f32: case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i64_f64: + case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i16x8_f16x8: case WebAssembly::BI__builtin_wasm_trunc_saturate_u_i32x4_f32x4: { Value *Src = EmitScalarExpr(E->getArg(0)); llvm::Type *ResT = ConvertType(E->getType()); @@ -21266,6 +21268,10 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::was
[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)
@@ -165,8 +165,9 @@ def F16x8 : Vec { let prefix = "f16x8"; } -// TODO: Include F16x8 here when half precision is better supported. -defvar AllVecs = [I8x16, I16x8, I32x4, I64x2, F32x4, F64x2]; +// TODO: Remove StdVecs when the F16x8 works every where StdVecs is used. brendandahl wrote: It's not obvious from this patch, but now `AllVecs` is only used in one place for bitcast (which means it now works for f16x8 vectors too). Alternatively, I can leave `AllVecs` alone and just concat F16x8 down where bitcast is supported. https://github.com/llvm/llvm-project/pull/106465
[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)
brendandahl wrote:
> Would it make sense to put these declarations behind `#ifdef __wasm_fp16__` so that they aren't declared if fp16 support isn't enabled?
I could do that, if that's preferred. I followed what the relaxed instructions did and used the target attribute `__target__("fp16")`. https://github.com/llvm/llvm-project/pull/106465
[clang] [llvm] [WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (PR #106465)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/106465
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/90906 Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon. >From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 1 May 2024 21:53:39 + Subject: [PATCH] [WebAssembly] Implement prototype f32.load_f16 instruction. Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon. 
--- clang/include/clang/Basic/BuiltinsWebAssembly.def| 3 +++ clang/lib/CodeGen/CGBuiltin.cpp | 5 + clang/test/CodeGen/builtins-wasm.c | 9 +++-- llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 .../MCTargetDesc/WebAssemblyMCTargetDesc.h | 1 + .../Target/WebAssembly/WebAssemblyISelLowering.cpp | 8 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td | 5 + llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 7 +++ llvm/test/CodeGen/WebAssembly/half-precision.ll | 12 llvm/test/MC/WebAssembly/simd-encodings.s| 5 - 10 files changed, 64 insertions(+), 3 deletions(-) create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 7e950914ad946d..8b0a1d4579d84c 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc", TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd") TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd") +// Half-Precision (fp16) +TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", "half-precision") + // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, // in which case the argument spec (second argument) is unused. 
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index a370734e00d3e1..e9d465bd2a6b01 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32); return Builder.CreateCall(Callee, {LHS, RHS, Acc}); } + case WebAssembly::BI__builtin_wasm_loadf16_f32: { +Value *Addr = EmitScalarExpr(E->getArg(0)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); +return Builder.CreateCall(Callee, {Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index 9a323da9a8e846..a845d5429039d4 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -1,5 +1,5 @@ -// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32 -// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64 +// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling 
-target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY3
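Editorial sketch, not part of the patch: the commit message describes `f32.load_f16` as loading an f16 value from memory and putting it in an f32. A portable host-side model of that widening is shown below. It is not the LLVM lowering; `halfBitsToFloat` and `loadF16AsF32` are hypothetical names, and the sketch assumes the value in memory is IEEE binary16 in the host's byte order (WebAssembly memory is little-endian).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen IEEE binary16 bits to an f32. Every binary16
// value is exactly representable as f32, so no rounding is involved.
float halfBitsToFloat(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits;

  if (exp == 0x1Fu) {
    // Infinity or NaN: all-ones exponent, payload moved to the top bits.
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    // Normal number: rebias the exponent from 15 to 127 (add 112).
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant == 0) {
    // Signed zero.
    bits = sign;
  } else {
    // Subnormal half: normalize so the implicit leading 1 reaches bit 10.
    std::uint32_t shift = 0;
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      ++shift;
    }
    mant &= 0x3FFu;
    bits = sign | ((113u - shift) << 23) | (mant << 13);
  }

  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}

// Model of f32.load_f16: read two bytes from memory, widen to f32.
float loadF16AsF32(const void *addr) {
  std::uint16_t h;
  std::memcpy(&h, addr, sizeof h);
  return halfBitsToFloat(h);
}
```

For example, the bit pattern `0x3C00` widens to `1.0f` and `0x7BFF`, the largest finite half, widens to `65504.0f`.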
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/90906 >From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 1 May 2024 21:53:39 + Subject: [PATCH 1/2] [WebAssembly] Implement prototype f32.load_f16 instruction. Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon. --- clang/include/clang/Basic/BuiltinsWebAssembly.def| 3 +++ clang/lib/CodeGen/CGBuiltin.cpp | 5 + clang/test/CodeGen/builtins-wasm.c | 9 +++-- llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 .../MCTargetDesc/WebAssemblyMCTargetDesc.h | 1 + .../Target/WebAssembly/WebAssemblyISelLowering.cpp | 8 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td | 5 + llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 7 +++ llvm/test/CodeGen/WebAssembly/half-precision.ll | 12 llvm/test/MC/WebAssembly/simd-encodings.s| 5 - 10 files changed, 64 insertions(+), 3 deletions(-) create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 7e950914ad946d..8b0a1d4579d84c 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc", TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd") TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd") +// Half-Precision (fp16) +TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", 
"half-precision") + // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, // in which case the argument spec (second argument) is unused. diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index a370734e00d3e1..e9d465bd2a6b01 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32); return Builder.CreateCall(Callee, {LHS, RHS, Acc}); } + case WebAssembly::BI__builtin_wasm_loadf16_f32: { +Value *Addr = EmitScalarExpr(E->getArg(0)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); +return Builder.CreateCall(Callee, {Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index 9a323da9a8e846..a845d5429039d4 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -1,5 +1,5 @@ -// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32 -// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64 +// RUN: %clang_cc1 -triple 
wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32 +// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBAS
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
brendandahl wrote: /cc @tlively @dschuff (I guess I can't assign reviewers since I don't have commit access.) https://github.com/llvm/llvm-project/pull/90906
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
@@ -321,6 +321,18 @@ def int_wasm_relaxed_dot_bf16x8_add_f32: [llvm_v8i16_ty, llvm_v8i16_ty, llvm_v4f32_ty], [IntrNoMem, IntrSpeculatable]>; +//===--===// +// Half-precision intrinsics (experimental) +//===--===// + +// TODO: Replace these intrinsic with normal ISel patterns once the XXX +// instructions are merged to the proposal. +def int_wasm_loadf16_f32: + Intrinsic<[llvm_float_ty], +[llvm_ptr_ty], +[IntrReadMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; brendandahl wrote: It looks like we have empty/missing names for nearly all of the intrinsics except for `int_wasm_ref_is_null_extern` and `int_wasm_ref_is_null_func`. Do you know what the name is used for? Maybe it's for when the name can't be automatically translated to an llvm name? https://github.com/llvm/llvm-project/pull/90906
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
@@ -38,6 +38,13 @@ multiclass RELAXED_I; } +multiclass HALF_PRECISION_I pattern_r, string asmstr_r = "", +string asmstr_s = "", bits<32> simdop = -1> { + defm "" : ABSTRACT_SIMD_I; +} + brendandahl wrote: This will be for my next PRs. I'll remove. https://github.com/llvm/llvm-project/pull/90906
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
brendandahl wrote:
> Overall this looks good, and I think it makes sense to model this as short* for now. I think it will be interesting to see if that ends up causing issues. Out of curiosity does this work if you try `_fp16`?
I was trying _Float16 and that wasn't working since it requires the target to support it. `__fp16` does work though. I'll change it. https://github.com/llvm/llvm-project/pull/90906
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/90906 >From 14313fa9ef33b4cbc8cf18f280ee885b38015ca4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 1 May 2024 21:53:39 + Subject: [PATCH 1/3] [WebAssembly] Implement prototype f32.load_f16 instruction. Adds a builtin and intrinsic for the f32.load_f16 instruction. The instruction loads an f16 value from memory and puts it in an f32. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is incorrect and will be changed to 0xFC30 soon. --- clang/include/clang/Basic/BuiltinsWebAssembly.def| 3 +++ clang/lib/CodeGen/CGBuiltin.cpp | 5 + clang/test/CodeGen/builtins-wasm.c | 9 +++-- llvm/include/llvm/IR/IntrinsicsWebAssembly.td| 12 .../MCTargetDesc/WebAssemblyMCTargetDesc.h | 1 + .../Target/WebAssembly/WebAssemblyISelLowering.cpp | 8 .../lib/Target/WebAssembly/WebAssemblyInstrMemory.td | 5 + llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 7 +++ llvm/test/CodeGen/WebAssembly/half-precision.ll | 12 llvm/test/MC/WebAssembly/simd-encodings.s| 5 - 10 files changed, 64 insertions(+), 3 deletions(-) create mode 100644 llvm/test/CodeGen/WebAssembly/half-precision.ll diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 7e950914ad946d..8b0a1d4579d84c 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -190,6 +190,9 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_s_i16x8, "V8sV16ScV16Sc", TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16ScV4i", "nc", "relaxed-simd") TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd") +// Half-Precision (fp16) +TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fi*", "nU", 
"half-precision") + // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, // in which case the argument spec (second argument) is unused. diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index a370734e00d3e1..e9d465bd2a6b01 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21257,6 +21257,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::wasm_relaxed_dot_bf16x8_add_f32); return Builder.CreateCall(Callee, {LHS, RHS, Acc}); } + case WebAssembly::BI__builtin_wasm_loadf16_f32: { +Value *Addr = EmitScalarExpr(E->getArg(0)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); +return Builder.CreateCall(Callee, {Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index 9a323da9a8e846..a845d5429039d4 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -1,5 +1,5 @@ -// RUN: %clang_cc1 -triple wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32 -// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY64 +// RUN: %clang_cc1 -triple 
wasm32-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBASSEMBLY,WEBASSEMBLY32 +// RUN: %clang_cc1 -triple wasm64-unknown-unknown -target-feature +reference-types -target-feature +simd128 -target-feature +relaxed-simd -target-feature +nontrapping-fptoint -target-feature +exception-handling -target-feature +bulk-memory -target-feature +atomics -target-feature +half-precision -flax-vector-conversions=none -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes WEBAS
[clang] [llvm] [WebAssembly] Implement prototype f32.load_f16 instruction. (PR #90906)
@@ -666,3 +666,29 @@ define {i32,i32,i32,i32} @aggregate_return() {
 define {i64,i32,i16,i8} @aggregate_return_without_merge() {
   ret {i64,i32,i16,i8} zeroinitializer
 }
+
+;===

brendandahl wrote:

I didn't add many tests here, since it seems pretty well covered for other load patterns.

https://github.com/llvm/llvm-project/pull/90906
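For readers who want a concrete picture of what the f32.load_f16 instruction described in this PR computes, here is a minimal portable C sketch of "load an f16 from memory and put it in an f32". It is not the LLVM lowering and not the builtin itself: the helper names `f16_bits_to_f32` and `load_f16_as_f32` are hypothetical, and the sketch assumes IEEE 754 binary16 storage and reads the two bytes in host byte order (WebAssembly memory is little-endian).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen an IEEE 754 binary16 bit pattern to float.
 * Every binary16 value is exactly representable as a float, so no rounding
 * is involved. */
static float f16_bits_to_f32(uint16_t h) {
  uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t man = h & 0x3FFu;
  uint32_t bits;

  if (exp == 0x1Fu) {
    /* Infinity or NaN: keep the payload, widen the exponent to all ones. */
    bits = sign | 0x7F800000u | (man << 13);
  } else if (exp != 0) {
    /* Normal number: rebias the exponent from 15 to 127 (+112). */
    bits = sign | ((exp + 112u) << 23) | (man << 13);
  } else if (man != 0) {
    /* Subnormal half: normalize it, since it is a normal float. */
    uint32_t e = 113u;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      e--;
    }
    bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
  } else {
    /* Signed zero. */
    bits = sign;
  }

  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* Hypothetical stand-in for the builtin: read 16 bits at addr, widen to f32. */
static float load_f16_as_f32(const void *addr) {
  assert(addr != NULL);
  uint16_t h;
  memcpy(&h, addr, sizeof h); /* memcpy avoids alignment and aliasing issues */
  return f16_bits_to_f32(h);
}
```

With the actual patch, the equivalent call in C would be `__builtin_wasm_loadf16_f32(addr)` compiled with `-target-feature +half-precision`, as the test in builtins-wasm.c shows.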
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/91545

Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 in memory.

Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon.

>From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Wed, 8 May 2024 23:10:07 +
Subject: [PATCH] [WebAssembly] Implement prototype f32.store_f16 instruction.

Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 in memory.

Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon.
--- .../clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 6 + clang/test/CodeGen/builtins-wasm.c| 6 + llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 5 .../MCTargetDesc/WebAssemblyMCTargetDesc.h| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 8 ++ .../WebAssembly/WebAssemblyInstrMemory.td | 4 +++ .../CodeGen/WebAssembly/half-precision.ll | 9 +++ llvm/test/CodeGen/WebAssembly/offset.ll | 27 +++ llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 10 files changed, 70 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index cf54f8f4422f8..41fadd10e9432 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") +TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index e8a6bd050e17e..abb644d8eb506 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); return Builder.CreateCall(Callee, {Addr}); } + case WebAssembly::BI__builtin_wasm_storef16_f32: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Value *Addr = EmitScalarExpr(E->getArg(1)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); +return Builder.CreateCall(Callee, {Val, Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = 
EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index ab1c6cd494ae5..bcb15969de1c5 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) { // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}}) } +void store_f16_f32(float val, __fp16 *addr) { + return __builtin_wasm_storef16_f32(val, addr); + // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}}) + // WEBASSEMBLY-NEXT: ret +} + __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index f8142a8ca9e93..572d334ac9552 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -332,6 +332,11 @@ def int_wasm_loadf16_f32: [llvm_ptr_ty], [IntrReadMem, IntrArgMemOnly], "", [SDNPMemOperand]>; +def int_wasm_storef16_f32: + Intrinsic<[], +[llvm_float_ty, llvm_ptr_ty], +[IntrWriteMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h index d3b496ae59179..d4e9fb057c44d 100644 --- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h +++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h @@ -207,6 +207,7 @@ inline unsigned GetDefaultP2
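As a rough illustration of what "store an f32 value as an f16" involves, the following portable C sketch narrows a float to a binary16 bit pattern and writes it to memory. It is a sketch under stated assumptions, not the patch's implementation: the helper names `f32_to_f16_bits` and `store_f32_as_f16` are hypothetical, and the rounding mode (round to nearest, ties to even) is an assumption that should be checked against the half-precision proposal linked in the PR description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow a float to an IEEE 754 binary16 bit pattern.
 * Assumes round-to-nearest, ties-to-even. */
static uint16_t f32_to_f16_bits(float f) {
  uint32_t x;
  memcpy(&x, &f, sizeof x);
  uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
  uint32_t exp = (x >> 23) & 0xFFu;
  uint32_t man = x & 0x7FFFFFu;

  if (exp == 0xFFu) {
    /* Infinity stays infinity; NaN stays NaN (forced quiet). */
    return (uint16_t)(sign | 0x7C00u | (man ? (0x200u | (man >> 13)) : 0u));
  }

  int32_t e = (int32_t)exp - 127 + 15; /* exponent rebased to the f16 bias */
  if (e >= 0x1F) {
    return (uint16_t)(sign | 0x7C00u); /* too large: overflow to infinity */
  }
  if (e <= 0) {
    if (e < -10) {
      return sign; /* below half of the smallest subnormal: signed zero */
    }
    /* Result is an f16 subnormal: shift the 24-bit significand into place. */
    man |= 0x800000u;
    uint32_t shift = (uint32_t)(14 - e); /* 14..24 */
    uint32_t half = man >> shift;
    uint32_t rem = man & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1);
    if (rem > halfway || (rem == halfway && (half & 1u))) {
      half++; /* may carry into the smallest normal, which is correct */
    }
    return (uint16_t)(sign | half);
  }

  /* Normal range: drop 13 significand bits with ties-to-even rounding. */
  uint32_t half = ((uint32_t)e << 10) | (man >> 13);
  uint32_t rem = man & 0x1FFFu;
  if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) {
    half++; /* a carry out of the significand bumps the exponent */
  }
  return (uint16_t)(sign | half);
}

/* Hypothetical stand-in for the builtin: narrow val and write 16 bits. */
static void store_f32_as_f16(float val, void *addr) {
  assert(addr != NULL);
  uint16_t h = f32_to_f16_bits(val);
  memcpy(addr, &h, sizeof h);
}
```

The argument order (value first, address second) mirrors `__builtin_wasm_storef16_f32(val, addr)` from the test added in this patch.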
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
@@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision")

brendandahl wrote:

What does `pure` mean in this context? The docs in clang/Basic/Builtins.def don't have any info on this.

https://github.com/llvm/llvm-project/pull/91545
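On the `pure` question: in the builtin attribute strings, `U` marks a builtin as pure, `c` as const, and `n` as nothrow. The small C sketch below illustrates the distinction using the GCC/Clang function attributes of the same names. It assumes the builtin flags follow the same semantics as those attributes, and the function names are hypothetical. A pure function may read memory but has no side effects, which fits a load; a store writes memory, so it is not pure. A later patch in this thread lists the store builtin with `"n"` only.

```c
#include <assert.h>
#include <stddef.h>

/* const: the result depends only on the argument values; no memory is read.
 * Repeated calls with the same argument can always be merged. */
__attribute__((const)) static int twice(int x) { return 2 * x; }

/* pure: may read memory (here through the pointer) but has no side effects.
 * Calls can be merged only if no store happens in between. This is the
 * shape of a load-style builtin. */
__attribute__((pure)) static int read_slot(const int *slot) { return *slot; }

/* Neither pure nor const: it writes memory, so the call must never be
 * dropped or merged. This is the shape of a store-style builtin. */
static void write_slot(int *slot, int value) {
  assert(slot != NULL);
  *slot = value;
}
```

A call to `write_slot` returns nothing, so if it were marked pure the compiler would be free to delete it as dead code.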
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/91545 >From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 8 May 2024 23:10:07 + Subject: [PATCH 1/2] [WebAssembly] Implement prototype f32.store_f16 instruction. Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 memory. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon. --- .../clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 6 + clang/test/CodeGen/builtins-wasm.c| 6 + llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 5 .../MCTargetDesc/WebAssemblyMCTargetDesc.h| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 8 ++ .../WebAssembly/WebAssemblyInstrMemory.td | 4 +++ .../CodeGen/WebAssembly/half-precision.ll | 9 +++ llvm/test/CodeGen/WebAssembly/offset.ll | 27 +++ llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 10 files changed, 70 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index cf54f8f4422f..41fadd10e943 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") +TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index e8a6bd050e17..abb644d8eb50 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ 
b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); return Builder.CreateCall(Callee, {Addr}); } + case WebAssembly::BI__builtin_wasm_storef16_f32: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Value *Addr = EmitScalarExpr(E->getArg(1)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); +return Builder.CreateCall(Callee, {Val, Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index ab1c6cd494ae..bcb15969de1c 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) { // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}}) } +void store_f16_f32(float val, __fp16 *addr) { + return __builtin_wasm_storef16_f32(val, addr); + // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}}) + // WEBASSEMBLY-NEXT: ret +} + __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index f8142a8ca9e9..572d334ac955 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -332,6 +332,11 @@ def int_wasm_loadf16_f32: [llvm_ptr_ty], [IntrReadMem, IntrArgMemOnly], "", [SDNPMemOperand]>; +def int_wasm_storef16_f32: + Intrinsic<[], +[llvm_float_ty, llvm_ptr_ty], +[IntrWriteMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h 
b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h index d3b496ae5917..d4e9fb057c44 100644 --- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h +++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h @@ -207,6 +207,7 @@ inline unsigned GetDefaultP2AlignAny(unsigned Opc) { WASM_LOAD_STORE(LOAD_LANE_I16x8) WASM_LOAD_STORE(STORE_LANE_I16x8) WASM_LOAD_STORE(LOAD_F16_F32) + WASM_LOAD_STORE(STORE_F16_F32) return 1; WASM_LOAD_STORE(LOAD_I32) WASM_LOAD_STORE(LOAD_F32) diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp index ed52fe53bc60..527bb
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/91545 >From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 8 May 2024 23:10:07 + Subject: [PATCH 1/3] [WebAssembly] Implement prototype f32.store_f16 instruction. Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 memory. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon. --- .../clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 6 + clang/test/CodeGen/builtins-wasm.c| 6 + llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 5 .../MCTargetDesc/WebAssemblyMCTargetDesc.h| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 8 ++ .../WebAssembly/WebAssemblyInstrMemory.td | 4 +++ .../CodeGen/WebAssembly/half-precision.ll | 9 +++ llvm/test/CodeGen/WebAssembly/offset.ll | 27 +++ llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 10 files changed, 70 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index cf54f8f4422f..41fadd10e943 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") +TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index e8a6bd050e17..abb644d8eb50 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ 
b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); return Builder.CreateCall(Callee, {Addr}); } + case WebAssembly::BI__builtin_wasm_storef16_f32: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Value *Addr = EmitScalarExpr(E->getArg(1)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); +return Builder.CreateCall(Callee, {Val, Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index ab1c6cd494ae..bcb15969de1c 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) { // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}}) } +void store_f16_f32(float val, __fp16 *addr) { + return __builtin_wasm_storef16_f32(val, addr); + // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}}) + // WEBASSEMBLY-NEXT: ret +} + __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index f8142a8ca9e9..572d334ac955 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -332,6 +332,11 @@ def int_wasm_loadf16_f32: [llvm_ptr_ty], [IntrReadMem, IntrArgMemOnly], "", [SDNPMemOperand]>; +def int_wasm_storef16_f32: + Intrinsic<[], +[llvm_float_ty, llvm_ptr_ty], +[IntrWriteMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h 
b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h index d3b496ae5917..d4e9fb057c44 100644 --- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h +++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h @@ -207,6 +207,7 @@ inline unsigned GetDefaultP2AlignAny(unsigned Opc) { WASM_LOAD_STORE(LOAD_LANE_I16x8) WASM_LOAD_STORE(STORE_LANE_I16x8) WASM_LOAD_STORE(LOAD_F16_F32) + WASM_LOAD_STORE(STORE_F16_F32) return 1; WASM_LOAD_STORE(LOAD_I32) WASM_LOAD_STORE(LOAD_F32) diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp index ed52fe53bc60..527bb
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/91545 >From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 8 May 2024 23:10:07 + Subject: [PATCH 1/4] [WebAssembly] Implement prototype f32.store_f16 instruction. Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 memory. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon. --- .../clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 6 + clang/test/CodeGen/builtins-wasm.c| 6 + llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 5 .../MCTargetDesc/WebAssemblyMCTargetDesc.h| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 8 ++ .../WebAssembly/WebAssemblyInstrMemory.td | 4 +++ .../CodeGen/WebAssembly/half-precision.ll | 9 +++ llvm/test/CodeGen/WebAssembly/offset.ll | 27 +++ llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 10 files changed, 70 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index cf54f8f4422f8..41fadd10e9432 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") +TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index e8a6bd050e17e..abb644d8eb506 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp 
+++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); return Builder.CreateCall(Callee, {Addr}); } + case WebAssembly::BI__builtin_wasm_storef16_f32: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Value *Addr = EmitScalarExpr(E->getArg(1)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); +return Builder.CreateCall(Callee, {Val, Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index ab1c6cd494ae5..bcb15969de1c5 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) { // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}}) } +void store_f16_f32(float val, __fp16 *addr) { + return __builtin_wasm_storef16_f32(val, addr); + // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}}) + // WEBASSEMBLY-NEXT: ret +} + __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index f8142a8ca9e93..572d334ac9552 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -332,6 +332,11 @@ def int_wasm_loadf16_f32: [llvm_ptr_ty], [IntrReadMem, IntrArgMemOnly], "", [SDNPMemOperand]>; +def int_wasm_storef16_f32: + Intrinsic<[], +[llvm_float_ty, llvm_ptr_ty], +[IntrWriteMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h 
b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h index d3b496ae59179..d4e9fb057c44d 100644 --- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h +++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h @@ -207,6 +207,7 @@ inline unsigned GetDefaultP2AlignAny(unsigned Opc) { WASM_LOAD_STORE(LOAD_LANE_I16x8) WASM_LOAD_STORE(STORE_LANE_I16x8) WASM_LOAD_STORE(LOAD_F16_F32) + WASM_LOAD_STORE(STORE_F16_F32) return 1; WASM_LOAD_STORE(LOAD_I32) WASM_LOAD_STORE(LOAD_F32) diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp index ed52fe53b
[clang] [llvm] [WebAssembly] Implement prototype f32.store_f16 instruction. (PR #91545)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/91545 >From adcb77e15d09f466f217d754f6f80aeb729aadc4 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Wed, 8 May 2024 23:10:07 + Subject: [PATCH 1/5] [WebAssembly] Implement prototype f32.store_f16 instruction. Adds a builtin and intrinsic for the f32.store_f16 instruction. The instruction stores an f32 value as an f16 memory. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is incorrect and will be changed to 0xFC31 soon. --- .../clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/CodeGen/CGBuiltin.cpp | 6 + clang/test/CodeGen/builtins-wasm.c| 6 + llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 5 .../MCTargetDesc/WebAssemblyMCTargetDesc.h| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 8 ++ .../WebAssembly/WebAssemblyInstrMemory.td | 4 +++ .../CodeGen/WebAssembly/half-precision.ll | 9 +++ llvm/test/CodeGen/WebAssembly/offset.ll | 27 +++ llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 10 files changed, 70 insertions(+) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index cf54f8f4422f8..41fadd10e9432 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -192,6 +192,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") +TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "nU", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index e8a6bd050e17e..abb644d8eb506 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp 
+++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21308,6 +21308,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_loadf16_f32); return Builder.CreateCall(Callee, {Addr}); } + case WebAssembly::BI__builtin_wasm_storef16_f32: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Value *Addr = EmitScalarExpr(E->getArg(1)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); +return Builder.CreateCall(Callee, {Val, Addr}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index ab1c6cd494ae5..bcb15969de1c5 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -807,6 +807,12 @@ float load_f16_f32(__fp16 *addr) { // WEBASSEMBLY: call float @llvm.wasm.loadf16.f32(ptr %{{.*}}) } +void store_f16_f32(float val, __fp16 *addr) { + return __builtin_wasm_storef16_f32(val, addr); + // WEBASSEMBLY: tail call void @llvm.wasm.storef16.f32(float %val, ptr %{{.*}}) + // WEBASSEMBLY-NEXT: ret +} + __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td index f8142a8ca9e93..572d334ac9552 100644 --- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td +++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td @@ -332,6 +332,11 @@ def int_wasm_loadf16_f32: [llvm_ptr_ty], [IntrReadMem, IntrArgMemOnly], "", [SDNPMemOperand]>; +def int_wasm_storef16_f32: + Intrinsic<[], +[llvm_float_ty, llvm_ptr_ty], +[IntrWriteMem, IntrArgMemOnly], + "", [SDNPMemOperand]>; //===--===// diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h 
b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h index d3b496ae59179..d4e9fb057c44d 100644 --- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h +++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h @@ -207,6 +207,7 @@ inline unsigned GetDefaultP2AlignAny(unsigned Opc) { WASM_LOAD_STORE(LOAD_LANE_I16x8) WASM_LOAD_STORE(STORE_LANE_I16x8) WASM_LOAD_STORE(LOAD_F16_F32) + WASM_LOAD_STORE(STORE_F16_F32) return 1; WASM_LOAD_STORE(LOAD_I32) WASM_LOAD_STORE(LOAD_F32) diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp index ed52fe53b
[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/93228 Adds a builtin and intrinsic for the f16x8.splat instruction. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect and will be changed to 0x120 soon. >From 002e33294cae26796ca79a66dbd275f3e26807d2 Mon Sep 17 00:00:00 2001 From: Brendan Dahl Date: Tue, 21 May 2024 21:15:14 + Subject: [PATCH] [WebAssembly] Implement prototype f16x8.splat instruction. Adds a builtin and intrinsic for the f16x8.splat instruction. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect and will be changed to 0x120 soon. --- clang/include/clang/Basic/BuiltinsWebAssembly.def | 1 + clang/lib/Basic/Targets/WebAssembly.h | 3 +++ clang/lib/CodeGen/CGBuiltin.cpp | 5 + clang/test/CodeGen/builtins-wasm.c| 6 ++ llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 4 .../Utils/WebAssemblyTypeUtilities.cpp| 1 + .../WebAssembly/WebAssemblyISelLowering.cpp | 3 +++ .../Target/WebAssembly/WebAssemblyInstrSIMD.td| 15 +++ .../Target/WebAssembly/WebAssemblyRegisterInfo.td | 5 +++-- llvm/test/CodeGen/WebAssembly/half-precision.ll | 12 ++-- llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++ 11 files changed, 54 insertions(+), 4 deletions(-) diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def index 8645cff1e8679..dbe79aa39190d 100644 --- a/clang/include/clang/Basic/BuiltinsWebAssembly.def +++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def @@ -193,6 +193,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f" // Half-Precision (fp16) TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision") 
TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision") +TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision") // Reference Types builtins // Some builtins are custom type-checked - see 't' as part of the third argument, diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h index 4db97867df607..63f4e72a9c2de 100644 --- a/clang/lib/Basic/Targets/WebAssembly.h +++ b/clang/lib/Basic/Targets/WebAssembly.h @@ -90,6 +90,9 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo { StringRef getABI() const override; bool setABI(const std::string &Name) override; + bool useFP16ConversionIntrinsics() const override { +return false; + } protected: void getTargetDefines(const LangOptions &Opts, diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index ba94bf89e4751..91083c1cfae96 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -21230,6 +21230,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID, Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32); return Builder.CreateCall(Callee, {Val, Addr}); } + case WebAssembly::BI__builtin_wasm_splat_f16x8: { +Value *Val = EmitScalarExpr(E->getArg(0)); +Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8); +return Builder.CreateCall(Callee, {Val}); + } case WebAssembly::BI__builtin_wasm_table_get: { assert(E->getArg(0)->getType()->isArrayType()); Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this); diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c index bcb15969de1c5..76c6305d422a2 100644 --- a/clang/test/CodeGen/builtins-wasm.c +++ b/clang/test/CodeGen/builtins-wasm.c @@ -11,6 +11,7 @@ typedef unsigned char u8x16 __attribute((vector_size(16))); typedef unsigned short u16x8 __attribute((vector_size(16))); typedef unsigned int u32x4 __attribute((vector_size(16))); typedef 
unsigned long long u64x2 __attribute((vector_size(16))); +typedef __fp16 f16x8 __attribute((vector_size(16))); typedef float f32x4 __attribute((vector_size(16))); typedef double f64x2 __attribute((vector_size(16))); @@ -813,6 +814,11 @@ void store_f16_f32(float val, __fp16 *addr) { // WEBASSEMBLY-NEXT: ret } +f16x8 splat_f16x8(float a) { + // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.splat.f16x8(float %a) + // WEBASSEMBLY-NEXT: ret <8 x half> %0 + return __builtin_wasm_splat_f16x8(a); +} __externref_t externref_null() { return __builtin_wasm_ref_null_extern(); // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern() diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td i
[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)
brendandahl wrote:

cc @aheejin @dschuff

As mentioned in the meeting, it looks like it will be a lot more work to get half values working with normal patterns, so for now I'll stick to just built-ins and intrinsics.

https://github.com/llvm/llvm-project/pull/93228
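To make the lane layout behind f16x8.splat concrete, here is a small C model of the replication step. It is only a sketch: the type `f16x8_bits` and both functions are hypothetical, and it models the lanes as raw 16-bit patterns. The builtin in this PR takes a `float` and returns a vector of eight halves, so the f32-to-f16 narrowing is assumed to have happened before this step.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector holding eight f16 lanes, stored as
 * raw binary16 bit patterns. */
typedef struct {
  uint16_t lane[8];
} f16x8_bits;

/* Splat: copy one already-narrowed f16 bit pattern into all eight lanes. */
static f16x8_bits splat_f16x8_bits(uint16_t half_bits) {
  f16x8_bits v;
  for (int i = 0; i < 8; i++) {
    v.lane[i] = half_bits;
  }
  return v;
}

/* Read one lane back, checking the index. */
static uint16_t extract_lane_f16x8_bits(const f16x8_bits *v, int index) {
  assert(v != 0 && index >= 0 && index < 8);
  return v->lane[index];
}
```

With the patch applied, the corresponding C call is `__builtin_wasm_splat_f16x8(a)` with a `float` argument, as in the test added to builtins-wasm.c.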
[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/93228

>From 28cc678038feefffceba8cbe24349e1885b24c75 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 21 May 2024 21:15:14 +
Subject: [PATCH] [WebAssembly] Implement prototype f16x8.splat instruction.

Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect
and will be changed to 0x120 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def | 1 +
 clang/lib/Basic/Targets/WebAssembly.h | 1 +
 clang/lib/CodeGen/CGBuiltin.cpp | 5 +
 clang/test/CodeGen/builtins-wasm.c | 6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 4
 .../Utils/WebAssemblyTypeUtilities.cpp | 1 +
 .../WebAssembly/WebAssemblyISelLowering.cpp | 3 +++
 .../Target/WebAssembly/WebAssemblyInstrSIMD.td | 15 +++
 .../Target/WebAssembly/WebAssemblyRegisterInfo.td | 5 +++--
 llvm/test/CodeGen/WebAssembly/half-precision.ll | 12 ++--
 llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++
 11 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 8645cff1e8679..dbe79aa39190d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -193,6 +193,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 // Half-Precision (fp16)
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/Basic/Targets/WebAssembly.h b/clang/lib/Basic/Targets/WebAssembly.h
index 4db97867df607..46416d516b42f 100644
--- a/clang/lib/Basic/Targets/WebAssembly.h
+++ b/clang/lib/Basic/Targets/WebAssembly.h
@@ -90,6 +90,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override { return false; }
 
 protected:
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index ba94bf89e4751..91083c1cfae96 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21230,6 +21230,11 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_storef16_f32);
     return Builder.CreateCall(Callee, {Val, Addr});
   }
+  case WebAssembly::BI__builtin_wasm_splat_f16x8: {
+    Value *Val = EmitScalarExpr(E->getArg(0));
+    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
+    return Builder.CreateCall(Callee, {Val});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
     assert(E->getArg(0)->getType()->isArrayType());
     Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index bcb15969de1c5..76c6305d422a2 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -11,6 +11,7 @@ typedef unsigned char u8x16 __attribute((vector_size(16)));
 typedef unsigned short u16x8 __attribute((vector_size(16)));
 typedef unsigned int u32x4 __attribute((vector_size(16)));
 typedef unsigned long long u64x2 __attribute((vector_size(16)));
+typedef __fp16 f16x8 __attribute((vector_size(16)));
 typedef float f32x4 __attribute((vector_size(16)));
 typedef double f64x2 __attribute((vector_size(16)));
 
@@ -813,6 +814,11 @@ void store_f16_f32(float val, __fp16 *addr) {
   // WEBASSEMBLY-NEXT: ret
 }
 
+f16x8 splat_f16x8(float a) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.splat.f16x8(float %a)
+  // WEBASSEMBLY-NEXT: ret <8 x half> %0
+  return __builtin_wasm_splat_f16x8(a);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index 572d334ac9552..c950b33182689 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -337,6 +337,10 @@ def int_wasm_storef16_f32:
             [llvm_float_ty, llvm_ptr_ty],
             [IntrWriteMem, IntrArgMemOnly],
             "", [SDNPMemOperand]>;
+def int_wasm_
[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)
@@ -90,6 +90,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyTargetInfo : public TargetInfo {
   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;
+  bool useFP16ConversionIntrinsics() const override { return false; }

brendandahl wrote:

Yeah, this is what causes clang to start emitting `half` types. I could enable it conditionally with `return !HasHalfPrecision;` instead. That said, in a quick test with scalar `__fp16` in C, the `half` types seem to work correctly regardless of this setting.

https://github.com/llvm/llvm-project/pull/93228
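The conditional variant floated in this comment can be pictured with a minimal, self-contained sketch. Everything in it is a stand-in: `TargetInfoSketch` and `WasmTargetInfoSketch` are hypothetical mock classes rather than Clang's real `TargetInfo` hierarchy, and the base-class default of `true` is an assumption, not something the patch shows.

```cpp
// Hypothetical stand-ins for clang::TargetInfo and WebAssemblyTargetInfo.
// Only the shape of the override is taken from the patch under review.
struct TargetInfoSketch {
  virtual ~TargetInfoSketch() = default;
  // Assumed default: route __fp16 through conversion intrinsics.
  virtual bool useFP16ConversionIntrinsics() const { return true; }
};

struct WasmTargetInfoSketch : TargetInfoSketch {
  // Would be set when the half-precision target feature is enabled.
  bool HasHalfPrecision = false;

  // Conditional form from the comment: only drop the conversion
  // intrinsics (and so emit `half` directly) when the feature is on.
  bool useFP16ConversionIntrinsics() const override {
    return !HasHalfPrecision;
  }
};
```

With this shape, targets built without the feature would keep the assumed default behaviour, while enabling the feature flips the hook to `false`, which matches the unconditional `return false;` in the patch.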
[clang] [llvm] [WebAssembly] Implement prototype f16x8.splat instruction. (PR #93228)
brendandahl wrote:

> LGTM % `!HasHalfPrecision` thing
>
> By the way I guess you can try getting commit access soon? I think it is
> still "Send an email to Chris" though...

Done, can I get a squash and merge? I'll look into getting commit access.

https://github.com/llvm/llvm-project/pull/93228
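For readers following the `TARGET_BUILTIN` lines in this thread's patches, the second argument (for example `"V8hf"` or `"vfh*"`) is a compact signature string. The sketch below decodes a small subset of that encoding. It is not Clang's parser: `decodeBuiltinTypes` is a hypothetical helper, it only knows the handful of letters that appear in these patches, and anything else is reported as `?` rather than guessed.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Decodes a hand-picked subset of the builtin type-string encoding.
// Element 0 of the result is the return type; the rest are parameters.
std::vector<std::string> decodeBuiltinTypes(const std::string &enc) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < enc.size()) {
    bool constant = false;
    std::string sign, lanes;

    // Prefixes: I = must be an integer constant expression,
    // U/S = unsigned/signed, V<n> = vector of n elements.
    for (bool more = true; more && i < enc.size();) {
      switch (enc[i]) {
      case 'I': constant = true; ++i; break;
      case 'U': sign = "unsigned "; ++i; break;
      case 'S': sign = "signed "; ++i; break;
      case 'V':
        ++i;
        while (i < enc.size() &&
               std::isdigit(static_cast<unsigned char>(enc[i])))
          lanes += enc[i++];
        break;
      default: more = false; break;
      }
    }
    if (i >= enc.size())
      break; // prefix with no base type: stop instead of guessing

    // Base type letter.
    std::string base;
    switch (enc[i++]) {
    case 'v': base = "void"; break;
    case 'c': base = "char"; break;
    case 's': base = "short"; break;
    case 'i': base = "int"; break;
    case 'h': base = "half"; break;
    case 'f': base = "float"; break;
    case 'd': base = "double"; break;
    default: base = "?"; break;
    }

    std::string desc = sign + base;
    if (!lanes.empty())
      desc = "<" + lanes + " x " + desc + ">";
    // Suffix: each '*' adds one level of pointer.
    while (i < enc.size() && enc[i] == '*') {
      desc += "*";
      ++i;
    }
    if (constant)
      desc = "constant " + desc;
    out.push_back(desc);
  }
  return out;
}
```

Read this way, `"V8hf"` is a builtin returning `<8 x half>` and taking one `float`, which lines up with the `@llvm.wasm.splat.f16x8(float %a)` call checked in the test file.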
[clang] [llvm] [WebAssembly] Implement prototype f16x8.extract_lane instruction. (PR #93272)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/93272

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is
incorrect and will be changed to 0x121 soon.

>From ee046630b80786b920b5e7d0742c27443d3ea2b0 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Thu, 23 May 2024 21:04:31 +
Subject: [PATCH] [WebAssembly] Implement prototype f16x8.extract_lane instruction.

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is
incorrect and will be changed to 0x121 soon.
---
 clang/include/clang/Basic/BuiltinsWebAssembly.def | 1 +
 clang/lib/CodeGen/CGBuiltin.cpp | 6 ++
 clang/test/CodeGen/builtins-wasm.c | 6 ++
 llvm/include/llvm/IR/IntrinsicsWebAssembly.td | 4
 .../WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h | 2 ++
 llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp | 4 +++-
 llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td | 9 +
 llvm/test/CodeGen/WebAssembly/half-precision.ll | 8
 llvm/test/MC/WebAssembly/simd-encodings.s | 3 +++
 9 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index dbe79aa39190d..fd8c1b480d6da 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -194,6 +194,7 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 91083c1cfae96..0549afa12e430 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21235,6 +21235,12 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_splat_f16x8);
     return Builder.CreateCall(Callee, {Val});
   }
+  case WebAssembly::BI__builtin_wasm_extract_lane_f16x8: {
+    Value *Vector = EmitScalarExpr(E->getArg(0));
+    Value *Index = EmitScalarExpr(E->getArg(1));
+    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_extract_lane_f16x8);
+    return Builder.CreateCall(Callee, {Vector, Index});
+  }
   case WebAssembly::BI__builtin_wasm_table_get: {
     assert(E->getArg(0)->getType()->isArrayType());
     Value *Table = EmitArrayToPointerDecay(E->getArg(0)).emitRawPointer(*this);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 76c6305d422a2..93a6ab06081c9 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -819,6 +819,12 @@ f16x8 splat_f16x8(float a) {
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
   return __builtin_wasm_splat_f16x8(a);
 }
+
+float extract_lane_f16x8(f16x8 a, int i) {
+  // WEBASSEMBLY: %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 %i)
+  // WEBASSEMBLY-NEXT: ret float %0
+  return __builtin_wasm_extract_lane_f16x8(a, i);
+}
 __externref_t externref_null() {
   return __builtin_wasm_ref_null_extern();
   // WEBASSEMBLY: tail call ptr addrspace(10) @llvm.wasm.ref.null.extern()
diff --git a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
index c950b33182689..237f268784bb0 100644
--- a/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
+++ b/llvm/include/llvm/IR/IntrinsicsWebAssembly.td
@@ -341,6 +341,10 @@ def int_wasm_splat_f16x8:
   DefaultAttrsIntrinsic<[llvm_v8f16_ty],
                         [llvm_float_ty],
                         [IntrNoMem, IntrSpeculatable]>;
+def int_wasm_extract_lane_f16x8:
+  DefaultAttrsIntrinsic<[llvm_float_ty],
+                        [llvm_v8f16_ty, llvm_i32_ty],
+                        [IntrNoMem, IntrSpeculatable]>;
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
index d4e9fb057c44d..34502170a5c71 100644
--- a/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
+++ b/llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h
@@ -345,6
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/93360

This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md

>From c33801afebb6720bc4b51fb4064b59529c40d298 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Thu, 23 May 2024 23:38:51 +
Subject: [PATCH] [WebAssembly] Implement all f16x8 binary instructions.

This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:
https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
---
 .../clang/Basic/BuiltinsWebAssembly.def | 4 ++
 clang/lib/CodeGen/CGBuiltin.cpp | 4 ++
 clang/test/CodeGen/builtins-wasm.c | 24 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp | 5 ++
 .../WebAssembly/WebAssemblyInstrSIMD.td | 37 +++---
 .../CodeGen/WebAssembly/half-precision.ll | 68 +++
 llvm/test/MC/WebAssembly/simd-encodings.s | 24 +++
 7 files changed, 157 insertions(+), 9 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index fd8c1b480d6da..4e48ff48b60f5 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,6 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0549afa12e430..f8be7182b5267 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -20779,6 +20779,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_min_f32:
   case WebAssembly::BI__builtin_wasm_min_f64:
+  case WebAssembly::BI__builtin_wasm_min_f16x8:
   case WebAssembly::BI__builtin_wasm_min_f32x4:
   case WebAssembly::BI__builtin_wasm_min_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20789,6 +20790,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   }
   case WebAssembly::BI__builtin_wasm_max_f32:
   case WebAssembly::BI__builtin_wasm_max_f64:
+  case WebAssembly::BI__builtin_wasm_max_f16x8:
   case WebAssembly::BI__builtin_wasm_max_f32x4:
   case WebAssembly::BI__builtin_wasm_max_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20797,6 +20799,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::maximum, ConvertType(E->getType()));
     return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmin_f16x8:
   case WebAssembly::BI__builtin_wasm_pmin_f32x4:
   case WebAssembly::BI__builtin_wasm_pmin_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
@@ -20805,6 +20808,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::wasm_pmin, ConvertType(E->getType()));
     return Builder.CreateCall(Callee, {LHS, RHS});
   }
+  case WebAssembly::BI__builtin_wasm_pmax_f16x8:
   case WebAssembly::BI__builtin_wasm_pmax_f32x4:
   case WebAssembly::BI__builtin_wasm_pmax_f64x2: {
     Value *LHS = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 93a6ab06081c9..d6ee4f68700dc 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -825,6 +825,30 @@ float extract_lane_f16x8(f16x8 a, int i) {
   // WEBASSEMBLY-NEXT: ret float %0
   return __builtin_wasm_extract_lane_f16x8(a, i);
 }
+
+f16x8 min_f16x8(f16x8 a, f16x8 b) {
+  // WEBASSEMBLY: %
[clang] [llvm] [WebAssembly] Implement all f16x8 binary instructions. (PR #93360)
https://github.com/brendandahl edited https://github.com/llvm/llvm-project/pull/93360
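The PR description distinguishes min/max (lowered to the `minimum`/`maximum` intrinsics) from pmin/pmax (lowered to `wasm.pmin`/`wasm.pmax`). A rough per-lane model of the difference is sketched below. It is only an illustration: the helper names are hypothetical, and lanes are modelled as `float` rather than half precision so the code runs on any host; the NaN and signed-zero behaviour is what the sketch is meant to show.

```cpp
#include <cmath>
#include <limits>

// One lane of min, following `minimum` semantics: NaN if either input is
// NaN, and -0.0 treated as smaller than +0.0.
inline float laneMin(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b) // also true for +0.0 vs -0.0, so prefer the negative one
    return std::signbit(a) ? a : b;
  return a < b ? a : b;
}

// One lane of max, following `maximum` semantics.
inline float laneMax(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b) // prefer the non-negative zero
    return std::signbit(a) ? b : a;
  return a > b ? a : b;
}

// One lane of the "pseudo" variants, which are plain comparisons:
//   pmin(a, b) = b < a ? b : a
//   pmax(a, b) = a < b ? b : a
inline float lanePmin(float a, float b) { return b < a ? b : a; }
inline float lanePmax(float a, float b) { return a < b ? b : a; }
```

The practical consequence is that the pseudo variants are not symmetric: `lanePmin(1.0f, NaN)` returns `1.0f` because the comparison is false, while `lanePmin(NaN, 1.0f)` returns NaN; `laneMin` returns NaN in both cases.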
[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)
https://github.com/brendandahl created https://github.com/llvm/llvm-project/pull/105434

This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.

>From c4d120d4ec01f2af4e6ad748543ed195aa8f6721 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 20 Aug 2024 21:55:47 +
Subject: [PATCH] [WebAssembly] Change half-precision feature name to fp16.

This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.
---
 .../clang/Basic/BuiltinsWebAssembly.def | 22 +--
 clang/include/clang/Driver/Options.td | 4 ++--
 clang/lib/Basic/Targets/WebAssembly.cpp | 16 +++---
 clang/lib/Basic/Targets/WebAssembly.h | 4 ++--
 clang/test/CodeGen/builtins-wasm.c | 4 ++--
 clang/test/Driver/wasm-features.c | 8 +++
 .../test/Preprocessor/wasm-target-features.c | 16 +++---
 llvm/lib/Target/WebAssembly/WebAssembly.td | 8 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp | 4 ++--
 .../WebAssembly/WebAssemblyInstrInfo.td | 6 ++---
 .../WebAssembly/WebAssemblyInstrMemory.td | 4 ++--
 .../WebAssembly/WebAssemblyInstrSIMD.td | 12 +-
 .../Target/WebAssembly/WebAssemblySubtarget.h | 4 ++--
 .../CodeGen/WebAssembly/half-precision.ll | 4 ++--
 llvm/test/CodeGen/WebAssembly/offset.ll | 2 +-
 .../WebAssembly/target-features-cpus.ll | 6 ++---
 llvm/test/MC/WebAssembly/simd-encodings.s | 2 +-
 17 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index df304a71e475ec..034d32c6291b3d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,10 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
@@ -170,8 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, "V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", "relaxed-simd")
@@ -197,11 +197,11 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16S
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
 // Half-Precision (fp16)
-TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Drive
[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/105434

>From fe8fc8201cd3ed5c2909ef512c55e70a30e14a5e Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 20 Aug 2024 21:55:47 +
Subject: [PATCH] [WebAssembly] Change half-precision feature name to fp16.

This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.
---
 .../clang/Basic/BuiltinsWebAssembly.def | 22 +--
 clang/include/clang/Driver/Options.td | 4 ++--
 clang/lib/Basic/Targets/WebAssembly.cpp | 16 +++---
 clang/lib/Basic/Targets/WebAssembly.h | 6 ++---
 clang/test/CodeGen/builtins-wasm.c | 4 ++--
 clang/test/Driver/wasm-features.c | 8 +++
 .../test/Preprocessor/wasm-target-features.c | 16 +++---
 llvm/lib/Target/WebAssembly/WebAssembly.td | 8 +++
 .../WebAssembly/WebAssemblyISelLowering.cpp | 4 ++--
 .../WebAssembly/WebAssemblyInstrInfo.td | 6 ++---
 .../WebAssembly/WebAssemblyInstrMemory.td | 4 ++--
 .../WebAssembly/WebAssemblyInstrSIMD.td | 12 +-
 .../Target/WebAssembly/WebAssemblySubtarget.h | 4 ++--
 .../CodeGen/WebAssembly/half-precision.ll | 4 ++--
 llvm/test/CodeGen/WebAssembly/offset.ll | 2 +-
 .../WebAssembly/target-features-cpus.ll | 6 ++---
 llvm/test/MC/WebAssembly/simd-encodings.s | 2 +-
 17 files changed, 63 insertions(+), 65 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index df304a71e475ec..034d32c6291b3d 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -135,10 +135,10 @@ TARGET_BUILTIN(__builtin_wasm_min_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_max_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmin_f64x2, "V2dV2dV2d", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_pmax_f64x2, "V2dV2dV2d", "nc", "simd128")
-TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_min_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_max_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmin_f16x8, "V8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_pmax_f16x8, "V8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_ceil_f32x4, "V4fV4f", "nc", "simd128")
 TARGET_BUILTIN(__builtin_wasm_floor_f32x4, "V4fV4f", "nc", "simd128")
@@ -170,8 +170,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f32x4, "V4fV4fV4fV4f", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f64x2, "V2dV2dV2dV2d", "nc", "relaxed-simd")
-TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_relaxed_madd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_relaxed_nmadd_f16x8, "V8hV8hV8hV8h", "nc", "fp16")
 
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i8x16, "V16ScV16ScV16ScV16Sc", "nc", "relaxed-simd")
 TARGET_BUILTIN(__builtin_wasm_relaxed_laneselect_i16x8, "V8sV8sV8sV8s", "nc", "relaxed-simd")
@@ -197,11 +197,11 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_i8x16_i7x16_add_s_i32x4, "V4iV16ScV16S
 TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f", "nc", "relaxed-simd")
 
 // Half-Precision (fp16)
-TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "half-precision")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "half-precision")
+TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
+TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
+TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c204062b4f7353..89239789b3d492 100644
--- a/clang/include/clang/Driver/Options.td
[clang] [llvm] [WebAssembly] Change half-precision feature name to fp16. (PR #105434)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/105434
[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)
https://github.com/brendandahl updated https://github.com/llvm/llvm-project/pull/108116

>From 3b813cd5b0555e6b654f575140e4db9a57ed699a Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Tue, 10 Sep 2024 21:52:55 +
Subject: [PATCH 1/2] [WebAssembly] Change F16x8 extract lane to require constant integer.

Building with no optimizations resulted in failures since the lane constant
wasn't a constant in LL IR.
---
 .../clang/Basic/BuiltinsWebAssembly.def | 4 ++--
 clang/lib/Headers/wasm_simd128.h        | 19 ---
 clang/test/CodeGen/builtins-wasm.c      | 12 ++--
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/clang/include/clang/Basic/BuiltinsWebAssembly.def b/clang/include/clang/Basic/BuiltinsWebAssembly.def
index 2e80eef2c8b9bc..ad73f031922a0b 100644
--- a/clang/include/clang/Basic/BuiltinsWebAssembly.def
+++ b/clang/include/clang/Basic/BuiltinsWebAssembly.def
@@ -209,8 +209,8 @@ TARGET_BUILTIN(__builtin_wasm_relaxed_dot_bf16x8_add_f32_f32x4, "V4fV8UsV8UsV4f"
 TARGET_BUILTIN(__builtin_wasm_loadf16_f32, "fh*", "nU", "fp16")
 TARGET_BUILTIN(__builtin_wasm_storef16_f32, "vfh*", "n", "fp16")
 TARGET_BUILTIN(__builtin_wasm_splat_f16x8, "V8hf", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hi", "nc", "fp16")
-TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hif", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_extract_lane_f16x8, "fV8hIi", "nc", "fp16")
+TARGET_BUILTIN(__builtin_wasm_replace_lane_f16x8, "V8hV8hIif", "nc", "fp16")
 
 // Reference Types builtins
 // Some builtins are custom type-checked - see 't' as part of the third argument,
diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 67d12f6f2cf419..947bb9fe23029e 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1888,18 +1888,15 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
   return (v128_t)__builtin_wasm_splat_f16x8(__a);
 }
 
-static __inline__ float __FP16_FN_ATTRS wasm_f16x8_extract_lane(v128_t __a,
-                                                                int __i)
-    __REQUIRE_CONSTANT(__i) {
-  return __builtin_wasm_extract_lane_f16x8((__f16x8)__a, __i);
-}
+#ifdef __wasm_fp16__
 
-static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_replace_lane(v128_t __a,
-                                                                 int __i,
-                                                                 float __b)
-    __REQUIRE_CONSTANT(__i) {
-  return (v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)__a, __i, __b);
-}
+#define wasm_f16x8_extract_lane(__a, __i) \
+  (__builtin_wasm_extract_lane_f16x8((__f16x8)(__a), __i))
+
+#define wasm_f16x8_replace_lane(__a, __i, __b) \
+  ((v128_t)__builtin_wasm_replace_lane_f16x8((__f16x8)(__a), __i, __b))
+
+#endif
 
 static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_abs(v128_t __a) {
   return (v128_t)__builtin_wasm_abs_f16x8((__f16x8)__a);
diff --git a/clang/test/CodeGen/builtins-wasm.c b/clang/test/CodeGen/builtins-wasm.c
index 3010b8954f1c2e..8943a92faad044 100644
--- a/clang/test/CodeGen/builtins-wasm.c
+++ b/clang/test/CodeGen/builtins-wasm.c
@@ -834,16 +834,16 @@ f16x8 splat_f16x8(float a) {
   return __builtin_wasm_splat_f16x8(a);
 }
 
-float extract_lane_f16x8(f16x8 a, int i) {
-  // WEBASSEMBLY: %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 %i)
+float extract_lane_f16x8(f16x8 a) {
+  // WEBASSEMBLY: %0 = tail call float @llvm.wasm.extract.lane.f16x8(<8 x half> %a, i32 7)
   // WEBASSEMBLY-NEXT: ret float %0
-  return __builtin_wasm_extract_lane_f16x8(a, i);
+  return __builtin_wasm_extract_lane_f16x8(a, 7);
 }
 
-f16x8 replace_lane_f16x8(f16x8 a, int i, float v) {
-  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 %i, float %v)
+f16x8 replace_lane_f16x8(f16x8 a, float v) {
+  // WEBASSEMBLY: %0 = tail call <8 x half> @llvm.wasm.replace.lane.f16x8(<8 x half> %a, i32 7, float %v)
   // WEBASSEMBLY-NEXT: ret <8 x half> %0
-  return __builtin_wasm_replace_lane_f16x8(a, i, v);
+  return __builtin_wasm_replace_lane_f16x8(a, 7, v);
 }
 
 f16x8 min_f16x8(f16x8 a, f16x8 b) {

>From ab30566f242a88a238d4bfb0e5eee229ddf0eb54 Mon Sep 17 00:00:00 2001
From: Brendan Dahl
Date: Wed, 11 Sep 2024 22:32:02 +
Subject: [PATCH 2/2] add todo

---
 clang/lib/Headers/wasm_simd128.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/lib/Headers/wasm_simd128.h b/clang/lib/Headers/wasm_simd128.h
index 947bb9fe23029e..14e36e85da8efa 100644
--- a/clang/lib/Headers/wasm_simd128.h
+++ b/clang/lib/Headers/wasm_simd128.h
@@ -1889,6 +1889,8 @@ static __inline__ v128_t __FP16_FN_ATTRS wasm_f16x8_splat(float __a) {
 }
 
 #ifdef __wasm_fp16__
+// TODO Replace the following macros with regular C functions and use normal
+// target-independent vector code like the other repl
[clang] [WebAssembly] Change F16x8 extract lane to require constant integer. (PR #108116)
https://github.com/brendandahl closed https://github.com/llvm/llvm-project/pull/108116
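Note on PR #108116: the header change replaces inline functions with macros so that the lane index written by the caller reaches the builtin as a literal. The following is a minimal, portable sketch of that idea and not the code from the patch. It assumes nothing about the real builtins: `vec8`, `REQUIRE_CONSTANT_EXPR`, `vec8_extract_lane`, and `vec8_extract_lane_fn` are hypothetical names, and a C11 `_Static_assert` stands in for the builtin's requirement that the lane operand be an integer constant expression.

```c
#include <assert.h>

/* Illustrative 8-lane vector; a plain struct, not the real __f16x8 type. */
typedef struct {
  float lanes[8];
} vec8;

/* Hypothetical helper: only compiles when `cond` is a true integer
   constant expression (C11 _Static_assert inside a struct body). */
#define REQUIRE_CONSTANT_EXPR(cond, msg) \
  ((void)sizeof(struct { _Static_assert(cond, msg); char unused; }))

/* Macro form: `lane` is substituted textually, so a literal written by
   the caller is still a literal at the point where it is checked. */
#define vec8_extract_lane(v, lane)                              \
  (REQUIRE_CONSTANT_EXPR((lane) >= 0 && (lane) < 8,             \
                         "lane must be a constant in [0, 8)"),  \
   (v).lanes[(lane)])

/* Function form: inside the body `lane` is an ordinary parameter, so
   nothing here can demand a compile-time constant; the range can only
   be checked at run time. */
static inline float vec8_extract_lane_fn(vec8 v, int lane) {
  assert(lane >= 0 && lane < 8);
  return v.lanes[lane];
}
```

In this sketch `vec8_extract_lane(v, 7)` compiles, whereas passing a run-time `int` variable as the lane should be rejected by the static assertion. The function form accepts either, which mirrors why a wrapper function cannot forward its parameter to a builtin that insists on a constant operand.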