[clang] [llvm] [AMDGPU][True16][MC] true16 for v_alignbyte_b32 (PR #119750)

Ivan Kosarev via cfe-commits Tue, 07 Jan 2025 09:29:08 -0800

================
@@ -2353,8 +2353,8 @@ def int_amdgcn_writelane :
   [IntrNoMem, IntrConvergent, IntrWillReturn, IntrNoCallback, IntrNoFree]
 >;
 
-def int_amdgcn_alignbyte : ClangBuiltin<"__builtin_amdgcn_alignbyte">,
-  DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
+def int_amdgcn_alignbyte : DefaultAttrsIntrinsic<[llvm_i32_ty],
+  [llvm_i32_ty, llvm_i32_ty, llvm_anyint_ty],
----------------
kosarev wrote:


> pattern match extract of high bits. We generally shouldn't expose the direct 
> op_sel reads

That's what I meant by masking subtarget specifics. Never mind then.

(Though I still have doubts about whether pattern matching on top of a 
no-side-effects intrinsic like this one is a good idea: normally the point of 
using an intrinsic is to not rely on pattern matching and to have it 
translated, with a guarantee, into something of a very particular form. 
Otherwise, we could just pattern-match an equivalent tree that doesn't use any 
intrinsics at all.)

@broxigarchen Guo, so can we leave the intrinsic as is and on codegen just use 
the low half of src2 on GFX11+?
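
To make the "low half of src2" question concrete, here is a rough reference 
model of the operation, written as the kind of intrinsic-free tree mentioned 
above. It is a sketch only, not LLVM code: the helper names `alignbyte_ref` and 
`alignbyte_lo16` are hypothetical, and it assumes the instruction concatenates 
src0:src1 (src0 in the high bits), shifts right by 8 * (src2 & 3) bits and 
keeps the low 32 bits. The exact width of the src2 field read by the hardware 
is an assumption here and should be checked against the ISA documentation.

```cpp
#include <cstdint>

// Hypothetical reference model (assumption: shift amount is 8 * src2[1:0]).
// Concatenate src0:src1 into 64 bits, shift right by whole bytes, and keep
// the low 32 bits of the result.
static uint32_t alignbyte_ref(uint32_t src0, uint32_t src1, uint32_t src2) {
  uint64_t concat = (static_cast<uint64_t>(src0) << 32) | src1;
  unsigned shift = 8u * (src2 & 3u);
  return static_cast<uint32_t>(concat >> shift);
}

// Same model, but reading only the low 16 bits of src2, which is what
// "use the low half of src2" would amount to under the assumption above.
static uint32_t alignbyte_lo16(uint32_t src0, uint32_t src1, uint32_t src2) {
  return alignbyte_ref(src0, src1, static_cast<uint16_t>(src2));
}
```

Under that assumption the two helpers agree for every input, because the bits 
that select the byte offset all sit in the low half of src2.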

https://github.com/llvm/llvm-project/pull/119750