[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

David Green via cfe-commits Mon, 12 Aug 2024 09:44:43 -0700

davemgreen wrote:

AArch64 has udot and sdot instructions (and a usdot instruction). They perform 
only a "partial" reduction, though, producing a v4i32 from two v16i8 inputs. We 
would like to use them from the vectorizer and have recently added a 
partial-reduction intrinsic, but doing it with a higher-level intrinsic might 
be a little nicer.
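
For anyone not familiar with the instruction, here is a rough scalar model of the "partial" reduction described above. It is only a sketch: `udot_model` is a made-up name, and the accumulator operand and the 32-bit wrap-around are my reading of the instruction's behaviour rather than anything defined by this PR.

```python
def udot_model(acc, a, b):
    """Illustrative scalar model of a udot-style partial reduction.

    acc: 4 unsigned 32-bit accumulator lanes (v4i32)
    a, b: 16 unsigned 8-bit elements each (v16i8)
    Each output lane adds the dot product of one group of 4 bytes.
    """
    assert len(acc) == 4 and len(a) == 16 and len(b) == 16
    out = []
    for lane in range(4):
        total = acc[lane]
        for k in range(4):
            i = 4 * lane + k
            # Treat the inputs as unsigned 8-bit values (the zext step).
            total += (a[i] & 0xFF) * (b[i] & 0xFF)
        # Assumption: lanes wrap at 32 bits.
        out.append(total & 0xFFFFFFFF)
    return out
```

The point is that 16 products are folded into 4 lanes, not into a single scalar, so a further add across the lanes is still needed to get a full dot product.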


It seems that a "udot" can already be represented as 
`vecreduce.add(mul(zext, zext))`, and fdot is simpler still. Is there any 
particular reason to add a new intrinsic for it if it is already representable 
as a vecreduce? It would also be a shame if it couldn't be used with the 
actual AArch64 instructions.
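
To make the `vecreduce.add(mul(zext, zext))` form concrete, here is a minimal scalar sketch of it next to a 4-lane partial reduction. The helper names are hypothetical and only mirror the IR operations by name; the fixed i8 to i32 widths and the wrap-around are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF

def zext_i8_to_i32(v):
    # Zero-extend: reinterpret each element as an unsigned 8-bit value.
    return [x & 0xFF for x in v]

def mul_i32(a, b):
    # Elementwise multiply with 32-bit wrap-around.
    return [(x * y) & MASK32 for x, y in zip(a, b)]

def vecreduce_add(v):
    # Horizontal add of all lanes with 32-bit wrap-around.
    return sum(v) & MASK32

def dot_as_vecreduce(a, b):
    # The vecreduce.add(mul(zext, zext)) pattern, written out.
    return vecreduce_add(mul_i32(zext_i8_to_i32(a), zext_i8_to_i32(b)))

def partial_reduce(a, b, lanes=4):
    # Fold groups of len(a) // lanes products into `lanes` partial sums.
    prods = mul_i32(zext_i8_to_i32(a), zext_i8_to_i32(b))
    group = len(prods) // lanes
    return [vecreduce_add(prods[i * group:(i + 1) * group])
            for i in range(lanes)]

a = list(range(16))
b = [255] * 16
full = dot_as_vecreduce(a, b)
partial = partial_reduce(a, b)
```

Summing the four partial lanes gives the same value as the full reduction, which is why a full dot product could in principle be built from the partial form plus one final add across lanes.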

@SamTebbs33 @NickGuy-Arm FYI.

https://github.com/llvm/llvm-project/pull/102872
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
