SamTebbs33 wrote:

> > It would seem like a "udot" can be represented already as `vecreduce.add(mul(zext, zext))`, and fdot is simpler still. Is there any particular reason to add a new intrinsic for it if it is already representable as a vecreduce? And it would feel like a shame if it couldn't be used with the actual AArch64 instructions.
>
> There was a whole discussion on dot in https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294/13; check out `kparzysz`'s posts. Essentially, yes, we could represent dot this way, but then we would not be able to benefit from the ubiquity of the hardware-specific dot lowerings that are showing up across GPU and convolution use cases.

Why would using the partial reduction intrinsic stop you from using hardware-specific dot product lowerings for GPUs? The lowering is quite trivial; see [here](https://github.com/llvm/llvm-project/pull/101010). I think it would be best not to introduce another way of doing the same thing.

https://github.com/llvm/llvm-project/pull/102872
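
For readers following along, here is a rough scalar sketch in C++ of the two forms being compared: the full `vecreduce.add(mul(zext, zext))` pattern and a partial reduction into four accumulator lanes. The function names `udot_full` and `udot_partial`, the fixed 16 x i8 to 4 x i32 shapes, and the grouping of four adjacent products per lane (modeled on the AArch64 `udot` instruction) are assumptions for illustration only, not code from either PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed shapes for illustration: 16 x u8 inputs, 4 x u32 accumulator.
constexpr std::size_t kLanes = 16;
constexpr std::size_t kAccLanes = 4;

using Bytes = std::array<std::uint8_t, kLanes>;
using Acc = std::array<std::uint32_t, kAccLanes>;

// Scalar model of vecreduce.add(mul(zext(a), zext(b))): every product is
// widened to 32 bits and summed into a single scalar.
inline std::uint32_t udot_full(const Bytes& a, const Bytes& b) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < kLanes; ++i) {
        sum += static_cast<std::uint32_t>(a[i]) * static_cast<std::uint32_t>(b[i]);
    }
    return sum;
}

// Scalar model of a partial reduction: the 16 widened products are folded
// into 4 accumulator lanes instead of one scalar. Assumption: each lane takes
// four adjacent products, as the AArch64 udot instruction does.
inline Acc udot_partial(Acc acc, const Bytes& a, const Bytes& b) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        acc[i / (kLanes / kAccLanes)] +=
            static_cast<std::uint32_t>(a[i]) * static_cast<std::uint32_t>(b[i]);
    }
    return acc;
}
```

Starting from a zero accumulator, summing the four lanes returned by `udot_partial` gives the same value as `udot_full`, which is why a partial reduction followed by a final add can stand in for the full dot product.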