SamTebbs33 wrote:

> > It would seem like a "udot" can be represented already as 
> > `vecreduce.add(mul(zext, zext))`, and fdot is simpler still. Is there any 
> > particular reason to add a new intrinsic for it if it is already 
> > representable as a vecreduce? And it would feel like a shame if it couldn't 
> > be used with the actual AArch64 instructions.
> 
> There was a whole discussion on dot in 
> https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294/13 (check out 
> `kparzysz`'s posts). Essentially, yes, we could represent dot this way, but then 
> we would not be able to benefit from the ubiquity of the hardware-specific dot 
> lowerings that are showing up across GPU and convolution use cases.
> 

Why would using the partial reduction intrinsic stop you from using 
hardware-specific dot product lowerings for GPUs? The lowering is quite 
trivial, see [here](https://github.com/llvm/llvm-project/pull/101010). I think 
it would be best to not introduce another way of doing the same thing.
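
To make the equivalence concrete, here is a rough scalar model in Python (not LLVM IR) of the `vecreduce.add(mul(zext, zext))` pattern next to a partial reduction. The helper names are made up for illustration, and the adjacent-lane grouping in `partial_reduce_add` is an assumption chosen to mirror how AArch64 `udot` folds four adjacent bytes into each 32-bit lane. Python integers do not wrap, so overflow behaviour is not modelled.

```python
def zext_mul(a, b):
    """Element-wise product of two u8 vectors, as if each lane were widened first."""
    assert len(a) == len(b)
    assert all(0 <= x <= 0xFF for x in a + b)  # inputs model unsigned 8-bit lanes
    return [x * y for x, y in zip(a, b)]


def vecreduce_add(v):
    """Full reduction: add every lane into one scalar."""
    return sum(v)


def partial_reduce_add(acc, v):
    """Fold len(v) lanes into len(acc) accumulator lanes.

    Assumption: adjacent lanes are grouped together; this is only one
    possible grouping, picked here to resemble a 4-way dot product.
    """
    group = len(v) // len(acc)
    assert group * len(acc) == len(v)
    return [acc[i] + sum(v[i * group:(i + 1) * group]) for i in range(len(acc))]


a = list(range(16))
b = [255 - 3 * i for i in range(16)]

products = zext_mul(a, b)
full = vecreduce_add(products)                         # the "udot as vecreduce" form
partial = partial_reduce_add([0, 0, 0, 0], products)   # 16 lanes folded into 4

# Reducing the partial sums gives the same dot product as the full reduction.
assert vecreduce_add(partial) == full
```

The point of the sketch is only that both forms carry the same information, so a target that has a dot-product instruction can match either one.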



https://github.com/llvm/llvm-project/pull/102872
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
