keshavvinayak01 wrote:

> It is not necessary to add new intrinsics for these operations. You are 
> better off writing the med3 in terms of min and max and letting the backend 
> deal with it. The effort of fully supporting all analyses and optimizations 
> on a new operation is very high.

@arsenm I see that we already support lowering fmed3 all the way down to the
AMDGPU V_MED3_F32 / V_MED3_F16 instructions; see the
[CDNA3 ISA document](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf).
Why can't we add similar intrinsics for smed3 and umed3, given that the
hardware already supports those instructions (V_MED3_I32 / V_MED3_U32)?
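
For reference, the min/max formulation suggested in the quoted comment can be sketched as below. This is a minimal illustration in plain C++, not code from the PR: `med3`, `smed3`, and `umed3` are hypothetical helper names, and the sketch does not show whether a given backend folds this pattern into a single instruction.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not an LLVM or Clang API): median of three values
// written only in terms of min and max.
template <typename T>
constexpr T med3(T a, T b, T c) {
  // max(min(a, b), min(max(a, b), c)) selects the middle value.
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Signed and unsigned wrappers; the names are assumptions for illustration.
constexpr std::int32_t smed3(std::int32_t a, std::int32_t b, std::int32_t c) {
  return med3<std::int32_t>(a, b, c);
}

constexpr std::uint32_t umed3(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return med3<std::uint32_t>(a, b, c);
}
```

The same bit pattern can give different medians depending on signedness, which is why the two variants are kept separate: `smed3(-1, 0, 5)` picks `0`, while `umed3(0xFFFFFFFFu, 0u, 5u)` picks `5`.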

https://github.com/llvm/llvm-project/pull/157748
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits