================ @@ -0,0 +1,21 @@ +#ifndef __CLC_INTEGER_CLC_CLZ_H__ +#define __CLC_INTEGER_CLC_CLZ_H__ + +#if defined(CLC_CLSPV) || defined(CLC_SPIRV) +// clspv and spir-v targets provide their own OpenCL-compatible clz +#define __clc_clz clz ---------------- frasercrmck wrote:
> > The clspv/spirv targets don't provide implementations of all OpenCL builtins > > What do you mean by this exactly? I think the system today is backwards and > it results in confusing terminology. There are 2 possible interpretation of > "target provided" library functions. The backend notion and the libclc notion. > Apologies; it is confusing. What I meant was that the libclc libraries built for those targets intentionally don't supply implementation of all OpenCL functions. If you disassemble `clspv64--.bc` you'll see it make calls to `_Z3clzi` but `_Z3clzi` is left as a declaration. By "provide" I mean "expose an implementation for". And, just to be clear, it's not just `clz`. You can see all of the `declare`s if you disassemble the libclc libraries. I can't claim to fully understand what they're doing with these functions once they've linked with their libclc libraries. > > we need to replace that with a redirection to clz somehow > > By `clz` do you mean OpenCL `clz` (Itanium mangled function defined in this > bitcode lib) or system/target provided (libc) `clz`? Either way, it shouldn't > matter. `__clc_clz` should not be defined in terms of either. The goal here > is to just get a call to `llvm.ctlz` (which should be the universal > implementation) in the IR. __clc_clz should always just be a wrapper around > the intrinsic call. What you have in the impl file should already produce > this after instcombine in the scalar case. By `clz` I meant OpenCL `clz` (or `_Z3clc*`) but as I explained earlier, it's not *defined* in this bitcode lib (for those targets). I don't want to get too hung up specifically on `clz` (not all are as simple as being intrinsic wrappers), but I agree with you in principle. I'm not sure why wrapping the intrinsic _isn't_ a suitable implementation for SPIR-V targets. I keep going back and forth between whether libclc _should_ be judgemental about that, or whether it should allow targets to opt out of any definition they want. The AMD/NVIDIA (and our own downstream) targets are certainly a lot simpler in this regard, in that they desire an implementation of all OpenCL builtins. This macro business is a way of avoiding us having to customize the "implementations" of `clz` for SPIR-V targets, by having them compile an implementation of `__clc_clz` that just calls back out to OpenCL `clz`, which is left as a declaration in the final libclc library. This is all working from the assumptions that: 1. I want to make as few changes to the bitcode codegen for the SPIR-V targets as possible, in the interests of throughput of the CLC-related patches. 2. Whichever OpenCL builtins they're currently opting out of providing must be for a reason and I don't want to change that 3. The people using SPIR-V targets are harder to reach than the other users of libclc Perhaps those assumptions are invalid. > I see some of the target overrides are defined directly on the opencl > entrypoints. These should be overriding the __clc wrappers, rather than > directly overriding the entries. rocm device libs never tries to use the > opencl entrypoints internally, and (libclc is originally a fork of the same > lib from ~2014). I think the targets should only provide overrides of __clc > wrappers, and not directly override the public entrypoints. Those wrappers > may end up being implemented with the libm/libc system names (preferably by > codegen) I totally agree - this is the direction I want to take it in. OpenCL builtins call `__clc` functions, and `__clc` functions call each other internally. Targets override `__clc` functions if they wish, and the OpenCL builtins are left 100% generic. At that point I'd also like to re-organize the library so we have `libclc/clc` and `libclc/opencl` to better differentiate the code. I just haven't round gotten to any `__clc` versions of functions that are currently being overridden by any target. Once I get `sign`/`nextafter` moved over, we can move over the conversion builtins, and then the more complicated ones will follow, like maths builtins. https://github.com/llvm/llvm-project/pull/116786 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits