================
@@ -0,0 +1,21 @@
+#ifndef __CLC_INTEGER_CLC_CLZ_H__
+#define __CLC_INTEGER_CLC_CLZ_H__
+
+#if defined(CLC_CLSPV) || defined(CLC_SPIRV)
+// clspv and spir-v targets provide their own OpenCL-compatible clz
+#define __clc_clz clz
----------------
frasercrmck wrote:

> > The clspv/spirv targets don't provide implementations of all OpenCL builtins
> 
> What do you mean by this exactly? I think the system today is backwards and 
> it results in confusing terminology. There are 2 possible interpretation of 
> "target provided" library functions. The backend notion and the libclc notion.
>

Apologies; it is confusing. What I meant was that the libclc libraries built 
for those targets intentionally don't supply implementation of all OpenCL 
functions. If you disassemble `clspv64--.bc` you'll see it make calls to 
`_Z3clzi` but `_Z3clzi` is left as a declaration. By "provide" I mean "expose 
an implementation for". And, just to be clear, it's not just `clz`. You can see 
all of the `declare`s if you disassemble the libclc libraries. I can't claim to 
fully understand what they're doing with these functions once they've linked 
with their libclc libraries.

> > we need to replace that with a redirection to clz somehow
> 
> By `clz` do you mean OpenCL `clz` (Itanium mangled function defined in this 
> bitcode lib) or system/target provided (libc) `clz`? Either way, it shouldn't 
> matter. `__clc_clz` should not be defined in terms of either. The goal here 
> is to just get a call to `llvm.ctlz` (which should be the universal 
> implementation) in the IR. __clc_clz should always just be a wrapper around 
> the intrinsic call. What you have in the impl file should already produce 
> this after instcombine in the scalar case.

By `clz` I meant OpenCL `clz` (or `_Z3clc*`) but as I explained earlier, it's 
not *defined* in this bitcode lib (for those targets).

I don't want to get too hung up specifically on `clz` (not all are as simple as 
being intrinsic wrappers), but I agree with you in principle. I'm not sure why 
wrapping the intrinsic _isn't_ a suitable implementation for SPIR-V targets. I 
keep going back and forth between whether libclc _should_ be judgemental about 
that, or whether it should allow targets to opt out of any definition they 
want. The AMD/NVIDIA (and our own downstream) targets are certainly a lot 
simpler in this regard, in that they desire an implementation of all OpenCL 
builtins.

This macro business is a way of avoiding us having to customize the 
"implementations" of `clz` for SPIR-V targets, by having them compile an 
implementation of `__clc_clz` that just calls back out to OpenCL `clz`, which 
is left as a declaration in the final libclc library.

This is all working from the assumptions that:

1. I want to make as few changes to the bitcode codegen for the SPIR-V targets 
as possible, in the interests of throughput of the CLC-related patches.
2. Whichever OpenCL builtins they're currently opting out of providing must be 
for a reason and I don't want to change that
3. The people using SPIR-V targets are harder to reach than the other users of 
libclc

Perhaps those assumptions are invalid.

> I see some of the target overrides are defined directly on the opencl 
> entrypoints. These should be overriding the __clc wrappers, rather than 
> directly overriding the entries. rocm device libs never tries to use the 
> opencl entrypoints internally, and (libclc is originally a fork of the same 
> lib from ~2014). I think the targets should only provide overrides of __clc 
> wrappers, and not directly override the public entrypoints. Those wrappers 
> may end up being implemented with the libm/libc system names (preferably by 
> codegen)

I totally agree - this is the direction I want to take it in. OpenCL builtins 
call `__clc` functions, and `__clc` functions call each other internally. 
Targets override `__clc` functions if they wish, and the OpenCL builtins are 
left 100% generic. At that point I'd also like to re-organize the library so we 
have `libclc/clc` and `libclc/opencl` to better differentiate the code.

I just haven't round gotten to any `__clc` versions of functions that are 
currently being overridden by any target. Once I get `sign`/`nextafter` moved 
over, we can move over the conversion builtins, and then the more complicated 
ones will follow, like maths builtins.

https://github.com/llvm/llvm-project/pull/116786
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to