[clang] [lld] [llvm] [AMDGPU] Rename COV module flag to amdhsa_code_object_version (PR #79905)
https://github.com/epilk closed https://github.com/llvm/llvm-project/pull/79905 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [lld] [clang] [AMDGPU] Rename COV module flag to amdhsa_code_object_version (PR #79905)
@@ -25,4 +25,4 @@ entry:
 }
 !llvm.module.flags = !{!0}
-!0 = !{i32 1, !"amdgpu_code_object_version", i32 500}
+!0 = !{i32 1, !"amdhsa_code_object_version", i32 500}

epilk wrote:

Sure, I'll make a PR for that too after this lands.

https://github.com/llvm/llvm-project/pull/79905
[clang] [llvm] [lld] [AMDGPU] Rename COV module flag to amdhsa_code_object_version (PR #79905)
epilk wrote:

@tstellar there was some uncertainty about whether to leave the spelling as-is, since Mesa used to follow HSA code object versions (but doesn't anymore). Would you have any objections to this change?

https://github.com/llvm/llvm-project/pull/79905
[clang] Enable unguarded availability diagnostic on instantiated template functions (PR #91699)
@@ -177,16 +177,19 @@ void justAtAvailable(void) {
 #ifdef OBJCPP
-int f(char) AVAILABLE_10_12;
+int f(char) AVAILABLE_10_12; // #f_char_def
 int f(int);
 template int use_f() {
-  // FIXME: We should warn here!
-  return f(T());

epilk wrote:

Could you check that we don't emit a warning if there is an availability attribute on the enclosing function, or if the use is guarded by an `if (@available(...))` check? IIRC that was what I was concerned about when I wrote this.

@jansvoboda11: someone at Apple should probably review this.

https://github.com/llvm/llvm-project/pull/91699
[clang] [clang][CGRecordLayout] Remove dependency on isZeroSize (PR #96422)
epilk wrote:

Hello, apologies for jumping in here so late, but this commit changes the device function ABI on AMDGPU. For instance, for a device function returning this struct:

```
struct Empty {};
struct RetTy {
  int f1[3];
  Empty e;
  int f2;
};
__device__ RetTy deviceFn() { ... }
```

Clang used to emit the following IR:

```
%struct.Empty = type { i8 }
%struct.RetTy = type { [3 x i32], %struct.Empty, i32 }
declare %struct.RetTy @deviceFn()
;; AMDGPU backend will return RetTy in 5 vgpr32 registers
```

and now emits this:

```
%struct.RetTy = type { [3 x i32], [4 x i8], i32 }
declare %struct.RetTy @deviceFn()
;; AMDGPU backend will return RetTy in 8 vgpr32 registers
```

It seems to me that this change makes it more difficult to implement an ABI that does a decomposition like this. I can see a few potential paths forward here:

1) revert this patch to get the old behaviour,
2) rework it such that the number of padding bytes is minimized (e.g. a single i8 instead of [4 x i8]), or
3) add a custom clang CodeGen lowering for AMDGPU device functions that can strip away these extra padding bytes for return types.

Do you have any thoughts on this? Is the AMDGPU backend relying on a misguided assumption about what the IR type for a struct will look like?

https://github.com/llvm/llvm-project/pull/96422