[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Alex Voicu via cfe-commits
AlexVlx wrote: > > not everything should end up in Clang, even though it is individually > > "convenient" > > We should at least have some builtins for those SPIR-V intrinsics you listed. > I think right now we only have `reflect`. Probably, I think there's a sense that in general it's desira

[clang] clang: Do not implicitly addrspacecast in EmitAggExprToLValue (PR #129837)

2025-03-05 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx requested changes to this pull request. I'm temporarily blocking this to go through a bit more than OCL - I just want to ensure that this doesn't break some C/C++ corner case. Otherwise LGTM. https://github.com/llvm/llvm-project/pull/129837 ___

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-14 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx closed https://github.com/llvm/llvm-project/pull/114062 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/128166 >From 77db8a15d472e112be93cb7566cf56e26dc80501 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Fri, 21 Feb 2025 13:47:42 +0200 Subject: [PATCH 1/2] Handle HLL casts on sret args, remove ill advised cast strip a

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/128166 >From 77db8a15d472e112be93cb7566cf56e26dc80501 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Fri, 21 Feb 2025 13:47:42 +0200 Subject: [PATCH 1/4] Handle HLL casts on sret args, remove ill advised cast strip a

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
@@ -0,0 +1,38 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --version 5 +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa -fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-t

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
@@ -2352,6 +2353,22 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) { Value *Src = Visit(const_cast(E)); llvm::Type *SrcTy = Src->getType(); llvm::Type *DstTy = ConvertType(DestTy); + +// FIXME: this is a gross but seemingly necessary workaround for an

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
@@ -2352,6 +2353,22 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) { Value *Src = Visit(const_cast(E)); llvm::Type *SrcTy = Src->getType(); llvm::Type *DstTy = ConvertType(DestTy); + +// FIXME: this is a gross but seemingly necessary workaround for an

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
@@ -0,0 +1,32 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa-gnu -target-cpu gfx900 -emit-llvm -o - %s | FileCheck %s + +struct X { int z[17]; }; + +// CHECK-LABEL: define dso_

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx edited https://github.com/llvm/llvm-project/pull/128166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/128166 >From 77db8a15d472e112be93cb7566cf56e26dc80501 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Fri, 21 Feb 2025 13:47:42 +0200 Subject: [PATCH 1/3] Handle HLL casts on sret args, remove ill advised cast strip a

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-21 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx created https://github.com/llvm/llvm-project/pull/128166 This addresses two issues introduced by moving indirect args into an explicit AS (please see and

[clang] [clang][HIP] Make some math not not work with AMDGCN SPIR-V (PR #128360)

2025-02-22 Thread Alex Voicu via cfe-commits
@@ -14,6 +14,12 @@ #include "hip/hip_version.h" #endif // __has_include("hip/hip_version.h") +#ifdef __SPIRV__ +#define __PRIVATE_AS __attribute__((address_space(0))) AlexVlx wrote: Acknowledged, but the header was already using numbered address spaces (ROCDL

[clang] [spirv][amdgpu] Set atomic size in the clang target info (PR #128569)

2025-02-24 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx approved this pull request. LGTM, thanks. FWIW we've only done HIP bring-up, so there will probably be a few of these lingering. https://github.com/llvm/llvm-project/pull/128569 ___ cfe-commits mailing list cfe-commits@lists

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-26 Thread Alex Voicu via cfe-commits
@@ -2352,6 +2353,22 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) { Value *Src = Visit(const_cast(E)); llvm::Type *SrcTy = Src->getType(); llvm::Type *DstTy = ConvertType(DestTy); + +// FIXME: this is a gross but seemingly necessary workaround for an

[clang] [NFC] Fix Sanitizer breakage introduced in #128166 (PR #128990)

2025-02-26 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx closed https://github.com/llvm/llvm-project/pull/128990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-26 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/128166 >From 77db8a15d472e112be93cb7566cf56e26dc80501 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Fri, 21 Feb 2025 13:47:42 +0200 Subject: [PATCH 1/5] Handle HLL casts on sret args, remove ill advised cast strip a

[clang] [clang][CodeGen] Additional fixes for #114062 (PR #128166)

2025-02-26 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/128166 >From 77db8a15d472e112be93cb7566cf56e26dc80501 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Fri, 21 Feb 2025 13:47:42 +0200 Subject: [PATCH 1/5] Handle HLL casts on sret args, remove ill advised cast strip a

[clang] [llvm] [clang][OpenMP][SPIR-V] Fix addrspace of globals and global constants (PR #134399)

2025-04-04 Thread Alex Voicu via cfe-commits
@@ -5384,6 +5384,11 @@ LangAS CodeGenModule::GetGlobalVarAddressSpace(const VarDecl *D) { LangAS AS; if (OpenMPRuntime->hasAllocateAttributeForGlobalVar(D, AS)) return AS; +if (LangOpts.OpenMPIsTargetDevice && getTriple().isSPIRV()) AlexVlx w

[clang] [llvm] [clang][OpenMP][SPIR-V] Fix addrspace of globals and global constants (PR #134399)

2025-04-04 Thread Alex Voicu via cfe-commits
@@ -5402,6 +5407,10 @@ LangAS CodeGenModule::GetGlobalConstantAddressSpace() const { // UniformConstant storage class is not viable as pointers to it may not be // casted to Generic pointers which are used to model HIP's "flat" pointers. return LangAS::cuda_device

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -284,6 +284,18 @@ void CodeGenFunction::AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst, Inst->setMetadata(LLVMContext::MD_mmra, MMRAMetadata::getMD(Ctx, MMRAs)); } +static Value *GetOrInsertAMDGPUPredicate(CodeGenFunction &CGF, Twine Name) { + auto PTy = Integer

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -15576,6 +15609,38 @@ static bool isOverflowingIntegerType(ASTContext &Ctx, QualType T) { return Ctx.getIntWidth(T) >= Ctx.getIntWidth(Ctx.IntTy); } +static Expr *ExpandAMDGPUPredicateBI(ASTContext &Ctx, CallExpr *CE) { + if (!CE->getBuiltinCallee()) +return CXXBool

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 >From 91eeaf02336e539f14dcb0a79ff15dbe8befe6f1 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 2 Apr 2025 02:47:42 +0100 Subject: [PATCH 1/8] Add the functional identity and feature queries. --- clang/doc

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 >From 91eeaf02336e539f14dcb0a79ff15dbe8befe6f1 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 2 Apr 2025 02:47:42 +0100 Subject: [PATCH 1/9] Add the functional identity and feature queries. --- clang/doc

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 >From 91eeaf02336e539f14dcb0a79ff15dbe8befe6f1 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 2 Apr 2025 02:47:42 +0100 Subject: [PATCH 01/10] Add the functional identity and feature queries. --- clang/d

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-03 Thread Alex Voicu via cfe-commits
AlexVlx wrote: > So in short: what you're trying to prevent is "this was stored in a variable, > then checked later when we are no longer on the device, thus the answer is > different". Not quite, although that is definitely an interesting consideration. What I am trying to address here is t

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-03 Thread Alex Voicu via cfe-commits
AlexVlx wrote: > > as a must have, it allows featureful AMDGCN flavoured SPIR-V to be > > produced, where target specific capability is guarded and chosen or > > discarded when finalising compilation for a concrete target. > > I understand the reasoning behind providing such mechanisms to guar

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -0,0 +1,64 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals all --version 5 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx900 -emit-llvm %s -o - | FileCheck --check-prefix=AMDGCN-GFX900 %s +// RUN: %cla

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -585,6 +597,23 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, llvm::Value *Env = EmitScalarExpr(E->getArg(0)); return Builder.CreateCall(F, {Env}); } + case AMDGPU::BI__builtin_amdgcn_processor_is: { +assert(CGM.getTriple().isSPIRV() &&

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -4920,6 +4920,116 @@ If no address spaces names are provided, all address spaces are fenced. __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global") +__builtin_amdgcn_processor_is and __bui

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -585,6 +597,23 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, llvm::Value *Env = EmitScalarExpr(E->getArg(0)); return Builder.CreateCall(F, {Env}); } + case AMDGPU::BI__builtin_amdgcn_processor_is: { +assert(CGM.getTriple().isSPIRV() &&

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -4920,6 +4920,116 @@ If no address spaces names are provided, all address spaces are fenced. __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global") +__builtin_amdgcn_processor_is and __bui

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -4920,6 +4920,116 @@ If no address spaces names are provided, all address spaces are fenced. __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global") +__builtin_amdgcn_processor_is and __bui

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 >From 91eeaf02336e539f14dcb0a79ff15dbe8befe6f1 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 2 Apr 2025 02:47:42 +0100 Subject: [PATCH 01/11] Add the functional identity and feature queries. --- clang/d

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-02 Thread Alex Voicu via cfe-commits
@@ -4920,6 +4920,116 @@ If no address spaces names are provided, all address spaces are fenced. __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global") +__builtin_amdgcn_processor_is and __bui

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-04 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 >From 91eeaf02336e539f14dcb0a79ff15dbe8befe6f1 Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 2 Apr 2025 02:47:42 +0100 Subject: [PATCH 1/7] Add the functional identity and feature queries. --- clang/doc

[clang] [llvm] [HIP][HIPSTDPAR] Re-work allocation interposition for `hipstdpar` (PR #138790)

2025-05-07 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx closed https://github.com/llvm/llvm-project/pull/138790 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-12 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-s

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-07 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx commented: > A side question, is it legal to use the builtin in unstructured control flow, > like here: https://godbolt.org/z/no7Kzv19r ? Note, if the answer is "no", > then enforcing the builtin to initialize something would (probably) > automatically prevent this c

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-07 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx edited https://github.com/llvm/llvm-project/pull/134016 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-07 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/134016 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-s

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-08 Thread Alex Voicu via cfe-commits
AlexVlx wrote: @erichkeane @AaronBallman if, when you have time, you could please indicate if the new direction is at least generally aligned with what you had in mind, it'd be appreciated! https://github.com/llvm/llvm-project/pull/134016 ___ cfe-com

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Alex Voicu via cfe-commits
AlexVlx wrote: > The main obstacle of letting clang emit error when `--offload-arch` is not > specified is HIP apps using hipcc as CMAKE_CXX_COMPILER. hipcc adds -xhip by > default for .cpp programs. This is a known and long existing issue. > > Another option is to have multiple `--offload-arc

[clang] [llvm] [HIP][HIPSTDPAR] Re-work allocation interposition for `hipstdpar` (PR #138790)

2025-05-06 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx created https://github.com/llvm/llvm-project/pull/138790 The allocation interposition mode had a number of issues, which are primarily addressed in the library component via . However, it is necessary to interpose some add

[clang] [llvm] [HIP][HIPSTDPAR] Re-work allocation interposition for `hipstdpar` (PR #138790)

2025-05-06 Thread Alex Voicu via cfe-commits
https://github.com/AlexVlx updated https://github.com/llvm/llvm-project/pull/138790 >From 865ff3dff1833607f0d546ab0ebd95b98a8ed71b Mon Sep 17 00:00:00 2001 From: Alex Voicu Date: Wed, 7 May 2025 01:25:17 +0100 Subject: [PATCH 1/2] Re-work allocation interposition for `hipstdpar`. --- clang/do

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Alex Voicu via cfe-commits
AlexVlx wrote: > > > The main obstacle of letting clang emit error when `--offload-arch` is > > > not specified is HIP apps using hipcc as CMAKE_CXX_COMPILER. hipcc adds > > > -xhip by default for .cpp programs. This is a known and long existing > > > issue. > > > Another option is to have mul

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Alex Voicu via cfe-commits
AlexVlx wrote: I think that in general we also need to decide on what happens when you pick an amdgcn— triple. IMHO for that case we should probably error out if no mcpu is provided, since there’s no reasonable default, except for “all”, but that would be incredibly disruptive. https://github

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-16 Thread Alex Voicu via cfe-commits
AlexVlx wrote: Gentle ping. https://github.com/llvm/llvm-project/pull/134016 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-22 Thread Alex Voicu via cfe-commits
@@ -13338,4 +13338,23 @@ def err_acc_device_type_multiple_archs // AMDGCN builtins diagnostics def err_amdgcn_load_lds_size_invalid_value : Error<"invalid size value">; def note_amdgcn_load_lds_size_valid_value : Note<"size must be %select{1, 2, or 4|1, 2, 4, 12 or 16}0">; +de

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-22 Thread Alex Voicu via cfe-commits
@@ -29,6 +29,8 @@ MODULE_PASS("amdgpu-printf-runtime-binding", AMDGPUPrintfRuntimeBindingPass()) MODULE_PASS("amdgpu-remove-incompatible-functions", AMDGPURemoveIncompatibleFunctionsPass(*this)) MODULE_PASS("amdgpu-sw-lower-lds", AMDGPUSwLowerLDSPass(*this)) MODULE_PASS("amdg

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-05-22 Thread Alex Voicu via cfe-commits
@@ -4966,6 +4966,89 @@ If no address spaces names are provided, all address spaces are fenced. __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global") +__builtin_amdgcn_processor_is and __buil

<    2   3   4   5   6   7