[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)
lucas-rami wrote: > It doesn't add the attribute if it's 0? Indeed I missed this, sorry. > I don't see a reason to support this on the IR level. We should just make it > a verifier error to use 0. After further investigation and from what I can understand sema already rejects a minimum of 0 if the maximum is not 0 i.e., it will only accept amdgpu_waves_per_eu(0, 0) or amdgpu_waves_per_eu(N) or amdgpu_waves_per_eu(N, M) where N>0 and M>=N. > Supporting 0 is only a shortcut to the default behavior without knowing the > target limits in the source I see your point. With the current way the attribute affects the range of waves/EU returned by the subtarget method a user may set the range to [1,N] to only limit the maximum number of waves/EU, so accepting 0 is indeed useless as-is. I think it only makes sense with what I intended to be the followup change to this, which I integrated here now (PR description updated to reflect the change). Hopefully letting the minimum be 0 makes more sense now. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)
https://github.com/lucas-rami edited https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami edited https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami edited https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
@@ -279,7 +268,7 @@ define amdgpu_kernel void @kernel_3_6() #12 { ; 3,6 -> 6,9 define internal void @refine_upper_func_3_6() #13 { ; CHECK-LABEL: define internal void @refine_upper_func_3_6 -; CHECK-SAME: () #[[ATTR9]] { +; CHECK-SAME: () #[[ATTR14:[0-9]+]] { lucas-rami wrote: This is the last "real" change in the file. The waves/EU range for this function goes from [4,10] to [3,6]. All attribute changes below are just renames. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 >From 9906334cbabf160221dea11aa93e2cf7a154e6da Mon Sep 17 00:00:00 2001 From: Lucas Ramirez Date: Thu, 22 May 2025 13:26:57 + Subject: [PATCH] Rebase on main (integrate changes from 1b34722) --- clang/lib/CodeGen/Targets/AMDGPU.cpp | 27 - clang/lib/Sema/SemaAMDGPU.cpp | 5 -- clang/test/SemaOpenCL/amdgpu-attrs.cl | 1 - llvm/lib/IR/Verifier.cpp | 23 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | 4 ++ llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 --- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | 4 +- .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 ...-work-group-size-overrides-waves-per-eu.ll | 4 +- ...ine-scheduler-sink-trivial-remats-attr.mir | 12 ++-- .../CodeGen/AMDGPU/propagate-waves-per-eu.ll | 56 --- .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 + 12 files changed, 153 insertions(+), 81 deletions(-) create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index c8921c434db47..8072d4f363feb 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr( llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) { unsigned Min = Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue(); - unsigned Max = - Attr->getMax() - ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue() - : 0; - if (Min != 0) { -assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max"); - -std::string AttrVal = llvm::utostr(Min); -if (Max != 0) - AttrVal = AttrVal + "," + llvm::utostr(Max); -F->addFnAttr("amdgpu-waves-per-eu", AttrVal); - } else -assert(Max == 0 && "Max must be zero"); + if (Attr->getMax()) { +unsigned Max = +Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue(); +assert(Min == 0 || (Min != 0 && Max != 0) && + "Min must be non-zero when Max is non-zero"); +assert(Min <= Max && "Min must be less than or equal to Max"); +// Do not add the attribute if min,max=0,0. +if (Min != 0) { + std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max); + F->addFnAttr("amdgpu-waves-per-eu", AttrVal); +} + } else if (Min != 0) { +F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min)); + } } std::unique_ptr diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp index e6414a623b929..9ae3ec1289def 100644 --- a/clang/lib/Sema/SemaAMDGPU.cpp +++ b/clang/lib/Sema/SemaAMDGPU.cpp @@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr *MinExpr, if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1)) return true; - if (Min == 0 && Max != 0) { -S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) -<< &Attr << 0; -return true; - } if (Max != 0 && Min > Max) { S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) << &Attr << 1; diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl b/clang/test/SemaOpenCL/amdgpu-attrs.cl index 89ba3f86803c5..b9b44dff4d4a9 100644 --- a/clang/test/SemaOpenCL/amdgpu-attrs.cl +++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl @@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void kernel_num_sgpr_L() {} __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() {} // expected-error {{integer constant expression evaluates to value 4294967296 that cannot be represented in a 32-bit unsigned integer type}} __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void kernel_flat_work_group_size_0_64() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 since min is 0}} -__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: max must be 0 since min is 0}} __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void kernel_flat_work_group_size_64_32() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be greater than max}} __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: min must not be greater than max}} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 73b4274a41ee6..402af44e38cbf 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -2473,6 +2473,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs, CheckFailed("invalid value for 'denormal-fp-math
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 >From 9906334cbabf160221dea11aa93e2cf7a154e6da Mon Sep 17 00:00:00 2001 From: Lucas Ramirez Date: Thu, 22 May 2025 13:26:57 + Subject: [PATCH 1/2] Rebase on main (integrate changes from 1b34722) --- clang/lib/CodeGen/Targets/AMDGPU.cpp | 27 - clang/lib/Sema/SemaAMDGPU.cpp | 5 -- clang/test/SemaOpenCL/amdgpu-attrs.cl | 1 - llvm/lib/IR/Verifier.cpp | 23 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | 4 ++ llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 --- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | 4 +- .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 ...-work-group-size-overrides-waves-per-eu.ll | 4 +- ...ine-scheduler-sink-trivial-remats-attr.mir | 12 ++-- .../CodeGen/AMDGPU/propagate-waves-per-eu.ll | 56 --- .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 + 12 files changed, 153 insertions(+), 81 deletions(-) create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index c8921c434db47..8072d4f363feb 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr( llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) { unsigned Min = Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue(); - unsigned Max = - Attr->getMax() - ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue() - : 0; - if (Min != 0) { -assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max"); - -std::string AttrVal = llvm::utostr(Min); -if (Max != 0) - AttrVal = AttrVal + "," + llvm::utostr(Max); -F->addFnAttr("amdgpu-waves-per-eu", AttrVal); - } else -assert(Max == 0 && "Max must be zero"); + if (Attr->getMax()) { +unsigned Max = +Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue(); +assert(Min == 0 || (Min != 0 && Max != 0) && + "Min must be non-zero when Max is non-zero"); +assert(Min <= Max && "Min must be less than or equal to Max"); +// Do not add the attribute if min,max=0,0. +if (Min != 0) { + std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max); + F->addFnAttr("amdgpu-waves-per-eu", AttrVal); +} + } else if (Min != 0) { +F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min)); + } } std::unique_ptr diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp index e6414a623b929..9ae3ec1289def 100644 --- a/clang/lib/Sema/SemaAMDGPU.cpp +++ b/clang/lib/Sema/SemaAMDGPU.cpp @@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr *MinExpr, if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1)) return true; - if (Min == 0 && Max != 0) { -S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) -<< &Attr << 0; -return true; - } if (Max != 0 && Min > Max) { S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) << &Attr << 1; diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl b/clang/test/SemaOpenCL/amdgpu-attrs.cl index 89ba3f86803c5..b9b44dff4d4a9 100644 --- a/clang/test/SemaOpenCL/amdgpu-attrs.cl +++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl @@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void kernel_num_sgpr_L() {} __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() {} // expected-error {{integer constant expression evaluates to value 4294967296 that cannot be represented in a 32-bit unsigned integer type}} __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void kernel_flat_work_group_size_0_64() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 since min is 0}} -__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: max must be 0 since min is 0}} __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void kernel_flat_work_group_size_64_32() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be greater than max}} __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: min must not be greater than max}} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 73b4274a41ee6..402af44e38cbf 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -2473,6 +2473,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs, CheckFailed("invalid value for 'denormal-fp-
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 >From b277b2e90a3665c5e945ddff47c6c55a7fdb8d33 Mon Sep 17 00:00:00 2001 From: Lucas Ramirez Date: Thu, 22 May 2025 13:26:57 + Subject: [PATCH 1/2] Rebase on main (integrate changes from 1b34722) --- clang/lib/CodeGen/Targets/AMDGPU.cpp | 27 - clang/lib/Sema/SemaAMDGPU.cpp | 5 -- clang/test/SemaOpenCL/amdgpu-attrs.cl | 1 - llvm/lib/IR/Verifier.cpp | 23 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | 4 ++ llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 --- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | 4 +- .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 ...-work-group-size-overrides-waves-per-eu.ll | 4 +- .../CodeGen/AMDGPU/propagate-waves-per-eu.ll | 56 --- .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 + 11 files changed, 147 insertions(+), 75 deletions(-) create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index 47a552a7bf495..d2c43f86b7b13 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr( llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) { unsigned Min = Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue(); - unsigned Max = - Attr->getMax() - ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue() - : 0; - if (Min != 0) { -assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max"); - -std::string AttrVal = llvm::utostr(Min); -if (Max != 0) - AttrVal = AttrVal + "," + llvm::utostr(Max); -F->addFnAttr("amdgpu-waves-per-eu", AttrVal); - } else -assert(Max == 0 && "Max must be zero"); + if (Attr->getMax()) { +unsigned Max = +Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue(); +assert(Min == 0 || (Min != 0 && Max != 0) && + "Min must be non-zero when Max is non-zero"); +assert(Min <= Max && "Min must be less than or equal to Max"); +// Do not add the attribute if min,max=0,0. +if (Min != 0) { + std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max); + F->addFnAttr("amdgpu-waves-per-eu", AttrVal); +} + } else if (Min != 0) { +F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min)); + } } std::unique_ptr diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp index e6414a623b929..9ae3ec1289def 100644 --- a/clang/lib/Sema/SemaAMDGPU.cpp +++ b/clang/lib/Sema/SemaAMDGPU.cpp @@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr *MinExpr, if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1)) return true; - if (Min == 0 && Max != 0) { -S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) -<< &Attr << 0; -return true; - } if (Max != 0 && Min > Max) { S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) << &Attr << 1; diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl b/clang/test/SemaOpenCL/amdgpu-attrs.cl index 89ba3f86803c5..b9b44dff4d4a9 100644 --- a/clang/test/SemaOpenCL/amdgpu-attrs.cl +++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl @@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void kernel_num_sgpr_L() {} __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() {} // expected-error {{integer constant expression evaluates to value 4294967296 that cannot be represented in a 32-bit unsigned integer type}} __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void kernel_flat_work_group_size_0_64() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 since min is 0}} -__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: max must be 0 since min is 0}} __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void kernel_flat_work_group_size_64_32() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be greater than max}} __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: min must not be greater than max}} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 9cab88b09779a..13b62ad548b63 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs, CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S, V); } +
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
lucas-rami wrote: ping https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
lucas-rami wrote: > IIUC that is because the flat workgroup size. Waves per EU must yield to the > value computed from flat workgroup size, and if it is absent, we must assume > it can be 1024. AFAIU the intent of the existing implementation is that the default minimum waves/EU is set so that all the waves of a workgroup of maximum size can fit concurrently on a single CU. I am not proposing we change that, what I would like the "amdgpu-waves-per-eu" attribute to do is be able to lower than minimum at the user's request in cases where higher occupancies are not thought to be beneficial. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami updated https://github.com/llvm/llvm-project/pull/138284 >From b277b2e90a3665c5e945ddff47c6c55a7fdb8d33 Mon Sep 17 00:00:00 2001 From: Lucas Ramirez Date: Thu, 22 May 2025 13:26:57 + Subject: [PATCH 1/3] Rebase on main (integrate changes from 1b34722) --- clang/lib/CodeGen/Targets/AMDGPU.cpp | 27 - clang/lib/Sema/SemaAMDGPU.cpp | 5 -- clang/test/SemaOpenCL/amdgpu-attrs.cl | 1 - llvm/lib/IR/Verifier.cpp | 23 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | 4 ++ llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 --- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | 4 +- .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 ...-work-group-size-overrides-waves-per-eu.ll | 4 +- .../CodeGen/AMDGPU/propagate-waves-per-eu.ll | 56 --- .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 + 11 files changed, 147 insertions(+), 75 deletions(-) create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp b/clang/lib/CodeGen/Targets/AMDGPU.cpp index 47a552a7bf495..d2c43f86b7b13 100644 --- a/clang/lib/CodeGen/Targets/AMDGPU.cpp +++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp @@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr( llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) { unsigned Min = Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue(); - unsigned Max = - Attr->getMax() - ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue() - : 0; - if (Min != 0) { -assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max"); - -std::string AttrVal = llvm::utostr(Min); -if (Max != 0) - AttrVal = AttrVal + "," + llvm::utostr(Max); -F->addFnAttr("amdgpu-waves-per-eu", AttrVal); - } else -assert(Max == 0 && "Max must be zero"); + if (Attr->getMax()) { +unsigned Max = +Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue(); +assert(Min == 0 || (Min != 0 && Max != 0) && + "Min must be non-zero when Max is non-zero"); +assert(Min <= Max && "Min must be less than or equal to Max"); +// Do not add the attribute if min,max=0,0. +if (Min != 0) { + std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max); + F->addFnAttr("amdgpu-waves-per-eu", AttrVal); +} + } else if (Min != 0) { +F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min)); + } } std::unique_ptr diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp index e6414a623b929..9ae3ec1289def 100644 --- a/clang/lib/Sema/SemaAMDGPU.cpp +++ b/clang/lib/Sema/SemaAMDGPU.cpp @@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr *MinExpr, if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1)) return true; - if (Min == 0 && Max != 0) { -S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) -<< &Attr << 0; -return true; - } if (Max != 0 && Min > Max) { S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid) << &Attr << 1; diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl b/clang/test/SemaOpenCL/amdgpu-attrs.cl index 89ba3f86803c5..b9b44dff4d4a9 100644 --- a/clang/test/SemaOpenCL/amdgpu-attrs.cl +++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl @@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void kernel_num_sgpr_L() {} __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() {} // expected-error {{integer constant expression evaluates to value 4294967296 that cannot be represented in a 32-bit unsigned integer type}} __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void kernel_flat_work_group_size_0_64() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 since min is 0}} -__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: max must be 0 since min is 0}} __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void kernel_flat_work_group_size_64_32() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be greater than max}} __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: min must not be greater than max}} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 9cab88b09779a..13b62ad548b63 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs, CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S, V); } +
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
@@ -0,0 +1,40 @@ +; RUN: not llvm-as -disable-output %s 2>&1 | FileCheck %s + +target triple = "amdgcn-amd-amdhsa" + +define void @valid_amdgpu_waves_per_eu_range() "amdgpu-waves-per-eu"="2,4" { + ret void +} lucas-rami wrote: Will keep that in mind, thanks. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void kernel_num_sgpr_L() {} __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() {} // expected-error {{integer constant expression evaluates to value 4294967296 that cannot be represented in a 32-bit unsigned integer type}} __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void kernel_flat_work_group_size_0_64() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 since min is 0}} -__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: max must be 0 since min is 0}} lucas-rami wrote: [0,4] is now a valid range (i.e., no minimum requested, at most 4) so I moved it below instead of deleting it. I also added some HIP codegen tests. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
@@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, AttributeList Attrs, CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S, V); } + + if (TT.isAMDGPU()) { +if (auto A = Attrs.getFnAttr("amdgpu-waves-per-eu"); A.isValid()) { + std::pair Strs = A.getValueAsString().split(','); + unsigned Min = 0; + StringRef MinStr = Strs.first.trim(); + Check(!MinStr.getAsInteger(0, Min), +"minimum for 'amdgpu-waves-per-eu' must be integer: " + MinStr); + if (!Strs.second.empty()) { +unsigned Max = 0; +StringRef MaxStr = Strs.second.trim(); +Check(!MaxStr.getAsInteger(0, Max), + "maximum for 'amdgpu-waves-per-eu' must be integer: " + MaxStr); +Check(Max, "maximum for 'amdgpu-waves-per-eu' must be non-zero"); +Check(Min <= Max, "minimum must be less than or equal to maximum for " + "'amdgpu-waves-per-eu': " + + MinStr + " > " + MaxStr); + } else { +Check(Min, "minimum for 'amdgpu-waves-per-eu' must be non-zero when " + "maximum is not provided"); + } +} + } lucas-rami wrote: Deleted this for now and will prepare a patch to go on top of this one. https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)
https://github.com/lucas-rami edited https://github.com/llvm/llvm-project/pull/138284 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits