[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-08 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)

2025-05-07 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)

2025-05-07 Thread Lucas Ramirez via cfe-commits

lucas-rami wrote:

> It doesn't add the attribute if it's 0?

Indeed I missed this, sorry.

> I don't see a reason to support this on the IR level. We should just make it 
> a verifier error to use 0.

After further investigation and from what I can understand sema already rejects 
a minimum of 0 if the maximum is not 0 i.e., it will only accept 
amdgpu_waves_per_eu(0, 0) or amdgpu_waves_per_eu(N) or amdgpu_waves_per_eu(N, 
M) where N>0 and M>=N.

> Supporting 0 is only a shortcut to the default behavior without knowing the 
> target limits in the source

I see your point. With the current way the attribute affects the range of 
waves/EU returned by the subtarget method a user may set the range to [1,N] to 
only limit the maximum number of waves/EU, so accepting 0 is indeed useless 
as-is. I think it only makes sense with what I intended to be the followup 
change to this, which I integrated here now (PR description updated to reflect 
the change). Hopefully letting the minimum be 0 makes more sense now.

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Allow 0 as min./max. of amdgpu-waves-per-eu (PR #138284)

2025-05-07 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami edited 
https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-07 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami edited 
https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-07 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami edited 
https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-22 Thread Lucas Ramirez via cfe-commits


@@ -279,7 +268,7 @@ define amdgpu_kernel void @kernel_3_6() #12 {
 ; 3,6 -> 6,9
 define internal void @refine_upper_func_3_6() #13 {
 ; CHECK-LABEL: define internal void @refine_upper_func_3_6
-; CHECK-SAME: () #[[ATTR9]] {
+; CHECK-SAME: () #[[ATTR14:[0-9]+]] {

lucas-rami wrote:

This is the last "real" change in the file. The waves/EU range for this 
function goes from [4,10] to [3,6]. All attribute changes below are just 
renames.

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-22 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284

>From 9906334cbabf160221dea11aa93e2cf7a154e6da Mon Sep 17 00:00:00 2001
From: Lucas Ramirez 
Date: Thu, 22 May 2025 13:26:57 +
Subject: [PATCH] Rebase on main (integrate changes from 1b34722)

---
 clang/lib/CodeGen/Targets/AMDGPU.cpp  | 27 -
 clang/lib/Sema/SemaAMDGPU.cpp |  5 --
 clang/test/SemaOpenCL/amdgpu-attrs.cl |  1 -
 llvm/lib/IR/Verifier.cpp  | 23 
 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp   |  4 ++
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 ---
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h  |  4 +-
 .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 
 ...-work-group-size-overrides-waves-per-eu.ll |  4 +-
 ...ine-scheduler-sink-trivial-remats-attr.mir | 12 ++--
 .../CodeGen/AMDGPU/propagate-waves-per-eu.ll  | 56 ---
 .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 +
 12 files changed, 153 insertions(+), 81 deletions(-)
 create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll

diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp 
b/clang/lib/CodeGen/Targets/AMDGPU.cpp
index c8921c434db47..8072d4f363feb 100644
--- a/clang/lib/CodeGen/Targets/AMDGPU.cpp
+++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp
@@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr(
 llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) {
   unsigned Min =
   Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue();
-  unsigned Max =
-  Attr->getMax()
-  ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue()
-  : 0;
 
-  if (Min != 0) {
-assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max");
-
-std::string AttrVal = llvm::utostr(Min);
-if (Max != 0)
-  AttrVal = AttrVal + "," + llvm::utostr(Max);
-F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
-  } else
-assert(Max == 0 && "Max must be zero");
+  if (Attr->getMax()) {
+unsigned Max =
+Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue();
+assert(Min == 0 || (Min != 0 && Max != 0) &&
+   "Min must be non-zero when Max is non-zero");
+assert(Min <= Max && "Min must be less than or equal to Max");
+// Do not add the attribute if min,max=0,0.
+if (Min != 0) {
+  std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max);
+  F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
+}
+  } else if (Min != 0) {
+F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min));
+  }
 }
 
 std::unique_ptr
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index e6414a623b929..9ae3ec1289def 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr 
*MinExpr,
   if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1))
 return true;
 
-  if (Min == 0 && Max != 0) {
-S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
-<< &Attr << 0;
-return true;
-  }
   if (Max != 0 && Min > Max) {
 S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
 << &Attr << 1;
diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl 
b/clang/test/SemaOpenCL/amdgpu-attrs.cl
index 89ba3f86803c5..b9b44dff4d4a9 100644
--- a/clang/test/SemaOpenCL/amdgpu-attrs.cl
+++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl
@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void 
kernel_num_sgpr_L() {}
 __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() 
{} // expected-error {{integer constant expression evaluates to value 
4294967296 that cannot be represented in a 32-bit unsigned integer type}}
 
 __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void 
kernel_flat_work_group_size_0_64() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 
since min is 0}}
-__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void 
kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: max must be 0 since min is 0}}
 
 __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void 
kernel_flat_work_group_size_64_32() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be 
greater than max}}
 __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void 
kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: min must not be greater than max}}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 73b4274a41ee6..402af44e38cbf 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -2473,6 +2473,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, 
AttributeList Attrs,
   CheckFailed("invalid value for 'denormal-fp-math

[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-05-22 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284

>From 9906334cbabf160221dea11aa93e2cf7a154e6da Mon Sep 17 00:00:00 2001
From: Lucas Ramirez 
Date: Thu, 22 May 2025 13:26:57 +
Subject: [PATCH 1/2] Rebase on main (integrate changes from 1b34722)

---
 clang/lib/CodeGen/Targets/AMDGPU.cpp  | 27 -
 clang/lib/Sema/SemaAMDGPU.cpp |  5 --
 clang/test/SemaOpenCL/amdgpu-attrs.cl |  1 -
 llvm/lib/IR/Verifier.cpp  | 23 
 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp   |  4 ++
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 ---
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h  |  4 +-
 .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 
 ...-work-group-size-overrides-waves-per-eu.ll |  4 +-
 ...ine-scheduler-sink-trivial-remats-attr.mir | 12 ++--
 .../CodeGen/AMDGPU/propagate-waves-per-eu.ll  | 56 ---
 .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 +
 12 files changed, 153 insertions(+), 81 deletions(-)
 create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll

diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp 
b/clang/lib/CodeGen/Targets/AMDGPU.cpp
index c8921c434db47..8072d4f363feb 100644
--- a/clang/lib/CodeGen/Targets/AMDGPU.cpp
+++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp
@@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr(
 llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) {
   unsigned Min =
   Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue();
-  unsigned Max =
-  Attr->getMax()
-  ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue()
-  : 0;
 
-  if (Min != 0) {
-assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max");
-
-std::string AttrVal = llvm::utostr(Min);
-if (Max != 0)
-  AttrVal = AttrVal + "," + llvm::utostr(Max);
-F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
-  } else
-assert(Max == 0 && "Max must be zero");
+  if (Attr->getMax()) {
+unsigned Max =
+Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue();
+assert(Min == 0 || (Min != 0 && Max != 0) &&
+   "Min must be non-zero when Max is non-zero");
+assert(Min <= Max && "Min must be less than or equal to Max");
+// Do not add the attribute if min,max=0,0.
+if (Min != 0) {
+  std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max);
+  F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
+}
+  } else if (Min != 0) {
+F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min));
+  }
 }
 
 std::unique_ptr
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index e6414a623b929..9ae3ec1289def 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr 
*MinExpr,
   if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1))
 return true;
 
-  if (Min == 0 && Max != 0) {
-S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
-<< &Attr << 0;
-return true;
-  }
   if (Max != 0 && Min > Max) {
 S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
 << &Attr << 1;
diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl 
b/clang/test/SemaOpenCL/amdgpu-attrs.cl
index 89ba3f86803c5..b9b44dff4d4a9 100644
--- a/clang/test/SemaOpenCL/amdgpu-attrs.cl
+++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl
@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void 
kernel_num_sgpr_L() {}
 __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() 
{} // expected-error {{integer constant expression evaluates to value 
4294967296 that cannot be represented in a 32-bit unsigned integer type}}
 
 __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void 
kernel_flat_work_group_size_0_64() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 
since min is 0}}
-__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void 
kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: max must be 0 since min is 0}}
 
 __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void 
kernel_flat_work_group_size_64_32() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be 
greater than max}}
 __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void 
kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: min must not be greater than max}}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 73b4274a41ee6..402af44e38cbf 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -2473,6 +2473,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, 
AttributeList Attrs,
   CheckFailed("invalid value for 'denormal-fp-

[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-26 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284

>From b277b2e90a3665c5e945ddff47c6c55a7fdb8d33 Mon Sep 17 00:00:00 2001
From: Lucas Ramirez 
Date: Thu, 22 May 2025 13:26:57 +
Subject: [PATCH 1/2] Rebase on main (integrate changes from 1b34722)

---
 clang/lib/CodeGen/Targets/AMDGPU.cpp  | 27 -
 clang/lib/Sema/SemaAMDGPU.cpp |  5 --
 clang/test/SemaOpenCL/amdgpu-attrs.cl |  1 -
 llvm/lib/IR/Verifier.cpp  | 23 
 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp   |  4 ++
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 ---
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h  |  4 +-
 .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 
 ...-work-group-size-overrides-waves-per-eu.ll |  4 +-
 .../CodeGen/AMDGPU/propagate-waves-per-eu.ll  | 56 ---
 .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 +
 11 files changed, 147 insertions(+), 75 deletions(-)
 create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll

diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp 
b/clang/lib/CodeGen/Targets/AMDGPU.cpp
index 47a552a7bf495..d2c43f86b7b13 100644
--- a/clang/lib/CodeGen/Targets/AMDGPU.cpp
+++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp
@@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr(
 llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) {
   unsigned Min =
   Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue();
-  unsigned Max =
-  Attr->getMax()
-  ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue()
-  : 0;
 
-  if (Min != 0) {
-assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max");
-
-std::string AttrVal = llvm::utostr(Min);
-if (Max != 0)
-  AttrVal = AttrVal + "," + llvm::utostr(Max);
-F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
-  } else
-assert(Max == 0 && "Max must be zero");
+  if (Attr->getMax()) {
+unsigned Max =
+Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue();
+assert(Min == 0 || (Min != 0 && Max != 0) &&
+   "Min must be non-zero when Max is non-zero");
+assert(Min <= Max && "Min must be less than or equal to Max");
+// Do not add the attribute if min,max=0,0.
+if (Min != 0) {
+  std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max);
+  F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
+}
+  } else if (Min != 0) {
+F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min));
+  }
 }
 
 std::unique_ptr
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index e6414a623b929..9ae3ec1289def 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr 
*MinExpr,
   if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1))
 return true;
 
-  if (Min == 0 && Max != 0) {
-S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
-<< &Attr << 0;
-return true;
-  }
   if (Max != 0 && Min > Max) {
 S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
 << &Attr << 1;
diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl 
b/clang/test/SemaOpenCL/amdgpu-attrs.cl
index 89ba3f86803c5..b9b44dff4d4a9 100644
--- a/clang/test/SemaOpenCL/amdgpu-attrs.cl
+++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl
@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void 
kernel_num_sgpr_L() {}
 __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() 
{} // expected-error {{integer constant expression evaluates to value 
4294967296 that cannot be represented in a 32-bit unsigned integer type}}
 
 __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void 
kernel_flat_work_group_size_0_64() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 
since min is 0}}
-__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void 
kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: max must be 0 since min is 0}}
 
 __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void 
kernel_flat_work_group_size_64_32() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be 
greater than max}}
 __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void 
kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: min must not be greater than max}}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 9cab88b09779a..13b62ad548b63 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, 
AttributeList Attrs,
   CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S,
   V);
   }
+

[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-26 Thread Lucas Ramirez via cfe-commits

lucas-rami wrote:

ping

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-26 Thread Lucas Ramirez via cfe-commits

lucas-rami wrote:

> IIUC that is because the flat workgroup size. Waves per EU must yield to the 
> value computed from flat workgroup size, and if it is absent, we must assume 
> it can be 1024.

AFAIU the intent of the existing implementation is that the default minimum 
waves/EU is set so that all the waves of a workgroup of maximum size can fit 
concurrently on a single CU. I am not proposing we change that, what I would 
like the "amdgpu-waves-per-eu" attribute to do is be able to lower than minimum 
at the user's request in cases where higher occupancies are not thought to be 
beneficial.

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-27 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami updated 
https://github.com/llvm/llvm-project/pull/138284

>From b277b2e90a3665c5e945ddff47c6c55a7fdb8d33 Mon Sep 17 00:00:00 2001
From: Lucas Ramirez 
Date: Thu, 22 May 2025 13:26:57 +
Subject: [PATCH 1/3] Rebase on main (integrate changes from 1b34722)

---
 clang/lib/CodeGen/Targets/AMDGPU.cpp  | 27 -
 clang/lib/Sema/SemaAMDGPU.cpp |  5 --
 clang/test/SemaOpenCL/amdgpu-attrs.cl |  1 -
 llvm/lib/IR/Verifier.cpp  | 23 
 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp   |  4 ++
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp| 46 ---
 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h  |  4 +-
 .../AMDGPU/attr-amdgpu-waves-per-eu.ll| 12 
 ...-work-group-size-overrides-waves-per-eu.ll |  4 +-
 .../CodeGen/AMDGPU/propagate-waves-per-eu.ll  | 56 ---
 .../Verifier/AMDGPU/amdgpu-waves-per-eu.ll| 40 +
 11 files changed, 147 insertions(+), 75 deletions(-)
 create mode 100644 llvm/test/Verifier/AMDGPU/amdgpu-waves-per-eu.ll

diff --git a/clang/lib/CodeGen/Targets/AMDGPU.cpp 
b/clang/lib/CodeGen/Targets/AMDGPU.cpp
index 47a552a7bf495..d2c43f86b7b13 100644
--- a/clang/lib/CodeGen/Targets/AMDGPU.cpp
+++ b/clang/lib/CodeGen/Targets/AMDGPU.cpp
@@ -743,20 +743,21 @@ void CodeGenModule::handleAMDGPUWavesPerEUAttr(
 llvm::Function *F, const AMDGPUWavesPerEUAttr *Attr) {
   unsigned Min =
   Attr->getMin()->EvaluateKnownConstInt(getContext()).getExtValue();
-  unsigned Max =
-  Attr->getMax()
-  ? Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue()
-  : 0;
 
-  if (Min != 0) {
-assert((Max == 0 || Min <= Max) && "Min must be less than or equal Max");
-
-std::string AttrVal = llvm::utostr(Min);
-if (Max != 0)
-  AttrVal = AttrVal + "," + llvm::utostr(Max);
-F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
-  } else
-assert(Max == 0 && "Max must be zero");
+  if (Attr->getMax()) {
+unsigned Max =
+Attr->getMax()->EvaluateKnownConstInt(getContext()).getExtValue();
+assert(Min == 0 || (Min != 0 && Max != 0) &&
+   "Min must be non-zero when Max is non-zero");
+assert(Min <= Max && "Min must be less than or equal to Max");
+// Do not add the attribute if min,max=0,0.
+if (Min != 0) {
+  std::string AttrVal = llvm::utostr(Min) + "," + llvm::utostr(Max);
+  F->addFnAttr("amdgpu-waves-per-eu", AttrVal);
+}
+  } else if (Min != 0) {
+F->addFnAttr("amdgpu-waves-per-eu", llvm::utostr(Min));
+  }
 }
 
 std::unique_ptr
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index e6414a623b929..9ae3ec1289def 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -244,11 +244,6 @@ static bool checkAMDGPUWavesPerEUArguments(Sema &S, Expr 
*MinExpr,
   if (MaxExpr && !S.checkUInt32Argument(Attr, MaxExpr, Max, 1))
 return true;
 
-  if (Min == 0 && Max != 0) {
-S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
-<< &Attr << 0;
-return true;
-  }
   if (Max != 0 && Min > Max) {
 S.Diag(Attr.getLocation(), diag::err_attribute_argument_invalid)
 << &Attr << 1;
diff --git a/clang/test/SemaOpenCL/amdgpu-attrs.cl 
b/clang/test/SemaOpenCL/amdgpu-attrs.cl
index 89ba3f86803c5..b9b44dff4d4a9 100644
--- a/clang/test/SemaOpenCL/amdgpu-attrs.cl
+++ b/clang/test/SemaOpenCL/amdgpu-attrs.cl
@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void 
kernel_num_sgpr_L() {}
 __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() 
{} // expected-error {{integer constant expression evaluates to value 
4294967296 that cannot be represented in a 32-bit unsigned integer type}}
 
 __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void 
kernel_flat_work_group_size_0_64() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 
since min is 0}}
-__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void 
kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: max must be 0 since min is 0}}
 
 __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void 
kernel_flat_work_group_size_64_32() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be 
greater than max}}
 __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void 
kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: min must not be greater than max}}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 9cab88b09779a..13b62ad548b63 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, 
AttributeList Attrs,
   CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S,
   V);
   }
+

[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-27 Thread Lucas Ramirez via cfe-commits


@@ -0,0 +1,40 @@
+; RUN: not llvm-as -disable-output %s 2>&1 | FileCheck %s
+
+target triple = "amdgcn-amd-amdhsa"
+
+define void @valid_amdgpu_waves_per_eu_range() "amdgpu-waves-per-eu"="2,4" {
+  ret void
+}

lucas-rami wrote:

Will keep that in mind, thanks.

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-27 Thread Lucas Ramirez via cfe-commits


@@ -46,7 +46,6 @@ __attribute__((amdgpu_num_sgpr(4294967296))) kernel void 
kernel_num_sgpr_L() {}
 __attribute__((amdgpu_num_vgpr(4294967296))) kernel void kernel_num_vgpr_L() 
{} // expected-error {{integer constant expression evaluates to value 
4294967296 that cannot be represented in a 32-bit unsigned integer type}}
 
 __attribute__((amdgpu_flat_work_group_size(0, 64))) kernel void 
kernel_flat_work_group_size_0_64() {} // expected-error 
{{'amdgpu_flat_work_group_size' attribute argument is invalid: max must be 0 
since min is 0}}
-__attribute__((amdgpu_waves_per_eu(0, 4))) kernel void 
kernel_waves_per_eu_0_4() {} // expected-error {{'amdgpu_waves_per_eu' 
attribute argument is invalid: max must be 0 since min is 0}}

lucas-rami wrote:

[0,4] is now a valid range (i.e., no minimum requested, at most 4) so I moved 
it below instead of deleting it. I also added some HIP codegen tests. 

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-27 Thread Lucas Ramirez via cfe-commits


@@ -2519,6 +2519,29 @@ void Verifier::verifyFunctionAttrs(FunctionType *FT, 
AttributeList Attrs,
   CheckFailed("invalid value for 'denormal-fp-math-f32' attribute: " + S,
   V);
   }
+
+  if (TT.isAMDGPU()) {
+if (auto A = Attrs.getFnAttr("amdgpu-waves-per-eu"); A.isValid()) {
+  std::pair Strs = A.getValueAsString().split(',');
+  unsigned Min = 0;
+  StringRef MinStr = Strs.first.trim();
+  Check(!MinStr.getAsInteger(0, Min),
+"minimum for 'amdgpu-waves-per-eu' must be integer: " + MinStr);
+  if (!Strs.second.empty()) {
+unsigned Max = 0;
+StringRef MaxStr = Strs.second.trim();
+Check(!MaxStr.getAsInteger(0, Max),
+  "maximum for 'amdgpu-waves-per-eu' must be integer: " + MaxStr);
+Check(Max, "maximum for 'amdgpu-waves-per-eu' must be non-zero");
+Check(Min <= Max, "minimum must be less than or equal to maximum for "
+  "'amdgpu-waves-per-eu': " +
+  MinStr + " > " + MaxStr);
+  } else {
+Check(Min, "minimum for 'amdgpu-waves-per-eu' must be non-zero when "
+   "maximum is not provided");
+  }
+}
+  }

lucas-rami wrote:

Deleted this for now and will prepare a patch to go on top of this one. 

https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU][Clang] Allow amdgpu-waves-per-eu attribute to lower target occupancy range (PR #138284)

2025-06-27 Thread Lucas Ramirez via cfe-commits

https://github.com/lucas-rami edited 
https://github.com/llvm/llvm-project/pull/138284
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits