[llvm-branch-commits] [clang] [llvm] [mlir] [MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (PR #124746)

2025-02-11 Thread Sergio Afonso via llvm-branch-commits


@@ -9099,9 +9099,10 @@ void CGOpenMPRuntime::emitUserDefinedMapper(const OMPDeclareMapperDecl *D,
   CGM.getCXXABI().getMangleContext().mangleCanonicalTypeName(Ty, Out);
   std::string Name = getName({"omp_mapper", TyStr, D->getName()});
 
-  auto *NewFn = OMPBuilder.emitUserDefinedMapper(PrivatizeAndGenMapInfoCB,
- ElemTy, Name, CustomMapperCB);
-  UDMMap.try_emplace(D, NewFn);
+  llvm::Expected<llvm::Function *> NewFn = OMPBuilder.emitUserDefinedMapper(
+  PrivatizeAndGenMapInfoCB, ElemTy, Name, CustomMapperCB);
+  assert(NewFn && "Unexpected error in emitUserDefinedMapper");

skatrak wrote:

Wrap the call to `OMPBuilder.emitUserDefinedMapper` with `llvm::cantFail()` 
instead, which would consume and discard the error. Doing it with an assert has 
the problem that it will always crash if assertions are off, regardless of 
errors.

In clang, the error handling process involves early process exit during 
execution of the callback, which allows us to just assume there are no errors 
here. In MLIR to LLVM IR translation, however, we have to forward or handle 
errors so we can exit gracefully.
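
To make the distinction concrete, here is a minimal, self-contained sketch. It does not use the real LLVM headers: `MiniExpected`, `cantFailSketch`, `emitMapperSketch` and `translateSketch` are hypothetical stand-ins that only model the "unwrap on success, abort on error" behaviour of `llvm::cantFail` and the "forward the error" alternative. Real code would include `llvm/Support/Error.h` and use `llvm::Expected` directly.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for llvm::Expected<T>: holds a value or an error text.
template <typename T> class MiniExpected {
  std::optional<T> Value;
  std::string Error;
  MiniExpected() = default;

public:
  MiniExpected(T V) : Value(std::move(V)) {}
  static MiniExpected failure(std::string Msg) {
    MiniExpected E;
    E.Error = std::move(Msg);
    return E;
  }
  explicit operator bool() const { return Value.has_value(); }
  T &operator*() { return *Value; }
  const std::string &message() const { return Error; }
};

// Hypothetical analogue of llvm::cantFail: unwrap on success, abort on error.
// Unlike assert(), this check is not compiled out when NDEBUG is defined.
template <typename T> T cantFailSketch(MiniExpected<T> E) {
  if (!E)
    std::abort();
  return std::move(*E);
}

// Hypothetical callee that never returns an error, modelling the clang
// situation where failures end the process inside the callback.
MiniExpected<std::string> emitMapperSketch(const std::string &Name) {
  return MiniExpected<std::string>("omp_mapper." + Name);
}

// Forwarding variant, modelling a caller that has to exit gracefully:
// the error is reported to the caller instead of being assumed away.
bool translateSketch(MiniExpected<std::string> Result, std::string &Out) {
  if (!Result)
    return false;
  Out = *Result;
  return true;
}
```

In this sketch the clang-style caller would write `cantFailSketch(emitMapperSketch(Name))`, while the translation-style caller would check the boolean returned by `translateSketch` and propagate the failure.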

https://github.com/llvm/llvm-project/pull/124746
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] release/20.x: Fixes for flang/mlir dependencies (PR #125837)

2025-02-11 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic updated 
https://github.com/llvm/llvm-project/pull/125837

From 88f8956711f7c8d306d08fff8603d6b99e8302c1 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Tue, 4 Feb 2025 16:37:21 +0100
Subject: [PATCH 1/3] [mlir] Fix MLIRTestDialect dependency in MLIRTestIR

This is a test library which is not part of libMLIR, so it should
use normal LINK_LIBS instead of mlir_target_link_libraries.

This fixes an issue introduced in #123910 and follows up on the
fix in #125004, which added the library to DEPENDS, which is not
sufficient.
---
 mlir/test/lib/IR/CMakeLists.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mlir/test/lib/IR/CMakeLists.txt b/mlir/test/lib/IR/CMakeLists.txt
index e5416da70d50080..71a96c7f92c0c7d 100644
--- a/mlir/test/lib/IR/CMakeLists.txt
+++ b/mlir/test/lib/IR/CMakeLists.txt
@@ -27,13 +27,15 @@ add_mlir_library(MLIRTestIR
   TestVisitorsGeneric.cpp
 
   EXCLUDE_FROM_LIBMLIR
+
+  LINK_LIBS PUBLIC
+  MLIRTestDialect
   )
 mlir_target_link_libraries(MLIRTestIR PUBLIC
   MLIRPass
   MLIRBytecodeReader
   MLIRBytecodeWriter
   MLIRFunctionInterfaces
-  MLIRTestDialect
   )
 
 target_include_directories(MLIRTestIR

From dfa60a77e0bae875ea30340067bebea1c70b9d3d Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 5 Feb 2025 09:48:23 +0100
Subject: [PATCH 2/3] [flang] Move FIRSupport dependency to correct place
 (#125697)

This library is provided by flang, not MLIR, so it should not be part of
MLIR_LIBS.

Fixes an issue introduced in https://github.com/llvm/llvm-project/pull/120966.

(cherry picked from commit ee76bdac192ce86c5d13e4c712e0327aaefda45f)
---
 flang/lib/Optimizer/Analysis/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Optimizer/Analysis/CMakeLists.txt b/flang/lib/Optimizer/Analysis/CMakeLists.txt
index 6fe9c70f83765f1..c4dae898f8e5722 100644
--- a/flang/lib/Optimizer/Analysis/CMakeLists.txt
+++ b/flang/lib/Optimizer/Analysis/CMakeLists.txt
@@ -12,6 +12,7 @@ add_flang_library(FIRAnalysis
   LINK_LIBS
   FIRBuilder
   FIRDialect
+  FIRSupport
   HLFIRDialect
 
   MLIR_LIBS
@@ -19,5 +20,4 @@ add_flang_library(FIRAnalysis
   MLIRLLVMDialect
   MLIRMathTransforms
   MLIROpenMPDialect
-  FIRSupport
 )

From 4c4ed5e2f5357d724e4c26d21ee3e840210b917f Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 5 Feb 2025 11:58:44 +0100
Subject: [PATCH 3/3] [flang][cmake] Fix bcc dependencies (#125822)

The Fortran libraries are not part of MLIR, so they should use
target_link_libraries() rather than mlir_target_link_libraries().

This fixes an issue introduced in
https://github.com/llvm/llvm-project/pull/120966.

(cherry picked from commit f9af5c145f40480d46874b643ca2b1237e9fbb2a)
---
 flang/tools/bbc/CMakeLists.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/flang/tools/bbc/CMakeLists.txt b/flang/tools/bbc/CMakeLists.txt
index 85aeb85e0c53093..97462be83ea4389 100644
--- a/flang/tools/bbc/CMakeLists.txt
+++ b/flang/tools/bbc/CMakeLists.txt
@@ -29,6 +29,11 @@ target_link_libraries(bbc PRIVATE
   flangFrontend
   flangPasses
   FlangOpenMPTransforms
+  FortranCommon
+  FortranParser
+  FortranEvaluate
+  FortranSemantics
+  FortranLower
 )
 
 mlir_target_link_libraries(bbc PRIVATE
@@ -36,9 +41,4 @@ mlir_target_link_libraries(bbc PRIVATE
   ${extension_libs}
   MLIRAffineToStandard
   MLIRSCFToControlFlow
-  FortranCommon
-  FortranParser
-  FortranEvaluate
-  FortranSemantics
-  FortranLower
 )



[llvm-branch-commits] [mlir] 88f8956 - [mlir] Fix MLIRTestDialect dependency in MLIRTestIR

2025-02-11 Thread Nikita Popov via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:59:41+01:00
New Revision: 88f8956711f7c8d306d08fff8603d6b99e8302c1

URL: 
https://github.com/llvm/llvm-project/commit/88f8956711f7c8d306d08fff8603d6b99e8302c1
DIFF: 
https://github.com/llvm/llvm-project/commit/88f8956711f7c8d306d08fff8603d6b99e8302c1.diff

LOG: [mlir] Fix MLIRTestDialect dependency in MLIRTestIR

This is a test library which is not part of libMLIR, so it should
use normal LINK_LIBS instead of mlir_target_link_libraries.

This fixes an issue introduced in #123910 and follows up on the
fix in #125004, which added the library to DEPENDS, which is not
sufficient.

Added: 


Modified: 
mlir/test/lib/IR/CMakeLists.txt

Removed: 




diff  --git a/mlir/test/lib/IR/CMakeLists.txt b/mlir/test/lib/IR/CMakeLists.txt
index e5416da70d500..71a96c7f92c0c 100644
--- a/mlir/test/lib/IR/CMakeLists.txt
+++ b/mlir/test/lib/IR/CMakeLists.txt
@@ -27,13 +27,15 @@ add_mlir_library(MLIRTestIR
   TestVisitorsGeneric.cpp
 
   EXCLUDE_FROM_LIBMLIR
+
+  LINK_LIBS PUBLIC
+  MLIRTestDialect
   )
 mlir_target_link_libraries(MLIRTestIR PUBLIC
   MLIRPass
   MLIRBytecodeReader
   MLIRBytecodeWriter
   MLIRFunctionInterfaces
-  MLIRTestDialect
   )
 
 target_include_directories(MLIRTestIR





[llvm-branch-commits] [clang] [libc] release/20.x: [Clang] Fix test after new argument was added (PR #125912)

2025-02-11 Thread Joseph Huber via llvm-branch-commits

jhuber6 wrote:

> @jhuber6 Can you take a look at these test failures.

Looks green now.

https://github.com/llvm/llvm-project/pull/125912


[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)

2025-02-11 Thread Finn Plummer via llvm-branch-commits

inbelic wrote:

Rebasing onto the lexer PR's API changes, which use `ConsumeToken`/`PeekNextToken` 
instead of pre-allocating the tokens.

https://github.com/llvm/llvm-project/pull/122982


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a created 
https://github.com/llvm/llvm-project/pull/126762

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

From 3a165b2b1d718382d9ce2bb62679949684bc541c Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Tue, 11 Feb 2025 08:52:55 -0500
Subject: [PATCH] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in
 clang

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631
---
 clang/include/clang/Basic/Cuda.h  |   2 -
 clang/lib/Basic/Cuda.cpp  |   2 -
 clang/lib/Basic/Targets/NVPTX.cpp |   2 -
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp  |   2 -
 clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu   |   2 +-
 clang/test/CodeGenOpenCL/amdgpu-features.cl   |   4 -
 .../test/CodeGenOpenCL/builtins-amdgcn-fp8.cl |   2 +-
 ...cn-gfx940.cl => builtins-amdgcn-gfx942.cl} |   2 +-
 .../builtins-amdgcn-gfx950-err.cl |   2 +-
 .../builtins-amdgcn-gws-insts.cl  |   2 +-
 .../CodeGenOpenCL/builtins-amdgcn-mfma.cl | 110 +-
 ...fx940.cl => builtins-fp-atomics-gfx942.cl} |  34 +++---
 clang/test/Driver/amdgpu-macros.cl|   2 -
 clang/test/Driver/amdgpu-mcpu.cl  |   4 -
 clang/test/Driver/cuda-bad-arch.cu|   2 +-
 clang/test/Driver/hip-macros.hip  |  10 +-
 .../test/Misc/target-invalid-cpu-note/nvptx.c |   2 -
 ... => builtins-amdgcn-error-gfx942-param.cl} |   2 +-
 .../builtins-amdgcn-error-gfx950.cl   |   2 +-
 ...0-err.cl => builtins-amdgcn-gfx942-err.cl} |  14 +--
 20 files changed, 91 insertions(+), 113 deletions(-)
 rename clang/test/CodeGenOpenCL/{builtins-amdgcn-gfx940.cl => builtins-amdgcn-gfx942.cl} (98%)
 rename clang/test/CodeGenOpenCL/{builtins-fp-atomics-gfx940.cl => builtins-fp-atomics-gfx942.cl} (84%)
 rename clang/test/SemaOpenCL/{builtins-amdgcn-error-gfx940-param.cl => builtins-amdgcn-error-gfx942-param.cl} (99%)
 rename clang/test/SemaOpenCL/{builtins-amdgcn-gfx940-err.cl => builtins-amdgcn-gfx942-err.cl} (81%)

diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
 GFX(90a),  // gfx90a
 GFX(90c),  // gfx90c
 {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-GFX(940),  // gfx940
-GFX(941),  // gfx941
 GFX(942),  // gfx942
 GFX(950),  // gfx950
 {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.c

[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/126763?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#126763** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/126763?utm_source=stack-comment-view-in-graphite)
* **#126762** (https://app.graphite.dev/github/pr/llvm/llvm-project/126762?utm_source=stack-comment-icon)
* **#125836** (https://app.graphite.dev/github/pr/llvm/llvm-project/125836?utm_source=stack-comment-icon)
* **#125827** (https://app.graphite.dev/github/pr/llvm/llvm-project/125827?utm_source=stack-comment-icon)
* **#125826** (https://app.graphite.dev/github/pr/llvm/llvm-project/125826?utm_source=stack-comment-icon)
* **#125711** (https://app.graphite.dev/github/pr/llvm/llvm-project/125711?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking
(https://stacking.dev/?utm_source=stack-comment).


https://github.com/llvm/llvm-project/pull/126763


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/126762?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#126763** (https://app.graphite.dev/github/pr/llvm/llvm-project/126763?utm_source=stack-comment-icon)
* **#126762** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/126762?utm_source=stack-comment-view-in-graphite)
* **#125836** (https://app.graphite.dev/github/pr/llvm/llvm-project/125836?utm_source=stack-comment-icon)
* **#125827** (https://app.graphite.dev/github/pr/llvm/llvm-project/125827?utm_source=stack-comment-icon)
* **#125826** (https://app.graphite.dev/github/pr/llvm/llvm-project/125826?utm_source=stack-comment-icon)
* **#125711** (https://app.graphite.dev/github/pr/llvm/llvm-project/125711?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking
(https://stacking.dev/?utm_source=stack-comment).


https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add OMP Mapper field to MapInfoOp (PR #120994)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/120994

From 57858d2e19897a72057464bd33311d2cd4d4f156 Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Mon, 23 Dec 2024 20:53:47 +
Subject: [PATCH 1/2] Add mapper field to mapInfoOp.

---
 flang/lib/Lower/OpenMP/Utils.cpp| 3 ++-
 flang/lib/Lower/OpenMP/Utils.h  | 3 ++-
 flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp  | 5 -
 flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp | 1 +
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td   | 2 ++
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp| 2 +-
 mlir/test/Dialect/OpenMP/ops.mlir   | 4 ++--
 7 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp
index 35722fa7d1b12..fa1975dac789b 100644
--- a/flang/lib/Lower/OpenMP/Utils.cpp
+++ b/flang/lib/Lower/OpenMP/Utils.cpp
@@ -125,7 +125,7 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
 llvm::ArrayRef members,
 mlir::ArrayAttr membersIndex, uint64_t mapType,
 mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
-bool partialMap) {
+bool partialMap, mlir::FlatSymbolRefAttr mapperId) {
   if (auto boxTy = llvm::dyn_cast(baseAddr.getType())) {
 baseAddr = builder.create(loc, baseAddr);
 retTy = baseAddr.getType();
@@ -144,6 +144,7 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   mlir::omp::MapInfoOp op = builder.create(
   loc, retTy, baseAddr, varType, varPtrPtr, members, membersIndex, bounds,
   builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
+  mapperId,
   builder.getAttr(mapCaptureType),
   builder.getStringAttr(name), builder.getBoolAttr(partialMap));
   return op;
diff --git a/flang/lib/Lower/OpenMP/Utils.h b/flang/lib/Lower/OpenMP/Utils.h
index f2e378443e5f2..3943eb633b04e 100644
--- a/flang/lib/Lower/OpenMP/Utils.h
+++ b/flang/lib/Lower/OpenMP/Utils.h
@@ -116,7 +116,8 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
 llvm::ArrayRef members,
 mlir::ArrayAttr membersIndex, uint64_t mapType,
 mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
-bool partialMap = false);
+bool partialMap = false,
+mlir::FlatSymbolRefAttr mapperId = mlir::FlatSymbolRefAttr());
 
 void insertChildMapInfoIntoParent(
 Fortran::lower::AbstractConverter &converter,
diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
index e7c1d1d9d560f..beea7543e54b3 100644
--- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
@@ -184,6 +184,7 @@ class MapInfoFinalizationPass
 /*members=*/mlir::SmallVector{},
 /*membersIndex=*/mlir::ArrayAttr{}, bounds,
 builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
+/*mapperId*/ mlir::FlatSymbolRefAttr(),
 builder.getAttr(
 mlir::omp::VariableCaptureKind::ByRef),
 /*name=*/builder.getStringAttr(""),
@@ -329,7 +330,8 @@ class MapInfoFinalizationPass
 builder.getIntegerAttr(
 builder.getIntegerType(64, false),
 getDescriptorMapType(op.getMapType().value_or(0), target)),
-op.getMapCaptureTypeAttr(), op.getNameAttr(),
+/*mapperId*/ mlir::FlatSymbolRefAttr(), op.getMapCaptureTypeAttr(),
+op.getNameAttr(),
 /*partial_map=*/builder.getBoolAttr(false));
 op.replaceAllUsesWith(newDescParentMapOp.getResult());
 op->erase();
@@ -623,6 +625,7 @@ class MapInfoFinalizationPass
   /*members=*/mlir::ValueRange{},
   /*members_index=*/mlir::ArrayAttr{},
   /*bounds=*/bounds, op.getMapTypeAttr(),
+  /*mapperId*/ mlir::FlatSymbolRefAttr(),
   builder.getAttr(
   mlir::omp::VariableCaptureKind::ByRef),
   builder.getStringAttr(op.getNameAttr().strref() + "." +
diff --git a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
index 963ae863c1fc5..97ea463a3c495 100644
--- a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
@@ -91,6 +91,7 @@ class MapsForPrivatizedSymbolsPass
 /*bounds=*/ValueRange{},
 builder.getIntegerAttr(builder.getIntegerType(64, /*isSigned=*/false),
mapTypeTo),
+/*mapperId*/ mlir::FlatSymbolRefAttr(),
 builder.getAttr(
 omp::VariableCaptureKind::ByRef),
 StringAttr(), builder.getBoolAttr(false));
diff --git a/mlir/include/mlir/

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Fabian Ritter (ritter-x2a)


Changes

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

---

Patch is 41.59 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/126762.diff


20 Files Affected:

- (modified) clang/include/clang/Basic/Cuda.h (-2) 
- (modified) clang/lib/Basic/Cuda.cpp (-2) 
- (modified) clang/lib/Basic/Targets/NVPTX.cpp (-2) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (-2) 
- (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1) 
- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (-4) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+1-1) 
- (renamed) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx942.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gws-insts.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-mfma.cl (+55-55) 
- (renamed) clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx942.cl (+17-17) 
- (modified) clang/test/Driver/amdgpu-macros.cl (-2) 
- (modified) clang/test/Driver/amdgpu-mcpu.cl (-4) 
- (modified) clang/test/Driver/cuda-bad-arch.cu (+1-1) 
- (modified) clang/test/Driver/hip-macros.hip (+4-6) 
- (modified) clang/test/Misc/target-invalid-cpu-note/nvptx.c (-2) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx942-param.cl (+1-1) 
- (modified) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl (+1-1) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-gfx942-err.cl (+7-7) 


``diff
diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
 GFX(90a),  // gfx90a
 GFX(90c),  // gfx90c
 {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-GFX(940),  // gfx940
-GFX(941),  // gfx941
 GFX(942),  // gfx942
 GFX(950),  // gfx950
 {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -11,7 +11,7 @@
 // RUN:   -fnative-half-arguments-and-returns | FileCheck -check-prefix=SAFE %s
 
 // RUN: %clang_cc1 -x hip %s -O3 -S -o - -triple=amdgcn-amd-amdhsa \
-// RUN:   -fcuda-is-device -target-cpu gfx940 -fnative-half-type \
+// RUN:   -fcuda-is-device -target-cpu gfx942 -fnative-half-type \
 // RUN:   -fnative-half-arguments-and-returns -munsafe-fp-atomics \
 // RUN:   | FileCheck -check-prefix=UNSAFE %s
 
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 633f1dec5e370..d12dcead6fadf 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -29,8 +29,6 @@
 // RUN: %clang_cc1 -triple amdgcn

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)


Changes

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

---

Patch is 41.59 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/126762.diff


20 Files Affected:

- (modified) clang/include/clang/Basic/Cuda.h (-2) 
- (modified) clang/lib/Basic/Cuda.cpp (-2) 
- (modified) clang/lib/Basic/Targets/NVPTX.cpp (-2) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (-2) 
- (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1) 
- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (-4) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+1-1) 
- (renamed) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx942.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gws-insts.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-mfma.cl (+55-55) 
- (renamed) clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx942.cl (+17-17) 
- (modified) clang/test/Driver/amdgpu-macros.cl (-2) 
- (modified) clang/test/Driver/amdgpu-mcpu.cl (-4) 
- (modified) clang/test/Driver/cuda-bad-arch.cu (+1-1) 
- (modified) clang/test/Driver/hip-macros.hip (+4-6) 
- (modified) clang/test/Misc/target-invalid-cpu-note/nvptx.c (-2) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx942-param.cl (+1-1) 
- (modified) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl (+1-1) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-gfx942-err.cl (+7-7) 


``diff
diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
 GFX(90a),  // gfx90a
 GFX(90c),  // gfx90c
 {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-GFX(940),  // gfx940
-GFX(941),  // gfx941
 GFX(942),  // gfx942
 GFX(950),  // gfx950
 {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -11,7 +11,7 @@
 // RUN:   -fnative-half-arguments-and-returns | FileCheck -check-prefix=SAFE %s
 
 // RUN: %clang_cc1 -x hip %s -O3 -S -o - -triple=amdgcn-amd-amdhsa \
-// RUN:   -fcuda-is-device -target-cpu gfx940 -fnative-half-type \
+// RUN:   -fcuda-is-device -target-cpu gfx942 -fnative-half-type \
 // RUN:   -fnative-half-arguments-and-returns -munsafe-fp-atomics \
 // RUN:   | FileCheck -check-prefix=UNSAFE %s
 
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 633f1dec5e370..d12dcead6fadf 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -29,8 +29,6 @@
 // RUN: %clang_cc1 -trip

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian approved this pull request.


https://github.com/llvm/llvm-project/pull/126762
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Fabian Ritter (ritter-x2a)


Changes

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

---

Patch is 41.59 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/126762.diff


20 Files Affected:

- (modified) clang/include/clang/Basic/Cuda.h (-2) 
- (modified) clang/lib/Basic/Cuda.cpp (-2) 
- (modified) clang/lib/Basic/Targets/NVPTX.cpp (-2) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (-2) 
- (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1) 
- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (-4) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+1-1) 
- (renamed) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx942.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gws-insts.cl (+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-mfma.cl (+55-55) 
- (renamed) clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx942.cl (+17-17) 
- (modified) clang/test/Driver/amdgpu-macros.cl (-2) 
- (modified) clang/test/Driver/amdgpu-mcpu.cl (-4) 
- (modified) clang/test/Driver/cuda-bad-arch.cu (+1-1) 
- (modified) clang/test/Driver/hip-macros.hip (+4-6) 
- (modified) clang/test/Misc/target-invalid-cpu-note/nvptx.c (-2) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx942-param.cl (+1-1) 
- (modified) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl (+1-1) 
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-gfx942-err.cl (+7-7) 


``diff
diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
 GFX(90a),  // gfx90a
 GFX(90c),  // gfx90c
 {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-GFX(940),  // gfx940
-GFX(941),  // gfx941
 GFX(942),  // gfx942
 GFX(950),  // gfx950
 {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp 
b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions 
&Opts,
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const 
OMPRequiresDecl *D) {
   case OffloadArch::GFX90a:
   case OffloadArch::GFX90c:
   case OffloadArch::GFX9_4_GENERIC:
-  case OffloadArch::GFX940:
-  case OffloadArch::GFX941:
   case OffloadArch::GFX942:
   case OffloadArch::GFX950:
   case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu 
b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -11,7 +11,7 @@
 // RUN:   -fnative-half-arguments-and-returns | FileCheck -check-prefix=SAFE %s
 
 // RUN: %clang_cc1 -x hip %s -O3 -S -o - -triple=amdgcn-amd-amdhsa \
-// RUN:   -fcuda-is-device -target-cpu gfx940 -fnative-half-type \
+// RUN:   -fcuda-is-device -target-cpu gfx942 -fnative-half-type \
 // RUN:   -fnative-half-arguments-and-returns -munsafe-fp-atomics \
 // RUN:   | FileCheck -check-prefix=UNSAFE %s
 
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl 
b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 633f1dec5e370..d12dcead6fadf 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -29,8 +29,6 @@
 // RUN: %clang_cc1 -triple

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a ready_for_review 
https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Joseph Huber via llvm-branch-commits


@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,

jhuber6 wrote:

Seems bizarre to just fully remove support when we still accept things like 
`gfx600` to this day. As far as I understand, these are basically just being 
replaced by `gfx942`. Would it be at all possible to do `GFX940 = GFX942`?
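
A minimal sketch of the aliasing idea raised here, using a hypothetical trimmed-down enum and a hypothetical `parseArch` helper rather than clang's actual `OffloadArch` tables. It only illustrates that an enumerator can reuse an earlier enumerator's value, so the old names would keep resolving:

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, trimmed-down stand-in for clang's OffloadArch enum.
enum class OffloadArch {
  UNKNOWN,
  GFX90a,
  GFX90c,
  GFX9_4_GENERIC,
  GFX942,
  GFX940 = GFX942, // alias: same underlying value as GFX942
  GFX941 = GFX942, // alias: same underlying value as GFX942
  GFX950,          // continues from GFX942 + 1
};

// Aliases share one value, so a switch may list only one of them:
// `case OffloadArch::GFX940:` next to `case OffloadArch::GFX942:` would be
// rejected as a duplicate case value.
static_assert(OffloadArch::GFX940 == OffloadArch::GFX942);
static_assert(OffloadArch::GFX941 == OffloadArch::GFX942);

// Hypothetical helper: maps an --offload-arch style name to the enum.
inline OffloadArch parseArch(std::string_view Name) {
  if (Name == "gfx90a")
    return OffloadArch::GFX90a;
  if (Name == "gfx90c")
    return OffloadArch::GFX90c;
  if (Name == "gfx940" || Name == "gfx941" || Name == "gfx942")
    return OffloadArch::GFX942; // old names fold into gfx942
  if (Name == "gfx950")
    return OffloadArch::GFX950;
  return OffloadArch::UNKNOWN;
}
```

With aliases like these, the `case` labels for the old names would still have to be deleted from every `switch`, so most of the removals in this patch would be needed either way.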

https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Shilei Tian via llvm-branch-commits


@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,

shiltian wrote:

I think yesterday's discussion concluded that we should simply remove them, 
without further explanation or aliases.

https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits


@@ -1003,6 +1006,20 @@ void ClauseProcessor::processMapObjects(
   }
 }
 
+if (!mapperIdName.empty()) {

TIFitis wrote:

The `if` is inside a loop, and I want it to execute only on the first 
iteration, so this replacement won't help here.
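
A minimal sketch of that run-once guard, assuming it works like the `mapperIdName.clear()` pattern in the first patch of this series. The function `assignMapper`, its string-based "resolution", and the `ResolveCount` counter are hypothetical stand-ins, not the flang code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the loop in processMapObjects. MapperIdName is
// taken by value so that clearing it does not affect the caller.
inline std::vector<std::string>
assignMapper(const std::vector<std::string> &Objects, std::string MapperIdName,
             int &ResolveCount) {
  std::vector<std::string> Result;
  std::string MapperId; // set once, then reused for every later object

  for (const std::string &Obj : Objects) {
    if (!MapperIdName.empty()) {
      // Only reached on the first iteration with a non-empty name.
      MapperId = "@" + MapperIdName; // stands in for the symbol lookup
      ++ResolveCount;
      MapperIdName.clear(); // makes the guard false from now on
    }
    Result.push_back(Obj + ":" + MapperId);
  }
  return Result;
}
```

Because the name is cleared inside the guarded block, hoisting the check out of the loop is not a drop-in replacement when the resolution depends on the first object, as the "default" mapper case in that patch does.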

https://github.com/llvm/llvm-project/pull/121001


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits


@@ -0,0 +1,85 @@
+! This test checks lowering of OpenMP declare mapper Directive.
+
+! RUN: split-file %s %t
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 
%t/omp-declare-mapper-1.f90 -o - | FileCheck %t/omp-declare-mapper-1.f90
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 
%t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90
+
+!--- omp-declare-mapper-1.f90
+subroutine declare_mapper_1
+   integer, parameter  :: nvals = 250
+   type my_type
+  integer  :: num_vals
+  integer, allocatable :: values(:)
+   end type
+
+   type my_type2
+  type(my_type):: my_type_var
+  type(my_type):: temp
+  real, dimension(nvals) :: unmapped
+  real, dimension(nvals) :: arr
+   end type
+   type(my_type2):: t
+   real   :: x, y(nvals)
+   !CHECK:omp.declare_mapper 
@[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : 
[[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]]
 {
+   !CHECK:  ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>):
+   !CHECK:%[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = 
"_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, 
!fir.ref<[[MY_TYPE]]>)
+   !CHECK:%[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"}   
{fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> 
!fir.ref>>>
+   !CHECK:%[[VAL_3:.*]] = fir.load %[[VAL_2]] : 
!fir.ref>>>
+   !CHECK:%[[VAL_4:.*]] = fir.box_addr %[[VAL_3]] : 
(!fir.box>>) -> !fir.heap>
+   !CHECK:%[[VAL_5:.*]] = arith.constant 0 : index
+   !CHECK:%[[VAL_6:.*]]:3 = fir.box_dims %[[VAL_3]], %[[VAL_5]] : 
(!fir.box>>, index) -> (index, index, index)
+   !CHECK:%[[VAL_7:.*]] = arith.constant 0 : index
+   !CHECK:%[[VAL_8:.*]] = arith.constant 1 : index
+   !CHECK:%[[VAL_9:.*]] = arith.constant 1 : index
+   !CHECK:%[[VAL_10:.*]] = arith.subi %[[VAL_9]], %[[VAL_6]]#0 : index
+   !CHECK:%[[VAL_11:.*]] = hlfir.designate %[[VAL_1]]#0{"num_vals"}   
: (!fir.ref<[[MY_TYPE]]>) -> !fir.ref
+   !CHECK:%[[VAL_12:.*]] = fir.load %[[VAL_11]] : !fir.ref
+   !CHECK:%[[VAL_13:.*]] = fir.convert %[[VAL_12]] : (i32) -> i64
+   !CHECK:%[[VAL_14:.*]] = fir.convert %[[VAL_13]] : (i64) -> index
+   !CHECK:%[[VAL_15:.*]] = arith.subi %[[VAL_14]], %[[VAL_6]]#0 : index
+   !CHECK:%[[VAL_16:.*]] = omp.map.bounds lower_bound(%[[VAL_10]] : 
index) upper_bound(%[[VAL_15]] : index) extent(%[[VAL_6]]#1 : index) 
stride(%[[VAL_8]] : index) start_idx(%[[VAL_6]]#0 : index)
+   !CHECK:%[[VAL_17:.*]] = arith.constant 1 : index
+   !CHECK:%[[VAL_18:.*]] = fir.coordinate_of %[[VAL_1]]#0, %[[VAL_17]] 
: (!fir.ref<[[MY_TYPE]]>, index) -> 
!fir.ref>>>
+   !CHECK:%[[VAL_19:.*]] = fir.box_offset %[[VAL_18]] base_addr : 
(!fir.ref>>>) -> 
!fir.llvm_ptr>>
+   !CHECK:%[[VAL_20:.*]] = omp.map.info var_ptr(%[[VAL_18]] : 
!fir.ref>>>, i32) var_ptr_ptr(%[[VAL_19]] 
: !fir.llvm_ptr>>) map_clauses(tofrom) 
capture(ByRef) bounds(%[[VAL_16]]) -> 
!fir.llvm_ptr>> {name = ""}
+   !CHECK:%[[VAL_21:.*]] = omp.map.info var_ptr(%[[VAL_18]] : 
!fir.ref>>>, 
!fir.box>>) map_clauses(to) capture(ByRef) -> 
!fir.ref>>> {name = 
"var%[[VAL_22:.*]](1:var%[[VAL_23:.*]])"}
+   !CHECK:%[[VAL_24:.*]] = omp.map.info var_ptr(%[[VAL_1]]#1 : 
!fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) 
members(%[[VAL_21]], %[[VAL_20]] : [1], [1, 0] : 
!fir.ref>>>, 
!fir.llvm_ptr>>) -> !fir.ref<[[MY_TYPE]]> {name = 
"var"}
+   !CHECK:omp.declare_mapper.info map_entries(%[[VAL_24]], 
%[[VAL_21]], %[[VAL_20]] : !fir.ref<[[MY_TYPE]]>, 
!fir.ref>>>, 
!fir.llvm_ptr>>)
+   !CHECK:  }
+   !$omp declare mapper (my_type :: var) map (var, var%values (1:var%num_vals))
+end subroutine declare_mapper_1
+
+!--- omp-declare-mapper-2.f90
+subroutine declare_mapper_2
+   integer, parameter  :: nvals = 250
+   type my_type
+  integer  :: num_vals
+  integer, allocatable :: values(:)
+   end type
+
+   type my_type2
+  type(my_type):: my_type_var
+  type(my_type):: temp
+  real, dimension(nvals) :: unmapped
+  real, dimension(nvals) :: arr
+   end type
+   type(my_type2):: t
+   real  :: x, y(nvals)
+   !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_2my_mapper]] 
: 
[[MY_TYPE:!fir\.type<_QFdeclare_mapper_2Tmy_type2\{my_type_var:!fir\.type<_QFdeclare_mapper_2Tmy_type\{num_vals:i32,values:!fir\.box>>\}>,temp:!fir\.type<_QFdeclare_mapper_2Tmy_type\{num_vals:i32,values:!fir\.box>>\}>,unmapped:!fir\.array<250xf32>,arr:!fir\.array<250xf32>\}>]]
 {
+   !CHECK:  ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>):
+   !CHECK:%[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = 
"_QFdeclare_mapper_2Ev"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Joseph Huber via llvm-branch-commits

https://github.com/jhuber6 approved this pull request.

Okay, so I guess we can delete these because the cards that corresponded to 
these were never fully released as I understand it.

https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)

2025-02-11 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm requested changes to this pull request.

We should just leave the subtarget feature name alone. It's not worth the 
trouble, and this will now start spewing warnings on old IR (due to unnecessary 
target-features spam that clang should stop emitting). It really should have 
been named 94-insts, but I think it's best to leave it alone.

https://github.com/llvm/llvm-project/pull/126763
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)

2025-02-11 Thread Sergio Afonso via llvm-branch-commits


@@ -1003,6 +1006,20 @@ void ClauseProcessor::processMapObjects(
   }
 }
 
+if (!mapperIdName.empty()) {

skatrak wrote:

Never mind, thank you for pointing out that this is inside a loop.

https://github.com/llvm/llvm-project/pull/121001


[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a ready_for_review 
https://github.com/llvm/llvm-project/pull/126763


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Joseph Huber via llvm-branch-commits

https://github.com/jhuber6 edited 
https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/121001

>From 107f59d06dcb0523e373682b5879bb79c824bb2f Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Mon, 23 Dec 2024 21:13:42 +
Subject: [PATCH 1/5] Add flang lowering changes for mapper field in map
 clause.

---
 flang/lib/Lower/OpenMP/ClauseProcessor.cpp  | 32 +
 flang/lib/Lower/OpenMP/ClauseProcessor.h|  3 +-
 flang/test/Lower/OpenMP/Todo/map-mapper.f90 | 16 ---
 flang/test/Lower/OpenMP/map-mapper.f90  | 23 +++
 4 files changed, 52 insertions(+), 22 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/map-mapper.f90
 create mode 100644 flang/test/Lower/OpenMP/map-mapper.f90

diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index febc6adcf9d6f..467a0dcebf2b8 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -969,8 +969,10 @@ void ClauseProcessor::processMapObjects(
 llvm::omp::OpenMPOffloadMappingFlags mapTypeBits,
 std::map &parentMemberIndices,
 llvm::SmallVectorImpl &mapVars,
-llvm::SmallVectorImpl &mapSyms) const {
+llvm::SmallVectorImpl &mapSyms,
+std::string mapperIdName) const {
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  mlir::FlatSymbolRefAttr mapperId;
 
   for (const omp::Object &object : objects) {
 llvm::SmallVector bounds;
@@ -1003,6 +1005,20 @@ void ClauseProcessor::processMapObjects(
   }
 }
 
+if (!mapperIdName.empty()) {
+  if (mapperIdName == "default") {
+auto &typeSpec = object.sym()->owner().IsDerivedType()
+ ? *object.sym()->owner().derivedTypeSpec()
+ : object.sym()->GetType()->derivedTypeSpec();
+mapperIdName = typeSpec.name().ToString() + ".default";
+mapperIdName = converter.mangleName(mapperIdName, 
*typeSpec.GetScope());
+  }
+  assert(converter.getMLIRSymbolTable()->lookup(mapperIdName) &&
+ "mapper not found");
+  mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(),
+  mapperIdName);
+  mapperIdName.clear();
+}
 // Explicit map captures are captured ByRef by default,
 // optimisation passes may alter this to ByCopy or other capture
 // types to optimise
@@ -1016,7 +1032,8 @@ void ClauseProcessor::processMapObjects(
 static_cast<
 std::underlying_type_t>(
 mapTypeBits),
-mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());
+mlir::omp::VariableCaptureKind::ByRef, baseOp.getType(), false,
+mapperId);
 
 if (parentObj.has_value()) {
   parentMemberIndices[parentObj.value()].addChildIndexAndMapToParent(
@@ -1047,6 +1064,7 @@ bool ClauseProcessor::processMap(
 const auto &[mapType, typeMods, mappers, iterator, objects] = clause.t;
 llvm::omp::OpenMPOffloadMappingFlags mapTypeBits =
 llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_NONE;
+std::string mapperIdName;
 // If the map type is specified, then process it else Tofrom is the
 // default.
 Map::MapType type = mapType.value_or(Map::MapType::Tofrom);
@@ -1090,13 +1108,17 @@ bool ClauseProcessor::processMap(
"Support for iterator modifiers is not implemented yet");
 }
 if (mappers) {
-  TODO(currentLocation,
-   "Support for mapper modifiers is not implemented yet");
+  assert(mappers->size() == 1 && "more than one mapper");
+  mapperIdName = mappers->front().v.id().symbol->name().ToString();
+  if (mapperIdName != "default")
+mapperIdName = converter.mangleName(
+mapperIdName, mappers->front().v.id().symbol->owner());
 }
 
 processMapObjects(stmtCtx, clauseLocation,
   std::get(clause.t), mapTypeBits,
-  parentMemberIndices, result.mapVars, *ptrMapSyms);
+  parentMemberIndices, result.mapVars, *ptrMapSyms,
+  mapperIdName);
   };
 
   bool clauseFound = findRepeatableClause(process);
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h 
b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index e05f66c766684..2b319e890a5ad 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -175,7 +175,8 @@ class ClauseProcessor {
   llvm::omp::OpenMPOffloadMappingFlags mapTypeBits,
   std::map &parentMemberIndices,
   llvm::SmallVectorImpl &mapVars,
-  llvm::SmallVectorImpl &mapSyms) const;
+  llvm::SmallVectorImpl &mapSyms,
+  std::string mapperIdName = "") const;
 
   lower::AbstractConverter &converter;
   semantics::SemanticsContext &semaCtx;
diff --git a/flang/test/Lower/OpenMP/Todo/map-mapper.f90 
b/flang/test/Lower/OpenMP/Todo/map-mapper.f90
deleted file mode 100644
index 9554ffd5fda7b..0
--- a/flang/test/Lowe

[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)

2025-02-11 Thread Amr Hesham via llvm-branch-commits

https://github.com/AmrDeveloper updated 
https://github.com/llvm/llvm-project/pull/126607

>From 8886b33981f73da04adadb3e02a740b8e376e042 Mon Sep 17 00:00:00 2001
From: AmrDeveloper 
Date: Mon, 10 Feb 2025 23:03:15 +0100
Subject: [PATCH 1/2] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints
 wrong path when dump-section output path doesn't exist #125345

---
 llvm/docs/ReleaseNotes.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 44a0b17d6a07b..c92338345f1cb 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -460,6 +460,8 @@ Changes to the LLVM tools
   `--localize-symbol`, `--localize-symbols`,
   `--skip-symbol`, `--skip-symbols`.
 
+* llvm-objcopy now prints the correct file path in the error message when the 
output file specified by --dump-section cannot be opened.
+
 Changes to LLDB
 -
 

>From 6f5ced04f9fb5769f367a26b306a3ca961324926 Mon Sep 17 00:00:00 2001
From: AmrDeveloper 
Date: Tue, 11 Feb 2025 17:53:03 +0100
Subject: [PATCH 2/2] Add quote around option name

---
 llvm/docs/ReleaseNotes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index c92338345f1cb..28908490b8f7c 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -460,7 +460,7 @@ Changes to the LLVM tools
   `--localize-symbol`, `--localize-symbols`,
   `--skip-symbol`, `--skip-symbols`.
 
-* llvm-objcopy now prints the correct file path in the error message when the 
output file specified by --dump-section cannot be opened.
+* llvm-objcopy now prints the correct file path in the error message when the 
output file specified by `--dump-section` cannot be opened.
 
 Changes to LLDB
 -



[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)

2025-02-11 Thread Amr Hesham via llvm-branch-commits


@@ -460,6 +460,8 @@ Changes to the LLVM tools
   `--localize-symbol`, `--localize-symbols`,
   `--skip-symbol`, `--skip-symbols`.
 
+* llvm-objcopy now prints the correct file path in the error message when the 
output file specified by --dump-section cannot be opened.

AmrDeveloper wrote:

I changed it to `--dump-section` to match the other options in the same file.

https://github.com/llvm/llvm-project/pull/126607


[llvm-branch-commits] [clang] [flang] [llvm] [mlir] [MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (PR #124746)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits


@@ -3529,6 +3549,84 @@ static void genMapInfos(llvm::IRBuilderBase &builder,
   }
 }
 
+static llvm::Expected
+emitUserDefinedMapper(Operation *declMapperOp, llvm::IRBuilderBase &builder,
+  LLVM::ModuleTranslation &moduleTranslation);
+
+static llvm::Expected
+getOrCreateUserDefinedMapperFunc(Operation *declMapperOp,
+ llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation) {
+  static llvm::DenseMap userDefMapperMap;

TIFitis wrote:

Thanks for the suggestion, I've reworked this bit of code.

https://github.com/llvm/llvm-project/pull/124746


[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Joseph Huber via llvm-branch-commits


@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,

jhuber6 wrote:

So `--offload-arch=gfx940` will be a hard error after working at least since 
clang 16? That sounds very silly.

https://github.com/llvm/llvm-project/pull/126762


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add conversion support from FIR to LLVM Dialect for OMP DeclareMapper (PR #121005)

2025-02-11 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/121005

>From 0cba204faac851d186470b74aab3601a987e0f2d Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Mon, 23 Dec 2024 21:50:03 +
Subject: [PATCH 1/2] Add OpenMP to LLVM dialect conversion support for
 DeclareMapperOp.

---
 .../Fir/convert-to-llvm-openmp-and-fir.fir| 27 +--
 .../Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp  | 48 +++
 .../OpenMPToLLVM/convert-to-llvmir.mlir   | 13 +
 3 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir 
b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 8e4e1fe824d9f..82f2aea3ad983 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -936,9 +936,9 @@ func.func @omp_map_info_descriptor_type_conversion(%arg0 : 
!fir.ref>, i32) 
map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""}
   // CHECK: %[[DESC_MAP:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, 
!llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(always, delete) 
capture(ByRef) members(%[[MEMBER_MAP]] : [0] : !llvm.ptr) -> !llvm.ptr {name = 
""}
   %2 = omp.map.info var_ptr(%arg0 : !fir.ref>>, 
!fir.box>) map_clauses(always, delete) capture(ByRef) members(%1 
: [0] : !fir.llvm_ptr>) -> !fir.ref>> 
{name = ""}
-  // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr) 
+  // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr)
   omp.target_exit_data   map_entries(%2 : !fir.ref>>)
-  return 
+  return
 }
 
 // -
@@ -956,8 +956,8 @@ func.func 
@omp_map_info_derived_type_explicit_member_conversion(%arg0 : !fir.ref
   %3 = fir.field_index real, 
!fir.type<_QFderived_type{real:f32,array:!fir.array<10xi32>,int:i32}>
   %4 = fir.coordinate_of %arg0, %3 : 
(!fir.ref,int:i32}>>,
 !fir.field) -> !fir.ref
   // CHECK: %[[MAP_MEMBER_2:.*]] = omp.map.info var_ptr(%[[GEP_2]] : 
!llvm.ptr, f32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = 
"dtype%real"}
-  %5 = omp.map.info var_ptr(%4 : !fir.ref, f32) map_clauses(tofrom) 
capture(ByRef) -> !fir.ref {name = "dtype%real"}
-  // CHECK: %[[MAP_PARENT:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, 
!llvm.struct<"_QFderived_type", (f32, array<10 x i32>, i32)>) 
map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER_1]], %[[MAP_MEMBER_2]] 
: [2], [0] : !llvm.ptr, !llvm.ptr) -> !llvm.ptr {name = "dtype", partial_map = 
true} 
+  %5 = omp.map.info var_ptr(%4 : !fir.ref, f32) map_clauses(tofrom) 
capture(ByRef) -> !fir.ref {name = "dtype%real"}
+  // CHECK: %[[MAP_PARENT:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, 
!llvm.struct<"_QFderived_type", (f32, array<10 x i32>, i32)>) 
map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER_1]], %[[MAP_MEMBER_2]] 
: [2], [0] : !llvm.ptr, !llvm.ptr) -> !llvm.ptr {name = "dtype", partial_map = 
true}
   %6 = omp.map.info var_ptr(%arg0 : 
!fir.ref,int:i32}>>,
 !fir.type<_QFderived_type{real:f32,array:!fir.array<10xi32>,int:i32}>) 
map_clauses(tofrom) capture(ByRef) members(%2, %5 : [2], [0] : !fir.ref, 
!fir.ref) -> 
!fir.ref,int:i32}>> 
{name = "dtype", partial_map = true}
   // CHECK: omp.target map_entries(%[[MAP_MEMBER_1]] -> %[[ARG_1:.*]], 
%[[MAP_MEMBER_2]] -> %[[ARG_2:.*]], %[[MAP_PARENT]] -> %[[ARG_3:.*]] : 
!llvm.ptr, !llvm.ptr, !llvm.ptr) {
   omp.target map_entries(%2 -> %arg1, %5 -> %arg2, %6 -> %arg3 : 
!fir.ref, !fir.ref, 
!fir.ref,int:i32}>>)
 {
@@ -1275,3 +1275,22 @@ func.func @map_nested_dtype_alloca_mem2(%arg0 : 
!fir.ref {
+omp.declare_mapper @my_mapper : !fir.type<_QFdeclare_mapperTmy_type{data:i32}> 
{
+// CHECK: ^bb0(%[[VAL_0:.*]]: !llvm.ptr):
+^bb0(%0: !fir.ref>):
+// CHECK:   %[[VAL_1:.*]] = llvm.mlir.constant(0 : i32) : i32
+  %1 = fir.field_index data, !fir.type<_QFdeclare_mapperTmy_type{data:i32}>
+// CHECK:   %[[VAL_2:.*]] = llvm.getelementptr %[[VAL_0]][0, 0] : 
(!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFdeclare_mapperTmy_type", (i32)>
+  %2 = fir.coordinate_of %0, %1 : 
(!fir.ref>, !fir.field) -> 
!fir.ref
+// CHECK:   %[[VAL_3:.*]] = omp.map.info var_ptr(%[[VAL_2]] : 
!llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = 
"var%[[VAL_4:.*]]"}
+  %3 = omp.map.info var_ptr(%2 : !fir.ref, i32) map_clauses(tofrom) 
capture(ByRef) -> !fir.ref {name = "var%data"}
+// CHECK:   %[[VAL_5:.*]] = omp.map.info var_ptr(%[[VAL_0]] : 
!llvm.ptr, !llvm.struct<"_QFdeclare_mapperTmy_type", (i32)>) 
map_clauses(tofrom) capture(ByRef) members(%[[VAL_3]] : [0] : !llvm.ptr) -> 
!llvm.ptr {name = "var", partial_map = true}
+  %4 = omp.map.info var_ptr(%0 : 
!fir.ref>, 
!fir.type<_QFdeclare_mapperTmy_type{data:i32}>) map_clauses(tofrom) 
capture(ByRef) members(%3 : [0] : !fir.ref) -> 
!fir.ref> {name = "var", 
partial_map = true}
+// CHECK:   omp.declare_mapper_info map_entries(%[[VAL_5]], %[[VAL_3]] 
: !llvm.ptr, !llvm.ptr)
+  omp.declare_mappe

[llvm-branch-commits] [llvm] [AMDGPU] Remove dead function metadata after amdgpu-lower-kernel-arguments (PR #126147)

2025-02-11 Thread Matt Arsenault via llvm-branch-commits


@@ -1,7 +1,10 @@
-; RUN: not --crash opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 
-passes='amdgpu-attributor,function(amdgpu-lower-kernel-arguments)' 
-amdgpu-kernarg-preload-count=16 -S < %s 2>&1 | FileCheck %s
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 
-passes='amdgpu-attributor,function(amdgpu-lower-kernel-arguments)' 
-amdgpu-kernarg-preload-count=16 -S < %s 2>&1 \
+; RUN: | FileCheck -implicit-check-not='declare {{.*}} !dbg' %s

arsenm wrote:

This is still a bit aggressive for check-not, and I'm not sure it supports 
regex. Can you simplify to just check-not=declare and explicitly check the few 
declares that are expected?

https://github.com/llvm/llvm-project/pull/126147


[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)

2025-02-11 Thread via llvm-branch-commits

https://github.com/mydeveloperday approved this pull request.


https://github.com/llvm/llvm-project/pull/126479


[llvm-branch-commits] [llvm] release/20.x: [VPlan] Only skip expansion for SCEVUnknown if it isn't an instruction. (#125235) (PR #126718)

2025-02-11 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM, thanks, although I am surprised the bot requested a review from me. 
Maybe it should have asked @nikic as well, who reviewed the original PR?

https://github.com/llvm/llvm-project/pull/126718


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegAllocPriorityAdvisor analysis to NPM (PR #118462)

2025-02-11 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/118462


[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)

2025-02-11 Thread Sarah Spall via llvm-branch-commits


@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -finclude-default-header -triple 
dxil-pc-shadermodel6.3-compute \
+// RUN:   -fnative-half-type -emit-llvm -disable-llvm-passes -o - %s | 
FileCheck %s
+
+// CHECK: %"struct.__cblayout_$Globals" = type { float, float, 
%struct.__cblayout_S }
+// CHECK: %struct.__cblayout_S = type { float }
+
+// CHECK-DAG: @"$Globals.cb" = external constant target("dx.CBuffer", 
%"struct.__cblayout_$Globals")
+// CHECK-DAG: @a = external addrspace(2) global float
+// CHECK-DAG: @g = external addrspace(2) global float
+// CHECK-DAG: @h = external addrspace(2) global %struct.__cblayout_S
+
+struct EmptyStruct {
+};
+
+struct S {
+  RWBuffer buf;
+  EmptyStruct es;
+  float ea[0];
+  float b;
+};
+
+float a;
+RWBuffer b; 
+EmptyStruct c;
+float d[0];
+RWBuffer e[2];
+groupshared float f;
+float g;
+S h;
+
+RWBuffer Buf;
+
+[numthreads(4,1,1)]
+void main() {
+  Buf[0] = a;
+}
+
+// CHECK: !hlsl.cblayouts = !{![[S_LAYOUT:.*]], ![[CB_LAYOUT:.*]]}
+// CHECK: !hlsl.cbs = !{![[CB:.*]]}
+
+// CHECK: ![[S_LAYOUT]] = !{!"struct.__cblayout_S", i32 4, i32 0}
+// CHECK: ![[CB_LAYOUT]] = !{!"struct.__cblayout_$Globals", i32 20, i32 0, i32 
4, i32 16}
+// CHECK: ![[CB]] = !{ptr @"$Globals.cb", ptr addrspace(2) @a, ptr 
addrspace(2) @g, ptr addrspace(2) @h}

spall wrote:

newline

https://github.com/llvm/llvm-project/pull/125807


[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)

2025-02-11 Thread Sarah Spall via llvm-branch-commits


@@ -5753,6 +5765,30 @@ void HLSLBufferDecl::addLayoutStruct(CXXRecordDecl *LS) {
   addDecl(LS);
 }
 
+void HLSLBufferDecl::addDefaultBufferDecl(Decl *D) {
+  assert(isImplicit() &&
+ "default decls can only be added to the implicit/default constant "
+ "buffer $Globals");
+  DefaultBufferDecls.push_back(D);
+}
+
+HLSLBufferDecl::buffer_decl_iterator
+HLSLBufferDecl::buffer_decls_begin() const {
+  return buffer_decl_iterator(llvm::iterator_range(DefaultBufferDecls.begin(),
+   DefaultBufferDecls.end()),
+  decl_range(decls_begin(), decls_end()));
+}
+
+HLSLBufferDecl::buffer_decl_iterator HLSLBufferDecl::buffer_decls_end() const {
+  return buffer_decl_iterator(
+  llvm::iterator_range(DefaultBufferDecls.end(), DefaultBufferDecls.end()),

spall wrote:

this is supposed to say end, end?
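
For context, an end position built from two empty [end, end) ranges is a common shape for concatenating iterators. The following is a minimal standalone sketch, not the PR's code and not LLVM's `concat_iterator`: `ConcatIter` and `sumConcat` are hypothetical names, and the element type is simplified to `int`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical two-range concatenating iterator over std::vector<int>.
class ConcatIter {
  using It = std::vector<int>::const_iterator;
  It Cur1, End1, Cur2, End2;

public:
  ConcatIter(It B1, It E1, It B2, It E2)
      : Cur1(B1), End1(E1), Cur2(B2), End2(E2) {}

  // Read from the first range until it is exhausted, then from the second.
  int operator*() const { return Cur1 != End1 ? *Cur1 : *Cur2; }

  ConcatIter &operator++() {
    if (Cur1 != End1)
      ++Cur1;
    else
      ++Cur2;
    return *this;
  }

  bool operator!=(const ConcatIter &Other) const {
    return Cur1 != Other.Cur1 || Cur2 != Other.Cur2;
  }
};

// Sums the elements of A followed by the elements of B.
int sumConcat(const std::vector<int> &A, const std::vector<int> &B) {
  ConcatIter I(A.begin(), A.end(), B.begin(), B.end());
  // The end position is two empty ranges: [end, end) for each input.
  ConcatIter E(A.end(), A.end(), B.end(), B.end());
  int Sum = 0;
  for (; I != E; ++I)
    Sum += *I;
  return Sum;
}
```

In this sketch a fully advanced iterator has both cursors at their respective ends, so it compares equal to an iterator constructed from (end, end) for each range.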

https://github.com/llvm/llvm-project/pull/125807


[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)

2025-02-11 Thread Sarah Spall via llvm-branch-commits


@@ -5072,6 +5080,20 @@ class HLSLBufferDecl final : public NamedDecl, public 
DeclContext {
 return static_cast(const_cast(DC));
   }
 
+  // Iterator for the buffer decls. Concatenates the list of decls parented

spall wrote:

I just want to clarify what this comment says. Are the child decls of this 
HLSLBufferDecl concatenated with the list of default buffer decls? Does the 
order matter?

https://github.com/llvm/llvm-project/pull/125807


[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)

2025-02-11 Thread Sarah Spall via llvm-branch-commits


@@ -159,11 +159,16 @@ class SemaHLSL : public SemaBase {
   // List of all resource bindings
   ResourceBindings Bindings;
 
+  // default constant buffer $Globals
+  HLSLBufferDecl *DefaultCBuffer;
+
 private:
   void collectResourcesOnVarDecl(VarDecl *D);
   void collectResourcesOnUserRecordDecl(const VarDecl *VD,
 const RecordType *RT);
   void processExplicitBindingsOnDecl(VarDecl *D);
+
+  void diagnoseAvailabilityViolations(TranslationUnitDecl *TU);

spall wrote:

Why do you want this to be private now?

https://github.com/llvm/llvm-project/pull/125807


[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)

2025-02-11 Thread Sarah Spall via llvm-branch-commits


@@ -286,10 +286,7 @@ void CGHLSLRuntime::emitBufferGlobalsAndMetadata(const 
HLSLBufferDecl *BufDecl,
   .str( &&
"layout type does not match the converted element type");
 
-// there might be resources inside the used defined structs
-if (VDTy->isStructureType() && VDTy->isHLSLIntangibleType())
-  // FIXME: handle resources in cbuffer structs
-  llvm_unreachable("resources in cbuffer are not supported yet");
+// FIXME: handle resources in cbuffer user-defined structs

spall wrote:

this is future work? not a reminder for this pr?

https://github.com/llvm/llvm-project/pull/125807


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-offload

Author: None (llvmbot)


Changes

Backport baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0

Requested by: @jhuber6

---
Full diff: https://github.com/llvm/llvm-project/pull/126795.diff


2 Files Affected:

- (modified) offload/plugins-nextgen/common/include/RPC.h (+9-3) 
- (modified) offload/plugins-nextgen/common/src/RPC.cpp (+4-1) 


```diff
diff --git a/offload/plugins-nextgen/common/include/RPC.h 
b/offload/plugins-nextgen/common/include/RPC.h
index 42fca4aa4aebc..08556f15a76bf 100644
--- a/offload/plugins-nextgen/common/include/RPC.h
+++ b/offload/plugins-nextgen/common/include/RPC.h
@@ -72,6 +72,9 @@ struct RPCServerTy {
   /// Array of associated devices. These must be alive as long as the server 
is.
   std::unique_ptr Devices;
 
+  /// Mutex that guards accesses to the buffers and device array.
+  std::mutex BufferMutex{};
+
   /// A helper class for running the user thread that handles the RPC 
interface.
   /// Because we only need to check the RPC server while any kernels are
   /// working, we track submission / completion events to allow the thread to
@@ -90,6 +93,9 @@ struct RPCServerTy {
 std::condition_variable CV;
 std::mutex Mutex;
 
+/// A reference to the main server's mutex.
+std::mutex &BufferMutex;
+
 /// A reference to all the RPC interfaces that the server is handling.
 llvm::ArrayRef Buffers;
 
@@ -98,9 +104,9 @@ struct RPCServerTy {
 
 /// Initialize the worker thread to run in the background.
 ServerThread(void *Buffers[], plugin::GenericDeviceTy *Devices[],
- size_t Length)
-: Running(false), NumUsers(0), CV(), Mutex(), Buffers(Buffers, Length),
-  Devices(Devices, Length) {}
+ size_t Length, std::mutex &BufferMutex)
+: Running(false), NumUsers(0), CV(), Mutex(), BufferMutex(BufferMutex),
+  Buffers(Buffers, Length), Devices(Devices, Length) {}
 
 ~ServerThread() { assert(!Running && "Thread not shut down explicitly\n"); 
}
 
diff --git a/offload/plugins-nextgen/common/src/RPC.cpp 
b/offload/plugins-nextgen/common/src/RPC.cpp
index e6750a540b391..eb305736d6264 100644
--- a/offload/plugins-nextgen/common/src/RPC.cpp
+++ b/offload/plugins-nextgen/common/src/RPC.cpp
@@ -131,6 +131,7 @@ void RPCServerTy::ServerThread::run() {
 Lock.unlock();
 while (NumUsers.load(std::memory_order_relaxed) > 0 &&
Running.load(std::memory_order_relaxed)) {
+  std::lock_guard Lock(BufferMutex);
   for (const auto &[Buffer, Device] : llvm::zip_equal(Buffers, Devices)) {
 if (!Buffer || !Device)
   continue;
@@ -149,7 +150,7 @@ RPCServerTy::RPCServerTy(plugin::GenericPluginTy &Plugin)
   Devices(std::make_unique(
   Plugin.getNumDevices())),
   Thread(new ServerThread(Buffers.get(), Devices.get(),
-  Plugin.getNumDevices())) {}
+  Plugin.getNumDevices(), BufferMutex)) {}
 
 llvm::Error RPCServerTy::startThread() {
   Thread->startThread();
@@ -190,6 +191,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy 
&Device,
   if (auto Err = Device.dataSubmit(ClientGlobal.getPtr(), &client,
sizeof(rpc::Client), nullptr))
 return Err;
+  std::lock_guard Lock(BufferMutex);
   Buffers[Device.getDeviceId()] = RPCBuffer;
   Devices[Device.getDeviceId()] = &Device;
 
@@ -197,6 +199,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy 
&Device,
 }
 
 Error RPCServerTy::deinitDevice(plugin::GenericDeviceTy &Device) {
+  std::lock_guard Lock(BufferMutex);
   Device.free(Buffers[Device.getDeviceId()], TARGET_ALLOC_HOST);
   Buffers[Device.getDeviceId()] = nullptr;
   Devices[Device.getDeviceId()] = nullptr;

```




https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread Jan Patrick Lehr via llvm-branch-commits

https://github.com/jplehr approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/126795

Backport baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0

Requested by: @jhuber6

>From c03f46f2f0eac60ee407a6c645cfdb62e97fa77b Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Tue, 11 Feb 2025 14:57:31 -0600
Subject: [PATCH] [Offload] Properly guard modifications to the RPC device
 array (#126790)

Summary:
If the user deallocates an RPC device this can sometimes fail if the RPC
server is still running. This will happen if the modification happens
while the server is still checking it. This patch adds a mutex to guard
modifications to it.

(cherry picked from commit baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0)
---
 offload/plugins-nextgen/common/include/RPC.h | 12 +---
 offload/plugins-nextgen/common/src/RPC.cpp   |  5 -
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/offload/plugins-nextgen/common/include/RPC.h 
b/offload/plugins-nextgen/common/include/RPC.h
index 42fca4aa4aebc..08556f15a76bf 100644
--- a/offload/plugins-nextgen/common/include/RPC.h
+++ b/offload/plugins-nextgen/common/include/RPC.h
@@ -72,6 +72,9 @@ struct RPCServerTy {
   /// Array of associated devices. These must be alive as long as the server 
is.
   std::unique_ptr Devices;
 
+  /// Mutex that guards accesses to the buffers and device array.
+  std::mutex BufferMutex{};
+
   /// A helper class for running the user thread that handles the RPC 
interface.
   /// Because we only need to check the RPC server while any kernels are
   /// working, we track submission / completion events to allow the thread to
@@ -90,6 +93,9 @@ struct RPCServerTy {
 std::condition_variable CV;
 std::mutex Mutex;
 
+/// A reference to the main server's mutex.
+std::mutex &BufferMutex;
+
 /// A reference to all the RPC interfaces that the server is handling.
 llvm::ArrayRef Buffers;
 
@@ -98,9 +104,9 @@ struct RPCServerTy {
 
 /// Initialize the worker thread to run in the background.
 ServerThread(void *Buffers[], plugin::GenericDeviceTy *Devices[],
- size_t Length)
-: Running(false), NumUsers(0), CV(), Mutex(), Buffers(Buffers, Length),
-  Devices(Devices, Length) {}
+ size_t Length, std::mutex &BufferMutex)
+: Running(false), NumUsers(0), CV(), Mutex(), BufferMutex(BufferMutex),
+  Buffers(Buffers, Length), Devices(Devices, Length) {}
 
 ~ServerThread() { assert(!Running && "Thread not shut down explicitly\n"); 
}
 
diff --git a/offload/plugins-nextgen/common/src/RPC.cpp 
b/offload/plugins-nextgen/common/src/RPC.cpp
index e6750a540b391..eb305736d6264 100644
--- a/offload/plugins-nextgen/common/src/RPC.cpp
+++ b/offload/plugins-nextgen/common/src/RPC.cpp
@@ -131,6 +131,7 @@ void RPCServerTy::ServerThread::run() {
 Lock.unlock();
 while (NumUsers.load(std::memory_order_relaxed) > 0 &&
Running.load(std::memory_order_relaxed)) {
+  std::lock_guard Lock(BufferMutex);
   for (const auto &[Buffer, Device] : llvm::zip_equal(Buffers, Devices)) {
 if (!Buffer || !Device)
   continue;
@@ -149,7 +150,7 @@ RPCServerTy::RPCServerTy(plugin::GenericPluginTy &Plugin)
   Devices(std::make_unique(
   Plugin.getNumDevices())),
   Thread(new ServerThread(Buffers.get(), Devices.get(),
-  Plugin.getNumDevices())) {}
+  Plugin.getNumDevices(), BufferMutex)) {}
 
 llvm::Error RPCServerTy::startThread() {
   Thread->startThread();
@@ -190,6 +191,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy 
&Device,
   if (auto Err = Device.dataSubmit(ClientGlobal.getPtr(), &client,
sizeof(rpc::Client), nullptr))
 return Err;
+  std::lock_guard Lock(BufferMutex);
   Buffers[Device.getDeviceId()] = RPCBuffer;
   Devices[Device.getDeviceId()] = &Device;
 
@@ -197,6 +199,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy 
&Device,
 }
 
 Error RPCServerTy::deinitDevice(plugin::GenericDeviceTy &Device) {
+  std::lock_guard Lock(BufferMutex);
   Device.free(Buffers[Device.getDeviceId()], TARGET_ALLOC_HOST);
   Buffers[Device.getDeviceId()] = nullptr;
   Devices[Device.getDeviceId()] = nullptr;
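
The locking pattern in this patch can be sketched outside the offload runtime. The following is a simplified illustration under stated assumptions, not the plugin code: `SlotServer`, `Slots`, `SlotMutex`, and `liveSlotsAfterChurn` are hypothetical stand-ins for the RPC server, its buffer/device arrays, and `BufferMutex`. A polling thread and the thread that registers or frees slots take the same mutex, so a slot cannot be freed while the poller is reading it.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a server whose worker polls a shared slot array.
struct SlotServer {
  std::array<int *, 4> Slots{};    // shared between both threads
  std::mutex SlotMutex;            // guards every access to Slots
  std::atomic<bool> Running{true};
  std::atomic<long> Polls{0};

  // Worker loop: holds the mutex for one full pass over the array.
  void poll() {
    while (Running.load(std::memory_order_relaxed)) {
      {
        std::lock_guard<std::mutex> Lock(SlotMutex);
        for (int *Slot : Slots) {
          if (!Slot)
            continue;
          Polls.fetch_add(*Slot, std::memory_order_relaxed);
        }
      }
      std::this_thread::yield(); // give the other thread a chance to lock
    }
  }

  void init(std::size_t Id) {
    std::lock_guard<std::mutex> Lock(SlotMutex);
    Slots[Id] = new int(1);
  }

  // Frees and clears the slot only while the worker is not iterating.
  void deinit(std::size_t Id) {
    std::lock_guard<std::mutex> Lock(SlotMutex);
    delete Slots[Id];
    Slots[Id] = nullptr;
  }
};

// Registers and frees every slot repeatedly while the worker is polling,
// then reports how many slots are still registered.
std::size_t liveSlotsAfterChurn() {
  SlotServer Server;
  std::thread Worker([&Server] { Server.poll(); });
  for (int Round = 0; Round < 100; ++Round)
    for (std::size_t Id = 0; Id < Server.Slots.size(); ++Id) {
      Server.init(Id);
      Server.deinit(Id);
    }
  Server.Running.store(false);
  Worker.join();
  std::size_t Live = 0;
  for (int *Slot : Server.Slots)
    Live += Slot != nullptr;
  return Live;
}
```

Without the lock in `deinit`, the worker in this sketch could dereference a slot that was freed mid-pass, which is the kind of race the summary above describes.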



[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread via llvm-branch-commits

llvmbot wrote:

@jplehr What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@jhuber6 (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/126795


[llvm-branch-commits] [llvm] [llvm] Extend CallSiteInfo with TypeId (PR #87574)

2025-02-11 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574

>From 1d7ee612e408ee7e64e984eb08e6d7089a435d09 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Sun, 2 Feb 2025 00:58:49 +
Subject: [PATCH] Simplify MIR test.

Created using spr 1.3.6-beta.1
---
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 21 ++-
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir 
b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
index 5ab797bfcc18f..a99ee50a608fb 100644
--- a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
+++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
@@ -8,11 +8,6 @@
 # CHECK-NEXT: 123456789 }
 
 --- |
-  ; ModuleID = 'test.ll'
-  source_filename = "test.ll"
-  target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-  target triple = "x86_64-unknown-linux-gnu"
-  
   define dso_local void @foo(i8 signext %a) {
   entry:
 ret void
@@ -21,10 +16,10 @@
   define dso_local i32 @main() {
   entry:
 %retval = alloca i32, align 4
-%fp = alloca void (i8)*, align 8
-store i32 0, i32* %retval, align 4
-store void (i8)* @foo, void (i8)** %fp, align 8
-%0 = load void (i8)*, void (i8)** %fp, align 8
+%fp = alloca ptr, align 8
+store i32 0, ptr %retval, align 4
+store ptr @foo, ptr %fp, align 8
+%0 = load ptr, ptr %fp, align 8
 call void %0(i8 signext 97)
 ret i32 0
   }
@@ -42,12 +37,8 @@ body: |
 name:main
 tracksRegLiveness: true
 stack:
-  - { id: 0, name: retval, type: default, offset: 0, size: 4, alignment: 4, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
-  - { id: 1, name: fp, type: default, offset: 0, size: 8, alignment: 8, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
+  - { id: 0, name: retval, size: 4, alignment: 4 }
+  - { id: 1, name: fp, size: 8, alignment: 8 }
 callSites:
   - { bb: 0, offset: 6, fwdArgRegs: [], typeId: 
 123456789 }



[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126767

>From 923d35bcf76529ba3afe736d160cb6be12b63e24 Mon Sep 17 00:00:00 2001
From: higher-performance 
Date: Tue, 11 Feb 2025 01:52:13 -0500
Subject: [PATCH] Fix false positive of
 [[clang::require_explicit_initialization]] on copy/move constructors
 (#126553)

Fixes #126490

(cherry picked from commit 90192e8872cc90b4d292b180a49babf72d17e579)
---
 clang/lib/Sema/SemaInit.cpp  |  4 +++-
 clang/test/SemaCXX/uninitialized.cpp | 34 +++-
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 450edcb52ae15..37796758960cd 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -4576,7 +4576,9 @@ static void TryConstructorInitialization(Sema &S,
 if (!IsListInit &&
 (Kind.getKind() == InitializationKind::IK_Default ||
  Kind.getKind() == InitializationKind::IK_Direct) &&
-DestRecordDecl != nullptr && DestRecordDecl->isAggregate() &&
+DestRecordDecl != nullptr &&
+!(CtorDecl->isCopyOrMoveConstructor() && CtorDecl->isImplicit()) &&
+DestRecordDecl->isAggregate() &&
 DestRecordDecl->hasUninitializedExplicitInitFields()) {
   S.Diag(Kind.getLocation(), diag::warn_field_requires_explicit_init)
   << /* Var-in-Record */ 1 << DestRecordDecl;
diff --git a/clang/test/SemaCXX/uninitialized.cpp 
b/clang/test/SemaCXX/uninitialized.cpp
index 7578b288d7b3f..4af2c998f082e 100644
--- a/clang/test/SemaCXX/uninitialized.cpp
+++ b/clang/test/SemaCXX/uninitialized.cpp
@@ -1542,9 +1542,15 @@ void aggregate() {
 };
   };
 
+  struct CopyAndMove {
+CopyAndMove() = default;
+CopyAndMove(const CopyAndMove &) {}
+CopyAndMove(CopyAndMove &&) {}
+  };
   struct Embed {
 int embed1;  // #FIELD_EMBED1
 int embed2 [[clang::require_explicit_initialization]];  // #FIELD_EMBED2
+CopyAndMove force_separate_move_ctor;
   };
   struct EmbedDerived : Embed {};
   struct F {
@@ -1582,7 +1588,33 @@ void aggregate() {
   F("___"),
   F("")
   };
-  (void)ctors;
+
+  struct MoveOrCopy {
+Embed e;
+EmbedDerived ed;
+F f;
+// no-error
+MoveOrCopy(const MoveOrCopy &c) : e(c.e), ed(c.ed), f(c.f) {}
+// no-error
+MoveOrCopy(MoveOrCopy &&c)
+: e(std::move(c.e)), ed(std::move(c.ed)), f(std::move(c.f)) {}
+  };
+  F copy1(ctors[0]); // no-error
+  (void)copy1;
+  F move1(std::move(ctors[0])); // no-error
+  (void)move1;
+  F copy2{ctors[0]}; // no-error
+  (void)copy2;
+  F move2{std::move(ctors[0])}; // no-error
+  (void)move2;
+  F copy3 = ctors[0]; // no-error
+  (void)copy3;
+  F move3 = std::move(ctors[0]); // no-error
+  (void)move3;
+  F copy4 = {ctors[0]}; // no-error
+  (void)copy4;
+  F move4 = {std::move(ctors[0])}; // no-error
+  (void)move4;
 
   S::foo(S{1, 2, 3, 4});
   S::foo(S{.s1 = 100, .s4 = 100});
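
As a rough illustration of the case this patch targets, here is a minimal sketch that assumes a hypothetical aggregate `Settings` (not taken from the test file). As I read the diff, copies and moves that go through the implicit copy/move constructors no longer trigger the explicit-initialization warning, because the source object has already initialized the field.

```cpp
#include <cassert>
#include <utility>

// Hypothetical aggregate: Retries must be initialized explicitly by callers.
struct Settings {
  int Retries [[clang::require_explicit_initialization]];
  int TimeoutSeconds = 30;
};

int copiedRetries() {
  Settings Original{3};            // Retries is given explicitly here
  Settings Copy(Original);         // implicit copy constructor, direct-init
  Settings Moved(std::move(Copy)); // implicit move constructor, direct-init
  return Moved.Retries;
}
```

Compilers other than Clang are expected to ignore the unknown scoped attribute, possibly with a warning.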



[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126767


[llvm-branch-commits] [clang] 923d35b - Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: higher-performance
Date: 2025-02-11T13:22:17-08:00
New Revision: 923d35bcf76529ba3afe736d160cb6be12b63e24

URL: 
https://github.com/llvm/llvm-project/commit/923d35bcf76529ba3afe736d160cb6be12b63e24
DIFF: 
https://github.com/llvm/llvm-project/commit/923d35bcf76529ba3afe736d160cb6be12b63e24.diff

LOG: Fix false positive of [[clang::require_explicit_initialization]] on 
copy/move constructors (#126553)

Fixes #126490

(cherry picked from commit 90192e8872cc90b4d292b180a49babf72d17e579)

Added: 


Modified: 
clang/lib/Sema/SemaInit.cpp
clang/test/SemaCXX/uninitialized.cpp

Removed: 




diff  --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 450edcb52ae15..37796758960cd 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -4576,7 +4576,9 @@ static void TryConstructorInitialization(Sema &S,
 if (!IsListInit &&
 (Kind.getKind() == InitializationKind::IK_Default ||
  Kind.getKind() == InitializationKind::IK_Direct) &&
-DestRecordDecl != nullptr && DestRecordDecl->isAggregate() &&
+DestRecordDecl != nullptr &&
+!(CtorDecl->isCopyOrMoveConstructor() && CtorDecl->isImplicit()) &&
+DestRecordDecl->isAggregate() &&
 DestRecordDecl->hasUninitializedExplicitInitFields()) {
   S.Diag(Kind.getLocation(), diag::warn_field_requires_explicit_init)
   << /* Var-in-Record */ 1 << DestRecordDecl;

diff  --git a/clang/test/SemaCXX/uninitialized.cpp 
b/clang/test/SemaCXX/uninitialized.cpp
index 7578b288d7b3f..4af2c998f082e 100644
--- a/clang/test/SemaCXX/uninitialized.cpp
+++ b/clang/test/SemaCXX/uninitialized.cpp
@@ -1542,9 +1542,15 @@ void aggregate() {
 };
   };
 
+  struct CopyAndMove {
+CopyAndMove() = default;
+CopyAndMove(const CopyAndMove &) {}
+CopyAndMove(CopyAndMove &&) {}
+  };
   struct Embed {
 int embed1;  // #FIELD_EMBED1
 int embed2 [[clang::require_explicit_initialization]];  // #FIELD_EMBED2
+CopyAndMove force_separate_move_ctor;
   };
   struct EmbedDerived : Embed {};
   struct F {
@@ -1582,7 +1588,33 @@ void aggregate() {
   F("___"),
   F("")
   };
-  (void)ctors;
+
+  struct MoveOrCopy {
+Embed e;
+EmbedDerived ed;
+F f;
+// no-error
+MoveOrCopy(const MoveOrCopy &c) : e(c.e), ed(c.ed), f(c.f) {}
+// no-error
+MoveOrCopy(MoveOrCopy &&c)
+: e(std::move(c.e)), ed(std::move(c.ed)), f(std::move(c.f)) {}
+  };
+  F copy1(ctors[0]); // no-error
+  (void)copy1;
+  F move1(std::move(ctors[0])); // no-error
+  (void)move1;
+  F copy2{ctors[0]}; // no-error
+  (void)copy2;
+  F move2{std::move(ctors[0])}; // no-error
+  (void)move2;
+  F copy3 = ctors[0]; // no-error
+  (void)copy3;
+  F move3 = std::move(ctors[0]); // no-error
+  (void)move3;
+  F copy4 = {ctors[0]}; // no-error
+  (void)copy4;
+  F move4 = {std::move(ctors[0])}; // no-error
+  (void)move4;
 
   S::foo(S{1, 2, 3, 4});
   S::foo(S{.s1 = 100, .s4 = 100});





[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@higher-performance (or anyone else): if you would like to add a note about 
this fix to the release notes (completely optional), please reply to this 
comment with a one- or two-sentence description of the fix. When you are done, 
please add the release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/126767


[llvm-branch-commits] [llvm] [llvm] Add option to emit `callgraph` section (PR #87574)

2025-02-11 Thread via llvm-branch-commits

https://github.com/Prabhuk edited 
https://github.com/llvm/llvm-project/pull/87574


[llvm-branch-commits] [llvm] Add option to emit call graph section (PR #87572)

2025-02-11 Thread via llvm-branch-commits

https://github.com/Prabhuk closed 
https://github.com/llvm/llvm-project/pull/87572


[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126685


[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126685

>From ac97cff5a3684be98f4863191f0006cdf0fa89b4 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Tue, 11 Feb 2025 14:08:47 +0800
Subject: [PATCH] [C++20] [Modules] Don't diagnose duplicated declarations in
 different modules which is not in file scope

Close https://github.com/llvm/llvm-project/issues/126373

Although the root problem is that we shouldn't place the friend
declaration in the incorrect module, let's limit the damage by no longer
diagnosing declarations that are not in file scope.

(cherry picked from commit 569e94f8f1c3e6998860e2b2ff577870433bdac9)
---
 clang/lib/Serialization/ASTReaderDecl.cpp |  7 +
 clang/test/Modules/pr126373.cppm  | 34 +++
 2 files changed, 41 insertions(+)
 create mode 100644 clang/test/Modules/pr126373.cppm

diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index 1aa94d5a22abe..8fbb0a8d3edd8 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -3751,6 +3751,13 @@ void 
ASTDeclReader::checkMultipleDefinitionInNamedModules(ASTReader &Reader,
   if (D->getFriendObjectKind() || Previous->getFriendObjectKind())
 return;
 
+  // Skip diagnosing in-class declarations.
+  if (!Previous->getLexicalDeclContext()
+   ->getNonTransparentContext()
+   ->isFileContext() ||
+  !D->getLexicalDeclContext()->getNonTransparentContext()->isFileContext())
+return;
+
   Module *M = Previous->getOwningModule();
   if (!M)
 return;
diff --git a/clang/test/Modules/pr126373.cppm b/clang/test/Modules/pr126373.cppm
new file mode 100644
index 0..f176a587b51ce
--- /dev/null
+++ b/clang/test/Modules/pr126373.cppm
@@ -0,0 +1,34 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/module1.cppm -emit-module-interface -o 
%t/module1.pcm
+// RUN: %clang_cc1 -std=c++20 -fmodule-file=module1=%t/module1.pcm  
%t/module2.cppm \
+// RUN: -emit-module-interface -o %t/module2.pcm
+// RUN: %clang_cc1 -std=c++20 %t/module2.pcm 
-fmodule-file=module1=%t/module1.pcm \
+// RUN: -emit-llvm -o - | FileCheck %t/module2.cppm
+
+//--- test.h
+template
+struct Test {
+  template
+  friend class Test;
+};
+
+//--- module1.cppm
+module;
+#include "test.h"
+export module module1;
+export void f1(Test) {}
+
+//--- module2.cppm
+module;
+#include "test.h"
+export module module2;
+import module1;
+export void f2(Test) {}
+
+extern "C" void func() {}
+
+// Fine enough to check the IR is emitted correctly.
+// CHECK: define{{.*}}@func



[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ChuanqiXu9 (or anyone else): if you would like to add a note about this fix in
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126685


[llvm-branch-commits] [llvm] [llvm] Extract and propagate indirect call type id (PR #87575)

2025-02-11 Thread via llvm-branch-commits


@@ -3631,6 +3631,12 @@ bool X86FastISel::fastLowerCall(CallLoweringInfo &CLI) {
   CLI.NumResultRegs = RVLocs.size();
   CLI.Call = MIB;
 
+  // Add call site info for call graph section.
+  if (TM.Options.EmitCallGraphSection && CB && CB->isIndirectCall()) {
+MachineFunction::CallSiteInfo CSInfo(*CB);
+MF->addCallSiteInfo(CLI.Call, std::move(CSInfo));

Prabhuk wrote:

The only other location that doesn't use `std::move` is in
`llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp`, where the temporary
returned from `DAG->getCallSiteInfo(Node)` is passed directly to
`addCallSiteInfo`. Adding `std::move` there would prevent copy elision.
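
The point about copy elision can be seen in isolation with a small sketch. It
assumes a hypothetical by-value sink; `Info`, `makeInfo`, `sink`, and
`MoveCount` are illustrative names, and the sketch does not claim to match the
actual signature of `addCallSiteInfo`.

```cpp
#include <cassert>
#include <utility>

// Global counter of move constructions (illustrative only).
int MoveCount = 0;

// Hypothetical stand-in for a call-site-info payload.
struct Info {
  Info() = default;
  Info(Info &&) noexcept { ++MoveCount; }
};

// Returns a prvalue, like a getter that hands back a temporary.
Info makeInfo() { return Info(); }

// Hypothetical sink that takes its argument by value.
void sink(Info) {}

// Passing the temporary directly: the parameter is initialized in place.
int movesWhenPassingTemporary() {
  MoveCount = 0;
  sink(makeInfo());
  return MoveCount;
}

// Wrapping the temporary in std::move: it is materialized first, then the
// parameter is move-constructed from it.
int movesWhenWrappingInStdMove() {
  MoveCount = 0;
  sink(std::move(makeInfo()));
  return MoveCount;
}
```

Under C++17 guaranteed elision the first helper performs no move, while the
second performs one. If the sink takes an rvalue reference instead, the extra
`std::move` on a temporary is merely redundant.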

https://github.com/llvm/llvm-project/pull/87575


[llvm-branch-commits] [clang] f33b128 - [AVX10.2] Fix wrong intrinsic names after rename (#126390)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Mikołaj Piróg
Date: 2025-02-11T14:09:54-08:00
New Revision: f33b128b3dc147a973cef55222549345b3201ad5

URL: 
https://github.com/llvm/llvm-project/commit/f33b128b3dc147a973cef55222549345b3201ad5
DIFF: 
https://github.com/llvm/llvm-project/commit/f33b128b3dc147a973cef55222549345b3201ad5.diff

LOG: [AVX10.2] Fix wrong intrinsic names after rename  (#126390)

In my previous PR (#123656) to update the names of AVX10.2 intrinsics
and mnemonics, I erroneously deleted `_ph` from a few intrinsics.
This PR corrects this.

(cherry picked from commit 161cfc6f39bef8994eb944687033ebd3570196e8)

Added: 


Modified: 
clang/lib/Headers/avx10_2_512convertintrin.h
clang/lib/Headers/avx10_2convertintrin.h
clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
clang/test/CodeGen/X86/avx10_2convert-builtins.c

Removed: 




diff  --git a/clang/lib/Headers/avx10_2_512convertintrin.h 
b/clang/lib/Headers/avx10_2_512convertintrin.h
index 0b5fca5cda522..516ccc68672d6 100644
--- a/clang/lib/Headers/avx10_2_512convertintrin.h
+++ b/clang/lib/Headers/avx10_2_512convertintrin.h
@@ -213,19 +213,19 @@ _mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, 
__m512h __B) {
   (__v64qi)(__m512i)_mm512_setzero_si512());
 }
 
-static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8(__m256i __A) {
+static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8_ph(__m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)_mm512_undefined_ph(), (__mmask32)-1);
 }
 
 static __inline__ __m512h __DEFAULT_FN_ATTRS512
-_mm512_mask_cvthf8(__m512h __W, __mmask32 __U, __m256i __A) {
+_mm512_mask_cvthf8_ph(__m512h __W, __mmask32 __U, __m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)__W, (__mmask32)__U);
 }
 
 static __inline__ __m512h __DEFAULT_FN_ATTRS512
-_mm512_maskz_cvthf8(__mmask32 __U, __m256i __A) {
+_mm512_maskz_cvthf8_ph(__mmask32 __U, __m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)_mm512_setzero_ph(), (__mmask32)__U);
 }

diff  --git a/clang/lib/Headers/avx10_2convertintrin.h 
b/clang/lib/Headers/avx10_2convertintrin.h
index 79d9def2207b8..07722090c30ee 100644
--- a/clang/lib/Headers/avx10_2convertintrin.h
+++ b/clang/lib/Headers/avx10_2convertintrin.h
@@ -381,37 +381,36 @@ _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, 
__m256h __B) {
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8(__m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8_ph(__m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)_mm_undefined_ph(), (__mmask8)-1);
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_mask_cvthf8(__m128h __W,
-__mmask8 __U,
-__m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128
+_mm_mask_cvthf8_ph(__m128h __W, __mmask8 __U, __m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)__W, (__mmask8)__U);
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_maskz_cvthf8(__mmask8 __U,
- __m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128
+_mm_maskz_cvthf8_ph(__mmask8 __U, __m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)_mm_setzero_ph(), (__mmask8)__U);
 }
 
-static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8(__m128i __A) {
+static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8_ph(__m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)_mm256_undefined_ph(), (__mmask16)-1);
 }
 
 static __inline__ __m256h __DEFAULT_FN_ATTRS256
-_mm256_mask_cvthf8(__m256h __W, __mmask16 __U, __m128i __A) {
+_mm256_mask_cvthf8_ph(__m256h __W, __mmask16 __U, __m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)__W, (__mmask16)__U);
 }
 
 static __inline__ __m256h __DEFAULT_FN_ATTRS256
-_mm256_maskz_cvthf8(__mmask16 __U, __m128i __A) {
+_mm256_maskz_cvthf8_ph(__mmask16 __U, __m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)_mm256_setzero_ph(), (__mmask16)__U);
 }

diff  --git a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c 
b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
index 22503c640a727..dcf7bbc005a7c 100644
--- a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
+++ b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
@@ -201,22 +201,22 @@ __m512i test_mm512_maskz_cvts2ph_hf8(__mmask64 __U, 
__m512h __A, __m512

[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126687

>From f33b128b3dc147a973cef55222549345b3201ad5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miko=C5=82aj=20Pir=C3=B3g?= 
Date: Mon, 10 Feb 2025 05:48:02 +0100
Subject: [PATCH] [AVX10.2] Fix wrong intrinsic names after rename  (#126390)

In my previous PR (#123656) to update the names of AVX10.2 intrinsics
and mnemonics, I erroneously deleted `_ph` from a few intrinsics.
This PR corrects this.

(cherry picked from commit 161cfc6f39bef8994eb944687033ebd3570196e8)
---
 clang/lib/Headers/avx10_2_512convertintrin.h  |  6 ++--
 clang/lib/Headers/avx10_2convertintrin.h  | 17 +
 .../CodeGen/X86/avx10_2_512convert-builtins.c | 18 +-
 .../CodeGen/X86/avx10_2convert-builtins.c | 36 +--
 4 files changed, 38 insertions(+), 39 deletions(-)

diff --git a/clang/lib/Headers/avx10_2_512convertintrin.h 
b/clang/lib/Headers/avx10_2_512convertintrin.h
index 0b5fca5cda522..516ccc68672d6 100644
--- a/clang/lib/Headers/avx10_2_512convertintrin.h
+++ b/clang/lib/Headers/avx10_2_512convertintrin.h
@@ -213,19 +213,19 @@ _mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, 
__m512h __B) {
   (__v64qi)(__m512i)_mm512_setzero_si512());
 }
 
-static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8(__m256i __A) {
+static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8_ph(__m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)_mm512_undefined_ph(), (__mmask32)-1);
 }
 
 static __inline__ __m512h __DEFAULT_FN_ATTRS512
-_mm512_mask_cvthf8(__m512h __W, __mmask32 __U, __m256i __A) {
+_mm512_mask_cvthf8_ph(__m512h __W, __mmask32 __U, __m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)__W, (__mmask32)__U);
 }
 
 static __inline__ __m512h __DEFAULT_FN_ATTRS512
-_mm512_maskz_cvthf8(__mmask32 __U, __m256i __A) {
+_mm512_maskz_cvthf8_ph(__mmask32 __U, __m256i __A) {
   return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask(
   (__v32qi)__A, (__v32hf)(__m512h)_mm512_setzero_ph(), (__mmask32)__U);
 }
diff --git a/clang/lib/Headers/avx10_2convertintrin.h 
b/clang/lib/Headers/avx10_2convertintrin.h
index 79d9def2207b8..07722090c30ee 100644
--- a/clang/lib/Headers/avx10_2convertintrin.h
+++ b/clang/lib/Headers/avx10_2convertintrin.h
@@ -381,37 +381,36 @@ _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, 
__m256h __B) {
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8(__m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8_ph(__m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)_mm_undefined_ph(), (__mmask8)-1);
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_mask_cvthf8(__m128h __W,
-__mmask8 __U,
-__m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128
+_mm_mask_cvthf8_ph(__m128h __W, __mmask8 __U, __m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)__W, (__mmask8)__U);
 }
 
-static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_maskz_cvthf8(__mmask8 __U,
- __m128i __A) {
+static __inline__ __m128h __DEFAULT_FN_ATTRS128
+_mm_maskz_cvthf8_ph(__mmask8 __U, __m128i __A) {
   return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask(
   (__v16qi)__A, (__v8hf)(__m128h)_mm_setzero_ph(), (__mmask8)__U);
 }
 
-static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8(__m128i __A) {
+static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8_ph(__m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)_mm256_undefined_ph(), (__mmask16)-1);
 }
 
 static __inline__ __m256h __DEFAULT_FN_ATTRS256
-_mm256_mask_cvthf8(__m256h __W, __mmask16 __U, __m128i __A) {
+_mm256_mask_cvthf8_ph(__m256h __W, __mmask16 __U, __m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)__W, (__mmask16)__U);
 }
 
 static __inline__ __m256h __DEFAULT_FN_ATTRS256
-_mm256_maskz_cvthf8(__mmask16 __U, __m128i __A) {
+_mm256_maskz_cvthf8_ph(__mmask16 __U, __m128i __A) {
   return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask(
   (__v16qi)__A, (__v16hf)(__m256h)_mm256_setzero_ph(), (__mmask16)__U);
 }
diff --git a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c 
b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
index 22503c640a727..dcf7bbc005a7c 100644
--- a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
+++ b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c
@@ -201,22 +201,22 @@ __m512i test_mm512_maskz_cvts2ph_hf8(__mmask64 __U, 
__m512h __A, __m512h __B) {
   return _mm512_maskz_cvts2ph_hf8(__U, __A, __B);
 }
 
-__m

[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@phoebewang (or anyone else): if you would like to add a note about this fix in
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/12


[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126492

>From 9bbf3a98b793f8fc6269a20a026ca6fe029a1790 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Fri, 7 Feb 2025 12:41:06 +0100
Subject: [PATCH 1/2] [IndVars] Add test for #126012 (NFC)

(cherry picked from commit ae08969a2068dd327fbf4d0f606550574fbb9e45)
---
 .../Transforms/IndVarSimplify/pr126012.ll | 49 +++
 1 file changed, 49 insertions(+)
 create mode 100644 llvm/test/Transforms/IndVarSimplify/pr126012.ll

diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll 
b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
new file mode 100644
index 0..725ea89b8e651
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=indvars < %s | FileCheck %s
+
+; FIXME: This is a miscompile.
+define i32 @test() {
+; CHECK-LABEL: define i32 @test() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:br label %[[FOR_PREHEADER:.*]]
+; CHECK:   [[FOR_PREHEADER]]:
+; CHECK-NEXT:[[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ]
+; CHECK-NEXT:[[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:[[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0
+; CHECK-NEXT:br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
+; CHECK:   [[FOR_END]]:
+; CHECK-NEXT:[[EXT:%.*]] = zext i1 true to i32
+; CHECK-NEXT:br label %[[FOR_INC]]
+; CHECK:   [[FOR_INC]]:
+; CHECK-NEXT:[[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ]
+; CHECK-NEXT:[[INC]] = add nuw nsw i32 [[INDVAR3]], 1
+; CHECK-NEXT:[[EXITCOND:%.*]] = icmp eq i32 [[INDVAR3]], 2
+; CHECK-NEXT:br i1 [[EXITCOND]], label %[[FOR_EXIT:.*]], label %[[FOR_PREHEADER]]
+; CHECK:   [[FOR_EXIT]]:
+; CHECK-NEXT:[[INDVAR1_LCSSA:%.*]] = phi i32 [ [[INDVAR1]], %[[FOR_INC]] ]
+; CHECK-NEXT:ret i32 [[INDVAR1_LCSSA]]
+;
+entry:
+  br label %for.preheader
+
+for.preheader:
+  %indvar1 = phi i32 [ 0, %entry ], [ %phi, %for.inc ]
+  %indvar2 = phi i32 [ 1, %entry ], [ %indvar3, %for.inc ]
+  %indvar3 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
+  %cond1 = icmp eq i32 %indvar3, 0
+  br i1 %cond1, label %for.inc, label %for.end
+
+for.end:
+  %cmp = icmp sgt i32 %indvar2, 0
+  %ext = zext i1 %cmp to i32
+  br label %for.inc
+
+for.inc:
+  %phi = phi i32 [ %ext, %for.end ], [ 0, %for.preheader ]
+  %inc = add i32 %indvar3, 1
+  %exitcond = icmp eq i32 %indvar3, 2
+  br i1 %exitcond, label %for.exit, label %for.preheader
+
+for.exit:
+  ret i32 %indvar1
+}

>From af970cd8753c37e7fcf66b6211f2a2d1e261325c Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Mon, 10 Feb 2025 10:07:21 +0100
Subject: [PATCH 2/2] [ScalarEvolution] Handle addrec incoming value in
 isImpliedViaMerge() (#126236)

The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.

Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).

I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other contexts,
e.g. when deciding whether an expression is safe to expand in a given
block.

Fixes https://github.com/llvm/llvm-project/issues/126012.

(cherry picked from commit 7aed53eb1982113e825534f0f66d0a0e46e7a5ed)
---
 llvm/lib/Analysis/ScalarEvolution.cpp   |  6 ++
 llvm/test/Transforms/IndVarSimplify/pr126012.ll | 10 +++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp 
b/llvm/lib/Analysis/ScalarEvolution.cpp
index 2ce40877b523e..c71202c8dd58e 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -12402,6 +12402,12 @@ bool ScalarEvolution::isImpliedViaMerge(CmpPredicate 
Pred, const SCEV *LHS,
   // iteration of a loop.
   if (!properlyDominates(L, LBB))
 return false;
+  // Addrecs are considered to properly dominate their loop, so are missed
+  // by the previous check. Discard any values that have computable
+  // evolution in this loop.
+  if (auto *Loop = LI.getLoopFor(LBB))
+if (hasComputableLoopEvolution(L, Loop))
+  return false;
   if (!ProvedEasily(L, RHS))
 return false;
 }
diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll 
b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
index 725ea89b8e651..5189fe020dd3b 100644
--- a/llvm/test/Transforms/IndVarSimplify/pr126012.ll
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -1,18 +1,22 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
 ; RUN: opt 

[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126479

>From bc87f9b80946dfe651d953c2fb4967ea32277a34 Mon Sep 17 00:00:00 2001
From: Owen Pan 
Date: Sat, 8 Feb 2025 23:22:33 -0800
Subject: [PATCH] [clang-format] Handle C-style cast of member function pointer
 type (#126340)

Fixes #125012.

(cherry picked from commit 8d373ceaec1f1b27c9e682cfaf71aae19ea48d98)
---
 clang/lib/Format/TokenAnnotator.cpp   | 7 +--
 clang/unittests/Format/TokenAnnotatorTest.cpp | 6 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index a172df5291ae6..4246ade6e19be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -477,8 +477,9 @@ class AnnotatingParser {
 FormatToken *PossibleObjCForInToken = nullptr;
 while (CurrentToken) {
   const auto &Prev = *CurrentToken->Previous;
+  const auto *PrevPrev = Prev.Previous;
   if (Prev.is(TT_PointerOrReference) &&
-  Prev.Previous->isOneOf(tok::l_paren, tok::coloncolon)) {
+  PrevPrev->isOneOf(tok::l_paren, tok::coloncolon)) {
 ProbablyFunctionType = true;
   }
   if (CurrentToken->is(tok::comma))
@@ -486,8 +487,10 @@ class AnnotatingParser {
   if (Prev.is(TT_BinaryOperator))
 Contexts.back().IsExpression = true;
   if (CurrentToken->is(tok::r_paren)) {
-if (Prev.is(TT_PointerOrReference) && Prev.Previous == &OpeningParen)
+if (Prev.is(TT_PointerOrReference) &&
+(PrevPrev == &OpeningParen || PrevPrev->is(tok::coloncolon))) {
   MightBeFunctionType = true;
+}
 if (OpeningParen.isNot(TT_CppCastLParen) && MightBeFunctionType &&
 ProbablyFunctionType && CurrentToken->Next &&
 (CurrentToken->Next->is(tok::l_paren) ||
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp 
b/clang/unittests/Format/TokenAnnotatorTest.cpp
index fc77e277947c5..2147a1b950dd1 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -845,6 +845,12 @@ TEST_F(TokenAnnotatorTest, UnderstandsCasts) {
   EXPECT_TOKEN(Tokens[14], tok::r_paren, TT_CastRParen);
   EXPECT_TOKEN(Tokens[15], tok::amp, TT_UnaryOperator);
 
+  Tokens = annotate("return (Foo (Bar::*)())&Bar::foo;");
+  ASSERT_EQ(Tokens.size(), 17u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::l_paren, TT_FunctionTypeLParen);
+  EXPECT_TOKEN(Tokens[10], tok::r_paren, TT_CastRParen);
+  EXPECT_TOKEN(Tokens[11], tok::amp, TT_UnaryOperator);
+
   auto Style = getLLVMStyle();
   Style.TypeNames.push_back("Foo");
   Tokens = annotate("#define FOO(bar) foo((Foo)&bar)", Style);



[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126479


[llvm-branch-commits] [clang] bc87f9b - [clang-format] Handle C-style cast of member function pointer type (#126340)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Owen Pan
Date: 2025-02-11T14:20:03-08:00
New Revision: bc87f9b80946dfe651d953c2fb4967ea32277a34

URL: 
https://github.com/llvm/llvm-project/commit/bc87f9b80946dfe651d953c2fb4967ea32277a34
DIFF: 
https://github.com/llvm/llvm-project/commit/bc87f9b80946dfe651d953c2fb4967ea32277a34.diff

LOG: [clang-format] Handle C-style cast of member function pointer type 
(#126340)

Fixes #125012.
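
For context, the construct named in the log line looks like this in ordinary
C++. This is an illustrative sketch only: `Foo`, `Bar`, and `callThroughCast`
are hypothetical names that echo the unit-test string, and the overload is
added here just to show why such a cast might be written.

```cpp
#include <cassert>

struct Foo {
  int Value;
};

struct Bar {
  Foo foo() { return Foo{42}; }      // zero-argument overload
  Foo foo(int X) { return Foo{X}; }  // one-argument overload
};

// The C-style cast to a member-function-pointer type picks the zero-argument
// overload of Bar::foo; this is the shape clang-format has to annotate.
Foo callThroughCast() {
  Foo (Bar::*Fn)() = (Foo (Bar::*)())&Bar::foo;
  Bar B;
  return (B.*Fn)();
}
```

Here the `&` after the closing parenthesis is a unary address-of operator, not
a binary `&`, which matches the token annotations the new unit test expects.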

(cherry picked from commit 8d373ceaec1f1b27c9e682cfaf71aae19ea48d98)

Added: 


Modified: 
clang/lib/Format/TokenAnnotator.cpp
clang/unittests/Format/TokenAnnotatorTest.cpp

Removed: 




diff  --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index a172df5291ae6..4246ade6e19be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -477,8 +477,9 @@ class AnnotatingParser {
 FormatToken *PossibleObjCForInToken = nullptr;
 while (CurrentToken) {
   const auto &Prev = *CurrentToken->Previous;
+  const auto *PrevPrev = Prev.Previous;
   if (Prev.is(TT_PointerOrReference) &&
-  Prev.Previous->isOneOf(tok::l_paren, tok::coloncolon)) {
+  PrevPrev->isOneOf(tok::l_paren, tok::coloncolon)) {
 ProbablyFunctionType = true;
   }
   if (CurrentToken->is(tok::comma))
@@ -486,8 +487,10 @@ class AnnotatingParser {
   if (Prev.is(TT_BinaryOperator))
 Contexts.back().IsExpression = true;
   if (CurrentToken->is(tok::r_paren)) {
-if (Prev.is(TT_PointerOrReference) && Prev.Previous == &OpeningParen)
+if (Prev.is(TT_PointerOrReference) &&
+(PrevPrev == &OpeningParen || PrevPrev->is(tok::coloncolon))) {
   MightBeFunctionType = true;
+}
 if (OpeningParen.isNot(TT_CppCastLParen) && MightBeFunctionType &&
 ProbablyFunctionType && CurrentToken->Next &&
 (CurrentToken->Next->is(tok::l_paren) ||

diff  --git a/clang/unittests/Format/TokenAnnotatorTest.cpp 
b/clang/unittests/Format/TokenAnnotatorTest.cpp
index fc77e277947c5..2147a1b950dd1 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -845,6 +845,12 @@ TEST_F(TokenAnnotatorTest, UnderstandsCasts) {
   EXPECT_TOKEN(Tokens[14], tok::r_paren, TT_CastRParen);
   EXPECT_TOKEN(Tokens[15], tok::amp, TT_UnaryOperator);
 
+  Tokens = annotate("return (Foo (Bar::*)())&Bar::foo;");
+  ASSERT_EQ(Tokens.size(), 17u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::l_paren, TT_FunctionTypeLParen);
+  EXPECT_TOKEN(Tokens[10], tok::r_paren, TT_CastRParen);
+  EXPECT_TOKEN(Tokens[11], tok::amp, TT_UnaryOperator);
+
   auto Style = getLLVMStyle();
   Style.TypeNames.push_back("Foo");
   Tokens = annotate("#define FOO(bar) foo((Foo)&bar)", Style);





[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@owenca (or anyone else): if you would like to add a note about this fix in the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126479


[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126607


[llvm-branch-commits] [llvm] d43a971 - release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (#126607)

2025-02-11 Thread via llvm-branch-commits

Author: Amr Hesham
Date: 2025-02-11T14:22:01-08:00
New Revision: d43a97163c43d3cfbfc7c11287aea2233bc7ffb4

URL: 
https://github.com/llvm/llvm-project/commit/d43a97163c43d3cfbfc7c11287aea2233bc7ffb4
DIFF: 
https://github.com/llvm/llvm-project/commit/d43a97163c43d3cfbfc7c11287aea2233bc7ffb4.diff

LOG: release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when 
dump-section output path doesn't exist #125345 (#126607)

Add a release note for the llvm-objcopy fix in #125345, which corrected the
wrong path being printed when the dump-section output path doesn't exist.

Added: 


Modified: 
llvm/docs/ReleaseNotes.md

Removed: 




diff  --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 44a0b17d6a07b..28908490b8f7c 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -460,6 +460,8 @@ Changes to the LLVM tools
   `--localize-symbol`, `--localize-symbols`,
   `--skip-symbol`, `--skip-symbols`.
 
+* llvm-objcopy now prints the correct file path in the error message when the output file specified by `--dump-section` cannot be opened.
+
 Changes to LLDB
 -
 





[llvm-branch-commits] [clang] ac97cff - [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Chuanqi Xu
Date: 2025-02-11T14:05:14-08:00
New Revision: ac97cff5a3684be98f4863191f0006cdf0fa89b4

URL: 
https://github.com/llvm/llvm-project/commit/ac97cff5a3684be98f4863191f0006cdf0fa89b4
DIFF: 
https://github.com/llvm/llvm-project/commit/ac97cff5a3684be98f4863191f0006cdf0fa89b4.diff

LOG: [C++20] [Modules] Don't diagnose duplicated declarations in different 
modules which is not in file scope

Close https://github.com/llvm/llvm-project/issues/126373

Although the root problem is that we shouldn't place the friend
declaration in the incorrect module, let's limit the damage by no longer
diagnosing declarations that are not in file scope.

(cherry picked from commit 569e94f8f1c3e6998860e2b2ff577870433bdac9)

Added: 
clang/test/Modules/pr126373.cppm

Modified: 
clang/lib/Serialization/ASTReaderDecl.cpp

Removed: 




diff  --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index 1aa94d5a22abe..8fbb0a8d3edd8 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -3751,6 +3751,13 @@ void 
ASTDeclReader::checkMultipleDefinitionInNamedModules(ASTReader &Reader,
   if (D->getFriendObjectKind() || Previous->getFriendObjectKind())
 return;
 
+  // Skip diagnosing in-class declarations.
+  if (!Previous->getLexicalDeclContext()
+   ->getNonTransparentContext()
+   ->isFileContext() ||
+  !D->getLexicalDeclContext()->getNonTransparentContext()->isFileContext())
+return;
+
   Module *M = Previous->getOwningModule();
   if (!M)
 return;

diff  --git a/clang/test/Modules/pr126373.cppm 
b/clang/test/Modules/pr126373.cppm
new file mode 100644
index 0..f176a587b51ce
--- /dev/null
+++ b/clang/test/Modules/pr126373.cppm
@@ -0,0 +1,34 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/module1.cppm -emit-module-interface -o %t/module1.pcm
+// RUN: %clang_cc1 -std=c++20 -fmodule-file=module1=%t/module1.pcm  %t/module2.cppm \
+// RUN: -emit-module-interface -o %t/module2.pcm
+// RUN: %clang_cc1 -std=c++20 %t/module2.pcm -fmodule-file=module1=%t/module1.pcm \
+// RUN: -emit-llvm -o - | FileCheck %t/module2.cppm
+
+//--- test.h
+template
+struct Test {
+  template
+  friend class Test;
+};
+
+//--- module1.cppm
+module;
+#include "test.h"
+export module module1;
+export void f1(Test) {}
+
+//--- module2.cppm
+module;
+#include "test.h"
+export module module2;
+import module1;
+export void f2(Test) {}
+
+extern "C" void func() {}
+
+// Fine enough to check the IR is emitted correctly.
+// CHECK: define{{.*}}@func





[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/12

>From 1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miko=C5=82aj=20Pir=C3=B3g?= 
Date: Tue, 11 Feb 2025 06:13:36 +0100
Subject: [PATCH] [AVX10.2] Fix wrong mask casting in some convert intrinsics
 (#126627)

Found during work on #120927. This caused the compiler to silently
ignore half of the mask in the affected intrinsics.
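
The effect of the narrowing cast can be shown in isolation. This is a minimal
sketch, not the header's code: `Mask16`, `Mask32`, `countSelectedLanes`,
`lanesWithNarrowCast`, and `lanesWithFullCast` are hypothetical names standing
in for the `__mmask16`/`__mmask32` casts in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two mask widths.
using Mask16 = std::uint16_t;
using Mask32 = std::uint32_t;

// Count how many of the 32 byte lanes a mask selects.
int countSelectedLanes(Mask32 Mask) {
  int Count = 0;
  for (; Mask != 0; Mask >>= 1)
    Count += static_cast<int>(Mask & 1u);
  return Count;
}

// Mirrors the old cast: narrowing to 16 bits discards the bits for lanes 16..31.
int lanesWithNarrowCast(Mask32 U) {
  return countSelectedLanes(static_cast<Mask16>(U));
}

// Mirrors the fixed cast: all 32 mask bits survive.
int lanesWithFullCast(Mask32 U) {
  return countSelectedLanes(static_cast<Mask32>(U));
}
```

With a mask such as `0xFFFF0001`, the narrow cast keeps only the single low
lane, while the full-width cast keeps all 17 selected lanes.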

(cherry picked from commit af522c5dd3a38cc5e11e8e62009d7dbe2cde2d86)
---
 clang/lib/Headers/avx10_2convertintrin.h | 16 
 clang/test/CodeGen/X86/avx10_2convert-builtins.c | 16 
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/clang/lib/Headers/avx10_2convertintrin.h 
b/clang/lib/Headers/avx10_2convertintrin.h
index c67a5b890f195..79d9def2207b8 100644
--- a/clang/lib/Headers/avx10_2convertintrin.h
+++ b/clang/lib/Headers/avx10_2convertintrin.h
@@ -260,13 +260,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 
_mm256_cvt2ph_bf8(__m256h __A,
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvt2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -297,13 +297,13 @@ _mm256_cvts2ph_bf8(__m256h __A, __m256h __B) {
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvts2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvts2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -334,13 +334,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 
_mm256_cvt2ph_hf8(__m256h __A,
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvt2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvt2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -371,13 +371,13 @@ _mm256_cvts2ph_hf8(__m256h __A, __m256h __B) {
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvts2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
diff --git a/clang/test/CodeGen/X86/avx10_2convert-builtins.c b/clang/test/CodeGen/X86/avx10_2convert-builtins.c
index efd9a31c40875..e5e6f867e119e 100644
--- a/clang/test/CodeGen/X86/avx10_2convert-builtins.c
+++ b/clang/test/CodeGen/X86/avx10_2convert-builtins.c
@@ -231,7 +231,7 @@ __m256i test_mm256_cvt2ph_bf8(__m256h __A, __m256h __B) {
   return _mm256_cvt2ph_bf8(__A, __B);
 }
 
-__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m256h __B) {
+__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   // CHECK-LABEL: @test_mm256_mask_cvt2ph_bf8(
   // CHECK: call <32 x i8> @llvm.x86.avx10.vcvt2ph2bf8256(
   // CHECK: select <32 x i1> %{{.*}}, <32 x i8> %{{.*}}, <32 x i8> %{{.*}}
@@ -239,7 +239,7 @@ __m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m2
   return _mm256_mask_cvt2ph_bf8(__W, __U, __A, __B);
 }
 
-__m256i test_mm256_maskz_cvt2ph_bf8(__mmask16 __U, __m256h __A, __m25

[llvm-branch-commits] [clang] 1c36697 - [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Mikołaj Piróg
Date: 2025-02-11T14:07:33-08:00
New Revision: 1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f

URL: https://github.com/llvm/llvm-project/commit/1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f
DIFF: https://github.com/llvm/llvm-project/commit/1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f.diff

LOG: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627)

Found during work on #120927. This caused the compiler to silently ignore
half of the mask in the affected intrinsics.
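
A minimal sketch of why the narrower cast loses mask bits, assuming plain
fixed-width integers as stand-ins for the mask types. `Mask32`, `Mask16`, and
both helper functions are hypothetical names used only for illustration; they
are not part of the header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for __mmask32 and __mmask16.
using Mask32 = std::uint32_t;
using Mask16 = std::uint16_t;

// Mimics the old code path: the 32-bit mask passes through a 16-bit cast,
// so only bits 0..15 survive.
constexpr Mask32 castThroughMask16(Mask32 U) { return static_cast<Mask16>(U); }

// Mimics the fixed code path: the mask keeps all 32 bits.
constexpr Mask32 castThroughMask32(Mask32 U) { return static_cast<Mask32>(U); }

// A mask that selects only elements 16..31 is wiped out by the 16-bit cast.
static_assert(castThroughMask16(0xFFFF0000u) == 0u, "upper half is dropped");
static_assert(castThroughMask32(0xFFFF0000u) == 0xFFFF0000u, "mask preserved");
```

With a 32-element byte vector, each mask bit selects one element, so losing
bits 16..31 means the upper 16 elements never take the converted value.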

(cherry picked from commit af522c5dd3a38cc5e11e8e62009d7dbe2cde2d86)

Added: 


Modified: 
clang/lib/Headers/avx10_2convertintrin.h
clang/test/CodeGen/X86/avx10_2convert-builtins.c

Removed: 




diff  --git a/clang/lib/Headers/avx10_2convertintrin.h b/clang/lib/Headers/avx10_2convertintrin.h
index c67a5b890f195..79d9def2207b8 100644
--- a/clang/lib/Headers/avx10_2convertintrin.h
+++ b/clang/lib/Headers/avx10_2convertintrin.h
@@ -260,13 +260,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_bf8(__m256h __A,
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvt2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -297,13 +297,13 @@ _mm256_cvts2ph_bf8(__m256h __A, __m256h __B) {
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvts2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvts2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -334,13 +334,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_hf8(__m256h __A,
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvt2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvt2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 
@@ -371,13 +371,13 @@ _mm256_cvts2ph_hf8(__m256h __A, __m256h __B) {
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_mask_cvts2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W);
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W);
 }
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256
 _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) {
   return (__m256i)__builtin_ia32_selectb_256(
-  (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B),
+  (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B),
   (__v32qi)(__m256i)_mm256_setzero_si256());
 }
 

diff  --git a/clang/test/CodeGen/X86/avx10_2convert-builtins.c b/clang/test/CodeGen/X86/avx10_2convert-builtins.c
index efd9a31c40875..e5e6f867e119e 100644
--- a/clang/test/CodeGen/X86/avx10_2convert-builtins.c
+++ b/clang/test/CodeGen/X86/avx10_2convert-builtins.c
@@ -231,7 +231,7 @@ __m256i test_mm256_cvt2ph_bf8(__m256h __A, __m256h __B) {
   return _mm256_cvt2ph_bf8(__A, __B);
 }
 
-__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m256h __B) {
+__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) {
   // CHECK-LABEL: @test_mm256_mask_cvt2ph_bf8(
   // CHECK: call <32 x i8> @llvm.x86.avx10.vcvt2ph2bf8256(
   // CHECK: select <32 x i1> %{{.*}}, <32 x i8> %{{.*}}, <32 x i8> %{{.*}}
@@ -239,7 +239,7 @@ __m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m2
   return _mm256_mask_cvt2ph_bf8(__W, __U, __A, __B);
 }
 

[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/12


[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@phoebewang (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126687


[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126687


[llvm-branch-commits] [llvm] 94c1a8e - [DSE] Don't use initializes on byval argument (#126259)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:12:42-08:00
New Revision: 94c1a8ea1bfea3cbd7191783c85b2cead642f653

URL: https://github.com/llvm/llvm-project/commit/94c1a8ea1bfea3cbd7191783c85b2cead642f653
DIFF: https://github.com/llvm/llvm-project/commit/94c1a8ea1bfea3cbd7191783c85b2cead642f653.diff

LOG: [DSE] Don't use initializes on byval argument (#126259)

There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful: for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.
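
As a loose analogy only, C++ pass-by-value behaves like the second variant:
the callee works on its own copy, so whatever it overwrites there never
reaches the caller's object. The sketch below assumes that analogy holds well
enough to illustrate the point; `Payload`, `overwriteCopy`, and
`callerValueAfterCall` are hypothetical names, and this is not how DSE itself
is implemented.

```cpp
#include <cassert>

struct Payload {
  int Value;
};

// Pass-by-value stands in for `byval`: P is the callee's private copy.
// Overwriting it corresponds to `initializes` on the callee copy.
void overwriteCopy(Payload P) { P.Value = 42; }

int callerValueAfterCall() {
  Payload A{0};      // corresponds to `store i32 0, ptr %a`
  overwriteCopy(A);  // the callee overwrites only its own copy
  return A.Value;    // the caller's memory still holds the stored value
}
```

Because the caller's memory is untouched by the callee, a store made before
the call cannot be considered overwritten on the strength of `initializes`
alone, which is why DSE skips the attribute for `byval` arguments.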

Fixes https://github.com/llvm/llvm-project/issues/126181.

(cherry picked from commit 2d31a12dbe2339d20844ede70cbb54dbaf4ceea9)

Added: 


Modified: 
llvm/docs/LangRef.rst
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll

Removed: 




diff  --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index d004ced9dff14..e002195cb7ed5 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1725,6 +1725,10 @@ Currently, only the following parameter attributes are defined:
 and negative values are allowed in case the argument points partway into
 an allocation. An empty list is not allowed.
 
+On a ``byval`` argument, ``initializes`` refers to the given parts of the
+callee copy being overwritten. A ``byval`` callee can never initialize the
+original caller memory passed to the ``byval`` argument.
+
 ``dead_on_unwind``
 At a high level, this attribute indicates that the pointer argument is dead
 if the call unwinds, in the sense that the caller will not depend on the

diff  --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 13f3de07c3c44..0fdc3354753b1 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -2281,7 +2281,9 @@ DSEState::getInitializesArgMemLoc(const Instruction *I) {
   for (unsigned Idx = 0, Count = CB->arg_size(); Idx < Count; ++Idx) {
 ConstantRangeList Inits;
 Attribute InitializesAttr = CB->getParamAttr(Idx, Attribute::Initializes);
-if (InitializesAttr.isValid())
+// initializes on byval arguments refers to the callee copy, not the
+// original memory the caller passed in.
+if (InitializesAttr.isValid() && !CB->isByValArgument(Idx))
   Inits = InitializesAttr.getValueAsConstantRangeList();
 
 Value *CurArg = CB->getArgOperand(Idx);

diff  --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index e590c5bf4004a..5f8ab56c22754 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -338,3 +338,17 @@ define i16 @global_var_alias() {
   ret i16 %l
 }
 
+declare void @byval_fn(ptr byval(i32) initializes((0, 4)) %am)
+
+define void @test_byval() {
+; CHECK-LABEL: @test_byval(
+; CHECK-NEXT:[[A:%.*]] = alloca i32, align 4
+; CHECK-NEXT:store i32 0, ptr [[A]], align 4
+; CHECK-NEXT:call void @byval_fn(ptr [[A]])
+; CHECK-NEXT:ret void
+;
+  %a = alloca i32
+  store i32 0, ptr %a
+  call void @byval_fn(ptr %a)
+  ret void
+}





[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126493


[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126493

>From 94c1a8ea1bfea3cbd7191783c85b2cead642f653 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Mon, 10 Feb 2025 10:34:03 +0100
Subject: [PATCH] [DSE] Don't use initializes on byval argument (#126259)

There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful: for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.

(cherry picked from commit 2d31a12dbe2339d20844ede70cbb54dbaf4ceea9)
---
 llvm/docs/LangRef.rst  |  4 
 .../lib/Transforms/Scalar/DeadStoreElimination.cpp |  4 +++-
 .../DeadStoreElimination/inter-procedural.ll   | 14 ++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index d004ced9dff14..e002195cb7ed5 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1725,6 +1725,10 @@ Currently, only the following parameter attributes are defined:
 and negative values are allowed in case the argument points partway into
 an allocation. An empty list is not allowed.
 
+On a ``byval`` argument, ``initializes`` refers to the given parts of the
+callee copy being overwritten. A ``byval`` callee can never initialize the
+original caller memory passed to the ``byval`` argument.
+
 ``dead_on_unwind``
 At a high level, this attribute indicates that the pointer argument is dead
 if the call unwinds, in the sense that the caller will not depend on the
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 13f3de07c3c44..0fdc3354753b1 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -2281,7 +2281,9 @@ DSEState::getInitializesArgMemLoc(const Instruction *I) {
   for (unsigned Idx = 0, Count = CB->arg_size(); Idx < Count; ++Idx) {
 ConstantRangeList Inits;
 Attribute InitializesAttr = CB->getParamAttr(Idx, Attribute::Initializes);
-if (InitializesAttr.isValid())
+// initializes on byval arguments refers to the callee copy, not the
+// original memory the caller passed in.
+if (InitializesAttr.isValid() && !CB->isByValArgument(Idx))
   Inits = InitializesAttr.getValueAsConstantRangeList();
 
 Value *CurArg = CB->getArgOperand(Idx);
diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index e590c5bf4004a..5f8ab56c22754 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -338,3 +338,17 @@ define i16 @global_var_alias() {
   ret i16 %l
 }
 
+declare void @byval_fn(ptr byval(i32) initializes((0, 4)) %am)
+
+define void @test_byval() {
+; CHECK-LABEL: @test_byval(
+; CHECK-NEXT:[[A:%.*]] = alloca i32, align 4
+; CHECK-NEXT:store i32 0, ptr [[A]], align 4
+; CHECK-NEXT:call void @byval_fn(ptr [[A]])
+; CHECK-NEXT:ret void
+;
+  %a = alloca i32
+  store i32 0, ptr %a
+  call void @byval_fn(ptr %a)
+  ret void
+}



[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one-
or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126493


[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one-
or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126496


[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)

2025-02-11 Thread via llvm-branch-commits

github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one-
or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126492


[llvm-branch-commits] [llvm] af970cd - [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:14:21-08:00
New Revision: af970cd8753c37e7fcf66b6211f2a2d1e261325c

URL: https://github.com/llvm/llvm-project/commit/af970cd8753c37e7fcf66b6211f2a2d1e261325c
DIFF: https://github.com/llvm/llvm-project/commit/af970cd8753c37e7fcf66b6211f2a2d1e261325c.diff

LOG: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)

The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.

Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).

I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other contexts,
e.g. when deciding whether an expression is safe to expand in a given
block.
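
To make the miscompile concrete, here is a hand-written C++ model of the
`@test` loop from pr126012.ll. It is only a sketch: `runTest` and its
`FoldCmpToTrue` flag are hypothetical names, and the flag merely imitates the
pre-fix output in which `%cmp` was replaced by `true`.

```cpp
#include <cassert>
#include <cstdint>

// Models the loop in @test. The three phi nodes are updated together at the
// back edge, so %indvar2 always holds the *previous* iteration's %indvar3.
int runTest(bool FoldCmpToTrue) {
  std::int32_t Indvar1 = 0, Indvar2 = 1, Indvar3 = 0;
  while (true) {
    std::int32_t Phi = 0;  // value on the edge coming from for.preheader
    if (Indvar3 != 0) {    // for.end is reached only when %cond1 is false
      bool Cmp = FoldCmpToTrue ? true : (Indvar2 > 0);
      Phi = Cmp ? 1 : 0;   // %ext = zext i1 %cmp to i32
    }
    std::int32_t Inc = Indvar3 + 1;
    if (Indvar3 == 2)      // %exitcond
      return Indvar1;
    // Back edge: all phis take their incoming values simultaneously.
    Indvar1 = Phi;
    Indvar2 = Indvar3;
    Indvar3 = Inc;
  }
}
```

In the second iteration `%indvar2` is 0 (the first iteration's `%indvar3`),
so `%cmp` is false there. The faithful model returns 0, while the model with
the comparison folded to `true` returns 1.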

Fixes https://github.com/llvm/llvm-project/issues/126012.

(cherry picked from commit 7aed53eb1982113e825534f0f66d0a0e46e7a5ed)

Added: 


Modified: 
llvm/lib/Analysis/ScalarEvolution.cpp
llvm/test/Transforms/IndVarSimplify/pr126012.ll

Removed: 




diff  --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 2ce40877b523e..c71202c8dd58e 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -12402,6 +12402,12 @@ bool ScalarEvolution::isImpliedViaMerge(CmpPredicate Pred, const SCEV *LHS,
   // iteration of a loop.
   if (!properlyDominates(L, LBB))
 return false;
+  // Addrecs are considered to properly dominate their loop, so are missed
+  // by the previous check. Discard any values that have computable
+  // evolution in this loop.
+  if (auto *Loop = LI.getLoopFor(LBB))
+if (hasComputableLoopEvolution(L, Loop))
+  return false;
   if (!ProvedEasily(L, RHS))
 return false;
 }

diff  --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
index 725ea89b8e651..5189fe020dd3b 100644
--- a/llvm/test/Transforms/IndVarSimplify/pr126012.ll
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -1,18 +1,22 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -S -passes=indvars < %s | FileCheck %s
 
-; FIXME: This is a miscompile.
+; Do not infer that %cmp is true. The %indvar3 input of %indvar2 comes from
+; a previous iteration, so we should not compare it to a value from the current
+; iteration.
 define i32 @test() {
 ; CHECK-LABEL: define i32 @test() {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:br label %[[FOR_PREHEADER:.*]]
 ; CHECK:   [[FOR_PREHEADER]]:
 ; CHECK-NEXT:[[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ]
-; CHECK-NEXT:[[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:[[INDVAR2:%.*]] = phi i32 [ 1, %[[ENTRY]] ], [ [[INDVAR3:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:[[INDVAR3]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
 ; CHECK-NEXT:[[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0
 ; CHECK-NEXT:br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
 ; CHECK:   [[FOR_END]]:
-; CHECK-NEXT:[[EXT:%.*]] = zext i1 true to i32
+; CHECK-NEXT:[[CMP:%.*]] = icmp ugt i32 [[INDVAR2]], 0
+; CHECK-NEXT:[[EXT:%.*]] = zext i1 [[CMP]] to i32
 ; CHECK-NEXT:br label %[[FOR_INC]]
 ; CHECK:   [[FOR_INC]]:
 ; CHECK-NEXT:[[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ]





[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126492


[llvm-branch-commits] [llvm] 9bbf3a9 - [IndVars] Add test for #126012 (NFC)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:14:21-08:00
New Revision: 9bbf3a98b793f8fc6269a20a026ca6fe029a1790

URL: https://github.com/llvm/llvm-project/commit/9bbf3a98b793f8fc6269a20a026ca6fe029a1790
DIFF: https://github.com/llvm/llvm-project/commit/9bbf3a98b793f8fc6269a20a026ca6fe029a1790.diff

LOG: [IndVars] Add test for #126012 (NFC)

(cherry picked from commit ae08969a2068dd327fbf4d0f606550574fbb9e45)

Added: 
llvm/test/Transforms/IndVarSimplify/pr126012.ll

Modified: 


Removed: 




diff  --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
new file mode 100644
index 0..725ea89b8e651
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=indvars < %s | FileCheck %s
+
+; FIXME: This is a miscompile.
+define i32 @test() {
+; CHECK-LABEL: define i32 @test() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:br label %[[FOR_PREHEADER:.*]]
+; CHECK:   [[FOR_PREHEADER]]:
+; CHECK-NEXT:[[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ]
+; CHECK-NEXT:[[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:[[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0
+; CHECK-NEXT:br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
+; CHECK:   [[FOR_END]]:
+; CHECK-NEXT:[[EXT:%.*]] = zext i1 true to i32
+; CHECK-NEXT:br label %[[FOR_INC]]
+; CHECK:   [[FOR_INC]]:
+; CHECK-NEXT:[[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ]
+; CHECK-NEXT:[[INC]] = add nuw nsw i32 [[INDVAR3]], 1
+; CHECK-NEXT:[[EXITCOND:%.*]] = icmp eq i32 [[INDVAR3]], 2
+; CHECK-NEXT:br i1 [[EXITCOND]], label %[[FOR_EXIT:.*]], label %[[FOR_PREHEADER]]
+; CHECK:   [[FOR_EXIT]]:
+; CHECK-NEXT:[[INDVAR1_LCSSA:%.*]] = phi i32 [ [[INDVAR1]], %[[FOR_INC]] ]
+; CHECK-NEXT:ret i32 [[INDVAR1_LCSSA]]
+;
+entry:
+  br label %for.preheader
+
+for.preheader:
+  %indvar1 = phi i32 [ 0, %entry ], [ %phi, %for.inc ]
+  %indvar2 = phi i32 [ 1, %entry ], [ %indvar3, %for.inc ]
+  %indvar3 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
+  %cond1 = icmp eq i32 %indvar3, 0
+  br i1 %cond1, label %for.inc, label %for.end
+
+for.end:
+  %cmp = icmp sgt i32 %indvar2, 0
+  %ext = zext i1 %cmp to i32
+  br label %for.inc
+
+for.inc:
+  %phi = phi i32 [ %ext, %for.end ], [ 0, %for.preheader ]
+  %inc = add i32 %indvar3, 1
+  %exitcond = icmp eq i32 %indvar3, 2
+  br i1 %exitcond, label %for.exit, label %for.preheader
+
+for.exit:
+  ret i32 %indvar1
+}





[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)

2025-02-11 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/126496

>From a89e04e7f0caa28d607e38099b905063b47a88fb Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Mon, 3 Feb 2025 17:37:07 +0100
Subject: [PATCH 1/2] [ValueTracking] Add additional tests for computeKnownBits
 on GEPs (NFC)

These demonstrate miscompiles in the existing code.

(cherry picked from commit 3dc1ef1650c8389a6f195a474781cf2281208bed)
---
 llvm/unittests/Analysis/ValueTrackingTest.cpp | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp
index ee44aac45594d..39865fa195cf7 100644
--- a/llvm/unittests/Analysis/ValueTrackingTest.cpp
+++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp
@@ -2679,6 +2679,41 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsAbsoluteSymbol) {
   EXPECT_EQ(0u, Known_0_256_Align8.countMinTrailingOnes());
 }
 
+TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) {
+  // FIXME: The index should be extended before multiplying with the scale.
+  parseAssembly(R"(
+target datalayout = "p:16:16:16"
+
+define void @test(i16 %arg) {
+  %and = and i16 %arg, u0x8000
+  %base = inttoptr i16 %and to ptr
+  %A = getelementptr i32, ptr %base, i8 80
+  ret void
+}
+)");
+  KnownBits Known = computeKnownBits(A, M->getDataLayout());
+  EXPECT_EQ(~64 & 0x7fff, Known.Zero);
+  EXPECT_EQ(64, Known.One);
+}
+
+TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) {
+  // FIXME: GEP should only affect the index width.
+  parseAssembly(R"(
+target datalayout = "p:16:16:16:8"
+
+define void @test(i16 %arg) {
+  %and = and i16 %arg, u0x8000
+  %or = or i16 %and, u0x00ff
+  %base = inttoptr i16 %or to ptr
+  %A = getelementptr i8, ptr %base, i8 1
+  ret void
+}
+)");
+  KnownBits Known = computeKnownBits(A, M->getDataLayout());
+  EXPECT_EQ(0x7eff, Known.Zero);
+  EXPECT_EQ(0x100, Known.One);
+}
+
 TEST_F(ValueTrackingTest, HaveNoCommonBitsSet) {
   {
 // Check for an inverted mask: (X & ~M) op (Y & M).

>From 5777d5df62a659e165b4df74aefae29ae01d2509 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Tue, 4 Feb 2025 14:29:58 +0100
Subject: [PATCH 2/2] [ValueTracking] Fix bit width handling in
 computeKnownBits() for GEPs (#125532)

For GEPs, we have three bit widths involved: The pointer bit width, the
index bit width, and the bit width of the GEP operands.

The correct behavior here is:
* We need to sextOrTrunc the GEP operand to the index width *before*
multiplying by the scale.
* If the index width and pointer width differ, GEP only ever modifies
the low bits. Adds should not overflow into the high bits.

I'm testing this via unit tests because it's a bit tricky to test in IR
with InstCombine canonicalization getting in the way.
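
The two rules above can be sketched on concrete values, using the numbers
from the unit tests added in patch 1/2 (an `i8` index of 80 scaled by 4, and
a base of 0x80ff with an 8-bit index width). This is plain integer arithmetic
under assumed fixed widths, not the KnownBits logic itself, and all four
function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rule 1, wrong order: multiply in the operand's 8-bit width, then extend.
// 80 * 4 = 320 wraps to 64 before it is widened to 16 bits.
constexpr std::uint16_t scaleThenExtend(std::int8_t Index, int Scale) {
  return static_cast<std::uint16_t>(static_cast<std::int16_t>(
      static_cast<std::int8_t>(static_cast<std::uint8_t>(Index * Scale))));
}

// Rule 1, correct order: sign-extend to the 16-bit index width, then multiply.
constexpr std::uint16_t extendThenScale(std::int8_t Index, int Scale) {
  return static_cast<std::uint16_t>(static_cast<std::int16_t>(Index) * Scale);
}

// Rule 2, wrong: add across the full 16-bit pointer, so a carry out of the
// low 8 bits leaks into the high bits.
constexpr std::uint16_t addFullWidth(std::uint16_t Base, std::uint8_t Offset) {
  return static_cast<std::uint16_t>(Base + Offset);
}

// Rule 2, correct for an 8-bit index width: only the low 8 bits change and
// the high bits of the pointer are left untouched.
constexpr std::uint16_t addInLowBits(std::uint16_t Base, std::uint8_t Offset) {
  const std::uint8_t Low = static_cast<std::uint8_t>((Base & 0xff) + Offset);
  return static_cast<std::uint16_t>((Base & 0xff00) | Low);
}

static_assert(scaleThenExtend(80, 4) == 64, "scale applied too early wraps");
static_assert(extendThenScale(80, 4) == 320, "extend first keeps the offset");
static_assert(addFullWidth(0x80ff, 1) == 0x8100, "carry leaks into high bits");
static_assert(addInLowBits(0x80ff, 1) == 0x8000, "high bits stay unchanged");
```

The values 64 and 0x100 produced by the wrong variants match the expectations
marked FIXME in the tests from patch 1/2.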

(cherry picked from commit 3bd11b502c1846afa5e1257c94b7a70566e34686)
---
 llvm/lib/Analysis/ValueTracking.cpp   | 66 ++-
 llvm/unittests/Analysis/ValueTrackingTest.cpp | 12 ++--
 2 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index b63a0a07f7de2..8a674914641a8 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1445,7 +1445,22 @@ static void computeKnownBitsFromOperator(const Operator *I,
 computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
 // Accumulate the constant indices in a separate variable
 // to minimize the number of calls to computeForAddSub.
-APInt AccConstIndices(BitWidth, 0, /*IsSigned*/ true);
+unsigned IndexWidth = Q.DL.getIndexTypeSizeInBits(I->getType());
+APInt AccConstIndices(IndexWidth, 0);
+
+auto AddIndexToKnown = [&](KnownBits IndexBits) {
+  if (IndexWidth == BitWidth) {
+// Note that inbounds does *not* guarantee nsw for the addition, as only
+// the offset is signed, while the base address is unsigned.
+Known = KnownBits::add(Known, IndexBits);
+  } else {
+// If the index width is smaller than the pointer width, only add the
+// value to the low bits.
+assert(IndexWidth < BitWidth &&
+   "Index width can't be larger than pointer width");
+Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
+  }
+};
 
 gep_type_iterator GTI = gep_type_begin(I);
 for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) {
@@ -1483,43 +1498,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
 break;
   }
 
-  unsigned IndexBitWidth = Index->getType()->getScalarSizeInBits();
-  KnownBits IndexBits(IndexBitWidth);
-  computeKnownBits(Index, IndexBits, Depth + 1, Q);
-  TypeSize IndexTypeSize = GTI.getSequentialElementStride(Q.DL);
-  uint64_t TypeSizeInBytes = IndexTypeSize.getKnownMinValue();
-  KnownBits ScalingFa

[llvm-branch-commits] [llvm] a89e04e - [ValueTracking] Add additional tests for computeKnownBits on GEPs (NFC)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:15:45-08:00
New Revision: a89e04e7f0caa28d607e38099b905063b47a88fb

URL: https://github.com/llvm/llvm-project/commit/a89e04e7f0caa28d607e38099b905063b47a88fb
DIFF: https://github.com/llvm/llvm-project/commit/a89e04e7f0caa28d607e38099b905063b47a88fb.diff

LOG: [ValueTracking] Add additional tests for computeKnownBits on GEPs (NFC)

These demonstrate miscompiles in the existing code.

(cherry picked from commit 3dc1ef1650c8389a6f195a474781cf2281208bed)

Added: 


Modified: 
llvm/unittests/Analysis/ValueTrackingTest.cpp

Removed: 




diff  --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp
index ee44aac45594d..39865fa195cf7 100644
--- a/llvm/unittests/Analysis/ValueTrackingTest.cpp
+++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp
@@ -2679,6 +2679,41 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsAbsoluteSymbol) {
   EXPECT_EQ(0u, Known_0_256_Align8.countMinTrailingOnes());
 }
 
+TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) {
+  // FIXME: The index should be extended before multiplying with the scale.
+  parseAssembly(R"(
+target datalayout = "p:16:16:16"
+
+define void @test(i16 %arg) {
+  %and = and i16 %arg, u0x8000
+  %base = inttoptr i16 %and to ptr
+  %A = getelementptr i32, ptr %base, i8 80
+  ret void
+}
+)");
+  KnownBits Known = computeKnownBits(A, M->getDataLayout());
+  EXPECT_EQ(~64 & 0x7fff, Known.Zero);
+  EXPECT_EQ(64, Known.One);
+}
+
+TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) {
+  // FIXME: GEP should only affect the index width.
+  parseAssembly(R"(
+target datalayout = "p:16:16:16:8"
+
+define void @test(i16 %arg) {
+  %and = and i16 %arg, u0x8000
+  %or = or i16 %and, u0x00ff
+  %base = inttoptr i16 %or to ptr
+  %A = getelementptr i8, ptr %base, i8 1
+  ret void
+}
+)");
+  KnownBits Known = computeKnownBits(A, M->getDataLayout());
+  EXPECT_EQ(0x7eff, Known.Zero);
+  EXPECT_EQ(0x100, Known.One);
+}
+
 TEST_F(ValueTrackingTest, HaveNoCommonBitsSet) {
   {
 // Check for an inverted mask: (X & ~M) op (Y & M).



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 5777d5d - [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2025-02-11T14:15:45-08:00
New Revision: 5777d5df62a659e165b4df74aefae29ae01d2509

URL: 
https://github.com/llvm/llvm-project/commit/5777d5df62a659e165b4df74aefae29ae01d2509
DIFF: 
https://github.com/llvm/llvm-project/commit/5777d5df62a659e165b4df74aefae29ae01d2509.diff

LOG: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532)

For GEPs, we have three bit widths involved: The pointer bit width, the
index bit width, and the bit width of the GEP operands.

The correct behavior here is:
* We need to sextOrTrunc the GEP operand to the index width *before*
multiplying by the scale.
* If the index width and pointer width differ, GEP only ever modifies
the low bits. Adds should not overflow into the high bits.
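
The two rules above can be sketched with plain integers. This is only an illustrative model, not the ValueTracking code: `lowBits`, `sextOrTrunc` and `gepAddress` are hypothetical helpers written for this note, and the numbers are borrowed from the GEP unit tests in this thread (16-bit pointers, an `i8` index operand).

```cpp
#include <cstdint>

// Keep only the low Bits bits of V (0 < Bits < 64).
constexpr uint64_t lowBits(uint64_t V, unsigned Bits) {
  return V & ((uint64_t{1} << Bits) - 1);
}

// Treat the low From bits of V as a signed value and resize it to To bits.
constexpr uint64_t sextOrTrunc(uint64_t V, unsigned From, unsigned To) {
  uint64_t Low = lowBits(V, From);
  bool Negative = ((Low >> (From - 1)) & 1) != 0;
  uint64_t Extended = Negative ? (Low | (~uint64_t{0} << From)) : Low;
  return lowBits(Extended, To);
}

// Hypothetical model of a getelementptr with a single index operand.
constexpr uint64_t gepAddress(uint64_t Base, uint64_t Index,
                              unsigned OperandWidth, uint64_t Stride,
                              unsigned IndexWidth, unsigned PtrWidth) {
  // Rule 1: resize the operand to the index width first, then scale.
  uint64_t Offset = lowBits(
      sextOrTrunc(Index, OperandWidth, IndexWidth) * Stride, IndexWidth);
  // Rule 2: the addition wraps inside the low IndexWidth bits ...
  uint64_t Low = lowBits(lowBits(Base, IndexWidth) + Offset, IndexWidth);
  // ... and the remaining high bits of the base pass through unchanged.
  uint64_t High = lowBits(Base, PtrWidth) & ~lowBits(~uint64_t{0}, IndexWidth);
  return High | Low;
}

// getelementptr i32, ptr null, i8 80 with a 16-bit index width: 80 * 4 = 320.
static_assert(gepAddress(0x0000, 80, 8, 4, 16, 16) == 0x140, "extend, then scale");
// Scaling at the operand width instead would wrap to 64.
static_assert(lowBits(80 * 4, 8) == 64, "scale at i8 width wraps");
// 8-bit index width on a 16-bit pointer: 0xff + 1 wraps, bit 15 is untouched.
static_assert(gepAddress(0x80ff, 1, 8, 1, 8, 16) == 0x8000, "no carry into high bits");
```

The second assertion shows the value that multiplying before extending would produce, which is the behavior the earlier FIXME test pinned down.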

I'm testing this via unit tests because it's a bit tricky to test in IR
with InstCombine canonicalization getting in the way.

(cherry picked from commit 3bd11b502c1846afa5e1257c94b7a70566e34686)

Added: 


Modified: 
llvm/lib/Analysis/ValueTracking.cpp
llvm/unittests/Analysis/ValueTrackingTest.cpp

Removed: 




diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index b63a0a07f7de2..8a674914641a8 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1445,7 +1445,22 @@ static void computeKnownBitsFromOperator(const Operator *I,
 computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
 // Accumulate the constant indices in a separate variable
 // to minimize the number of calls to computeForAddSub.
-APInt AccConstIndices(BitWidth, 0, /*IsSigned*/ true);
+unsigned IndexWidth = Q.DL.getIndexTypeSizeInBits(I->getType());
+APInt AccConstIndices(IndexWidth, 0);
+
+auto AddIndexToKnown = [&](KnownBits IndexBits) {
+  if (IndexWidth == BitWidth) {
+// Note that inbounds does *not* guarantee nsw for the addition, as only
+// the offset is signed, while the base address is unsigned.
+Known = KnownBits::add(Known, IndexBits);
+  } else {
+// If the index width is smaller than the pointer width, only add the
+// value to the low bits.
+assert(IndexWidth < BitWidth &&
+   "Index width can't be larger than pointer width");
+Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
+  }
+};
 
 gep_type_iterator GTI = gep_type_begin(I);
 for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) {
@@ -1483,43 +1498,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
 break;
   }
 
-  unsigned IndexBitWidth = Index->getType()->getScalarSizeInBits();
-  KnownBits IndexBits(IndexBitWidth);
-  computeKnownBits(Index, IndexBits, Depth + 1, Q);
-  TypeSize IndexTypeSize = GTI.getSequentialElementStride(Q.DL);
-  uint64_t TypeSizeInBytes = IndexTypeSize.getKnownMinValue();
-  KnownBits ScalingFactor(IndexBitWidth);
+  TypeSize Stride = GTI.getSequentialElementStride(Q.DL);
+  uint64_t StrideInBytes = Stride.getKnownMinValue();
+  if (!Stride.isScalable()) {
+// Fast path for constant offset.
+if (auto *CI = dyn_cast<ConstantInt>(Index)) {
+  AccConstIndices +=
+  CI->getValue().sextOrTrunc(IndexWidth) * StrideInBytes;
+  continue;
+}
+  }
+
+  KnownBits IndexBits =
+  computeKnownBits(Index, Depth + 1, Q).sextOrTrunc(IndexWidth);
+  KnownBits ScalingFactor(IndexWidth);
   // Multiply by current sizeof type.
   // &A[i] == A + i * sizeof(*A[i]).
-  if (IndexTypeSize.isScalable()) {
+  if (Stride.isScalable()) {
 // For scalable types the only thing we know about sizeof is
 // that this is a multiple of the minimum size.
-ScalingFactor.Zero.setLowBits(llvm::countr_zero(TypeSizeInBytes));
-  } else if (IndexBits.isConstant()) {
-APInt IndexConst = IndexBits.getConstant();
-APInt ScalingFactor(IndexBitWidth, TypeSizeInBytes);
-IndexConst *= ScalingFactor;
-AccConstIndices += IndexConst.sextOrTrunc(BitWidth);
-continue;
+ScalingFactor.Zero.setLowBits(llvm::countr_zero(StrideInBytes));
   } else {
 ScalingFactor =
-KnownBits::makeConstant(APInt(IndexBitWidth, TypeSizeInBytes));
+KnownBits::makeConstant(APInt(IndexWidth, StrideInBytes));
   }
-  IndexBits = KnownBits::mul(IndexBits, ScalingFactor);
-
-  // If the offsets have a different width from the pointer, according
-  // to the language reference we need to sign-extend or truncate them
-  // to the width of the pointer.
-  IndexBits = IndexBits.sextOrTrunc(BitWidth);
-
-  // Note that inbounds does *not* guarantee nsw for the addition, as only
-  // the offset is signed, while the base address is unsigned.
-  Known = KnownBits:

[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)

2025-02-11 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/126496


[llvm-branch-commits] [mlir] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR (PR #125836)

2025-02-11 Thread Jakub Kuderski via llvm-branch-commits

https://github.com/kuhar commented:

Since this essentially breaks logic for gfx940 and gfx941, should we assert in 
code like `Chipset` that these are not used, so that they are not silently 
miscompiled?
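
One possible shape for such a check, as a rough sketch only: `ChipsetSketch`, `isRemovedGfx94x` and `checkChipsetSupported` are hypothetical names made up for this note and are not MLIR's actual `Chipset` API. The concern it illustrates is that gfx940 (9.4.0) and gfx941 (9.4.1) compare lower than gfx942, so a `chipset >= kGfx942` test would quietly route them down the older code path.

```cpp
#include <cstdio>

// Hypothetical stand-in for a chipset version triple; not MLIR's real class.
struct ChipsetSketch {
  unsigned majorVersion;
  unsigned minorVersion;
  unsigned steppingVersion;
};

// True for the two removed targets, gfx940 (9.4.0) and gfx941 (9.4.1).
constexpr bool isRemovedGfx94x(ChipsetSketch C) {
  return C.majorVersion == 9 && C.minorVersion == 4 && C.steppingVersion < 2;
}

// Report removed targets explicitly instead of letting them fall through
// version comparisons unnoticed.
inline bool checkChipsetSupported(ChipsetSketch C) {
  if (isRemovedGfx94x(C)) {
    std::fprintf(stderr, "gfx94%x is no longer supported\n",
                 C.steppingVersion);
    return false;
  }
  return true;
}

static_assert(isRemovedGfx94x(ChipsetSketch{9, 4, 0}), "gfx940 is rejected");
static_assert(!isRemovedGfx94x(ChipsetSketch{9, 4, 2}), "gfx942 is accepted");
```

Whether this belongs in an assert, a parser diagnostic or a pass-level error is a separate design question that the sketch does not try to answer.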

https://github.com/llvm/llvm-project/pull/125836


[llvm-branch-commits] [mlir] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR (PR #125836)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/125836

>From ebef8a82c9265ecea31795d726af402a96b89430 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Wed, 5 Feb 2025 05:50:12 -0500
Subject: [PATCH 1/2] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in
 MLIR

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

For SWDEV-512631
---
 mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td |  2 +-
 mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td  |  8 +++
 .../AMDGPUToROCDL/AMDGPUToROCDL.cpp   | 22 +--
 .../ArithToAMDGPU/ArithToAMDGPU.cpp   |  2 +-
 .../AMDGPU/Transforms/EmulateAtomics.cpp  |  8 +--
 .../AMDGPUToROCDL/8-bit-floats.mlir   |  2 +-
 mlir/test/Conversion/AMDGPUToROCDL/mfma.mlir  |  2 +-
 .../ArithToAMDGPU/8-bit-float-saturation.mlir |  2 +-
 .../ArithToAMDGPU/8-bit-floats.mlir   |  2 +-
 .../Dialect/AMDGPU/AMDGPUUtilsTest.cpp| 20 +++--
 10 files changed, 30 insertions(+), 40 deletions(-)

diff --git a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
index 69745addfd748ec..24f541587cba88a 100644
--- a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+++ b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
@@ -602,7 +602,7 @@ def AMDGPU_MFMAOp :
 order (that is, v[0] will go to arg[7:0], v[1] to arg[15:8] and so on).
 
 The negateA, negateB, and negateC flags are only supported for double-precision
-operations on gfx940+.
+operations on gfx942+.
   }];
   let assemblyFormat = [{
 $sourceA `*` $sourceB `+` $destC
diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
index 7efa4ffa2aa6fe0..77401bd6de4bd56 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
@@ -348,11 +348,11 @@ def ROCDL_mfma_f32_16x16x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x4bf16.1k">
 def ROCDL_mfma_f32_4x4x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.4x4x4bf16.1k">;
 def ROCDL_mfma_f32_32x32x8bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x8bf16.1k">;
 def ROCDL_mfma_f32_16x16x16bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x16bf16.1k">;
-// Note: in gfx940, unlike in gfx90a, the f64 xdlops use the "blgp" argument as a
-// NEG bitfield. See IntrinsicsAMDGPU.td for more info.
+// Note: in gfx942, unlike in gfx90a, the f64 xdlops use the "blgp" argument as
+// a NEG bitfield. See IntrinsicsAMDGPU.td for more info.
 def ROCDL_mfma_f64_16x16x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.16x16x4f64">;
 def ROCDL_mfma_f64_4x4x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.4x4x4f64">;
-// New in gfx940.
+// New in gfx942.
 def ROCDL_mfma_i32_16x16x32_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.16x16x32.i8">;
 def ROCDL_mfma_i32_32x32x16_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.32x32x16.i8">;
 def ROCDL_mfma_f32_16x16x8_xf32 : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x8.xf32">;
@@ -375,7 +375,7 @@ def ROCDL_mfma_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.f16">;
 def ROCDL_mfma_scale_f32_16x16x128_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.16x16x128.f8f6f4", [0,1]>;
 def ROCDL_mfma_scale_f32_32x32x64_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.32x32x64.f8f6f4", [0,1]>;
 
-// 2:4 Sparsity ops (GFX940)
+// 2:4 Sparsity ops (GFX942)
 def ROCDL_smfmac_f32_16x16x32_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.f16">;
 def ROCDL_smfmac_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.32x32x16.f16">;
 def ROCDL_smfmac_f32_16x16x32_bf16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.bf16">;
diff --git a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
index 9fb51f0bc1f1ea7..18fd0fc3f038139 100644
--- a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+++ b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
@@ -80,7 +80,7 @@ namespace {
 // Define commonly used chipsets versions for convenience.
 constexpr Chipset kGfx908 = Chipset(9, 0, 8);
 constexpr Chipset kGfx90a = Chipset(9, 0, 0xa);
-constexpr Chipset kGfx940 = Chipset(9, 4, 0);
+constexpr Chipset kGfx942 = Chipset(9, 4, 2);
 
 /// Define lowering patterns for raw buffer ops
 template 
@@ -483,7 +483,7 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma,
 destElem = destType.getElementType();
 
   if (sourceElem.isF32() && destElem.isF32()) {
-if (mfma.getReducePrecision() && chipset >= kGfx940) {
+if (mfma.getReducePrecision() && chipset >= kGfx942) {
   if (m == 32 && n == 32 && k == 4 && b == 1)
 return ROCDL::mfma_f32_32x32x4_xf32::getOperationName();
   if (m == 16 && n == 16 && k == 8 && b == 1)
@@ -551,9 +551,9 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma,
   return ROCDL::mfma_i32_32x32x8i8::getOperationName();
 if (m == 16 && n == 16 && k == 16 && b == 1)
   return ROCDL::mfma_i32_16x16x16i8::getOperationName();
-if (m == 32 && n == 32 && k == 16 && b == 1 && chipset >= k

[llvm-branch-commits] [flang] [AMDGPU] Add missing gfx architectures to AddFlangOffloadRuntime.cmake (PR #125827)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/125827

>From 175aff53a41aebabdadfd93296d8b8bc22683197 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Wed, 5 Feb 2025 04:45:26 -0500
Subject: [PATCH] [AMDGPU] Add missing gfx architectures to
 AddFlangOffloadRuntime.cmake

---
 flang/cmake/modules/AddFlangOffloadRuntime.cmake | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/flang/cmake/modules/AddFlangOffloadRuntime.cmake b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
index f1f6eb57c5d6cf3..eb0e964559ed566 100644
--- a/flang/cmake/modules/AddFlangOffloadRuntime.cmake
+++ b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
@@ -98,10 +98,10 @@ macro(enable_omp_offload_compilation files)
 
   set(all_amdgpu_architectures
 "gfx700;gfx701;gfx801;gfx803;gfx900;gfx902;gfx906"
-"gfx908;gfx90a;gfx90c;gfx942;gfx1010;gfx1030"
+"gfx908;gfx90a;gfx90c;gfx942;gfx950;gfx1010;gfx1030"
 "gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036"
 "gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151"
-"gfx1152;gfx1153"
+"gfx1152;gfx1153;gfx1200;gfx1201"
 )
   set(all_nvptx_architectures
 "sm_35;sm_37;sm_50;sm_52;sm_53;sm_60;sm_61;sm_62"



[llvm-branch-commits] [flang] [libc] [libclc] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in offload and libclc (PR #125826)

2025-02-11 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/125826

>From f2a70b996f1e7b0c71af2e77f5fb6d11266ec82e Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Wed, 5 Feb 2025 04:19:00 -0500
Subject: [PATCH] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in offload and
 libclc

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

For SWDEV-512631 and SWDEV-512633
---
 flang/cmake/modules/AddFlangOffloadRuntime.cmake | 2 +-
 libc/docs/gpu/using.rst  | 2 +-
 libclc/CMakeLists.txt| 2 +-
 offload/plugins-nextgen/amdgpu/src/rtl.cpp   | 6 --
 offload/test/lit.cfg | 4 +---
 5 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/flang/cmake/modules/AddFlangOffloadRuntime.cmake b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
index 8e4f47d18535dcb..f1f6eb57c5d6cf3 100644
--- a/flang/cmake/modules/AddFlangOffloadRuntime.cmake
+++ b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
@@ -98,7 +98,7 @@ macro(enable_omp_offload_compilation files)
 
   set(all_amdgpu_architectures
 "gfx700;gfx701;gfx801;gfx803;gfx900;gfx902;gfx906"
-"gfx908;gfx90a;gfx90c;gfx940;gfx1010;gfx1030"
+"gfx908;gfx90a;gfx90c;gfx942;gfx1010;gfx1030"
 "gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036"
 "gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151"
 "gfx1152;gfx1153"
diff --git a/libc/docs/gpu/using.rst b/libc/docs/gpu/using.rst
index 1c1f9c9bfb0c696..f17f6287be31349 100644
--- a/libc/docs/gpu/using.rst
+++ b/libc/docs/gpu/using.rst
@@ -44,7 +44,7 @@ this shouldn't be necessary.
 
   $> clang openmp.c -fopenmp --offload-arch=gfx90a -Xoffload-linker -lc
   $> clang cuda.cu --offload-arch=sm_80 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
-  $> clang hip.hip --offload-arch=gfx940 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
+  $> clang hip.hip --offload-arch=gfx942 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
 
 This will automatically link in the needed function definitions if they were
 required by the user's application. Normally using the ``-fgpu-rdc`` option
diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index b28da904ef68e15..a5f9c47f099080f 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -211,7 +211,7 @@ set( cayman_aliases aruba )
 set( tahiti_aliases pitcairn verde oland hainan bonaire kabini kaveri hawaii
   mullins tonga tongapro iceland carrizo fiji stoney polaris10 polaris11
   gfx602 gfx705 gfx805
-  gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90a gfx90c gfx940 gfx941 gfx942
+  gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90a gfx90c gfx942
   gfx1010 gfx1011 gfx1012 gfx1013
   gfx1030 gfx1031 gfx1032 gfx1033 gfx1034 gfx1035 gfx1036
   gfx1100 gfx1101 gfx1102 gfx1103
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index 92184ba796dbd83..e83d38a14f77f67 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -2854,12 +2854,6 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
   Error checkIfAPU() {
 // TODO: replace with ROCr API once it becomes available.
 llvm::StringRef StrGfxName(ComputeUnitKind);
-IsAPU = llvm::StringSwitch<bool>(StrGfxName)
-.Case("gfx940", true)
-.Default(false);
-if (IsAPU)
-  return Plugin::success();
-
 bool MayBeAPU = llvm::StringSwitch<bool>(StrGfxName)
 .Case("gfx942", true)
 .Default(false);
diff --git a/offload/test/lit.cfg b/offload/test/lit.cfg
index 658ae5f9653ba90..fe28418d9c1b1a3 100644
--- a/offload/test/lit.cfg
+++ b/offload/test/lit.cfg
@@ -132,12 +132,10 @@ elif config.libomptarget_current_target.startswith('amdgcn'):
 # amdgpu_test_arch contains a list of AMD GPUs in the system
 # only check the first one assuming that we will run the test on it.
 if not (config.amdgpu_test_arch.startswith("gfx90a") or
-config.amdgpu_test_arch.startswith("gfx940") or
 config.amdgpu_test_arch.startswith("gfx942")):
supports_unified_shared_memory = False
 # check if AMD architecture is an APU:
-if (config.amdgpu_test_arch.startswith("gfx940") or
-(config.amdgpu_test_arch.startswith("gfx942") and
+if ((config.amdgpu_test_arch.startswith("gfx942") and
  evaluate_bool_env(config.environment['IS_APU']))):
supports_apu = True
 if supports_unified_shared_memory:



[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread kadir çetinkaya via llvm-branch-commits

kadircet wrote:

What about landing this one and cherry-picking it into the 20 release? Then we 
can land #126689 and revert this one, giving us the next release cycle to vet 
the underlying change.

https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread kadir çetinkaya via llvm-branch-commits

https://github.com/kadircet approved this pull request.


https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread kadir çetinkaya via llvm-branch-commits

https://github.com/kadircet closed 
https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread kadir çetinkaya via llvm-branch-commits

kadircet wrote:

going to land this one, since review of #126689 can take longer and this should 
enable us to unblock our releases. I am happy to revert afterwards.

https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [clang] c7cfe02 - Revert "[clang][HeuristicResolver] Additional hardening against an infinite l…"

2025-02-11 Thread via llvm-branch-commits

Author: kadir çetinkaya
Date: 2025-02-11T09:13:27+01:00
New Revision: c7cfe02fd5be25ff65b42bd21cb16fa27933af64

URL: 
https://github.com/llvm/llvm-project/commit/c7cfe02fd5be25ff65b42bd21cb16fa27933af64
DIFF: 
https://github.com/llvm/llvm-project/commit/c7cfe02fd5be25ff65b42bd21cb16fa27933af64.diff

LOG: Revert "[clang][HeuristicResolver] Additional hardening against an infinite l…"

This reverts commit 780894689ff741c761457eec1c925679309336a3.

Added: 


Modified: 
clang/lib/Sema/HeuristicResolver.cpp

Removed: 




diff --git a/clang/lib/Sema/HeuristicResolver.cpp b/clang/lib/Sema/HeuristicResolver.cpp
index adce403412f689..3cbf33dcdced38 100644
--- a/clang/lib/Sema/HeuristicResolver.cpp
+++ b/clang/lib/Sema/HeuristicResolver.cpp
@@ -258,11 +258,7 @@ QualType HeuristicResolverImpl::simplifyType(QualType Type, const Expr *E,
 }
 return T;
   };
-  // As an additional protection against infinite loops, bound the number of
-  // simplification steps.
-  size_t StepCount = 0;
-  const size_t MaxSteps = 64;
-  while (!Current.Type.isNull() && StepCount++ < MaxSteps) {
+  while (!Current.Type.isNull()) {
 TypeExprPair New = SimplifyOneStep(Current);
 if (New.Type == Current.Type)
   break;





[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread kadir çetinkaya via llvm-branch-commits

kadircet wrote:

oops I wasn't looking at the target branch :( let me cherry-pick this into main

https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)

2025-02-11 Thread Nathan Ridge via llvm-branch-commits

HighCommander4 wrote:

> and cherry picking it into the 20 release

The regressing change, which introduced the `simplifyType()` function, isn't 
present on the llvm 20 branch (it landed a bit after the branch cut). So there 
should not be a need to cherry-pick the fix onto the llvm 20 branch either.



https://github.com/llvm/llvm-project/pull/126690


[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)

2025-02-11 Thread James Henderson via llvm-branch-commits

https://github.com/jh7370 approved this pull request.

LGTM, with @MaskRay's suggestion.

https://github.com/llvm/llvm-project/pull/126607

