[llvm-branch-commits] [clang] [llvm] [mlir] [MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (PR #124746)
@@ -9099,9 +9099,10 @@ void CGOpenMPRuntime::emitUserDefinedMapper(const OMPDeclareMapperDecl *D,
   CGM.getCXXABI().getMangleContext().mangleCanonicalTypeName(Ty, Out);
   std::string Name = getName({"omp_mapper", TyStr, D->getName()});
-  auto *NewFn = OMPBuilder.emitUserDefinedMapper(PrivatizeAndGenMapInfoCB,
-                                                 ElemTy, Name, CustomMapperCB);
-  UDMMap.try_emplace(D, NewFn);
+  llvm::Expected NewFn = OMPBuilder.emitUserDefinedMapper(
+      PrivatizeAndGenMapInfoCB, ElemTy, Name, CustomMapperCB);
+  assert(NewFn && "Unexpected error in emitUserDefinedMapper");

skatrak wrote:

Wrap the call to `OMPBuilder.emitUserDefinedMapper` with `llvm::cantFail()` instead, which would consume and discard the error. Doing it with an assert has the problem that it will always crash if assertions are off, regardless of errors.

In clang, the error handling process involves early process exit during execution of the callback, which allows us to just assume there are no errors here. In MLIR to LLVM IR translation, however, we have to forward or handle errors so we can exit gracefully.

https://github.com/llvm/llvm-project/pull/124746
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
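The review point above can be illustrated with a small sketch. It does not use LLVM at all: `CheckedResult`, `emitMapperSketch` and `cantFailSketch` are hypothetical stand-ins invented for this example, loosely modelled on the idea of a result type that wants to be inspected before it is destroyed. It only shows why a check placed solely inside `assert` disappears when `NDEBUG` is defined, whereas a wrapper function always performs the check.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a "must be checked" result type.
// This is NOT llvm::Expected; it only records whether it was inspected.
template <typename T> class CheckedResult {
public:
  CheckedResult(T V) : Value(std::move(V)) {}

  static CheckedResult failure(std::string Msg) {
    CheckedResult R;
    R.Error = std::move(Msg);
    return R;
  }

  // Converting to bool is what counts as "checking" the result.
  explicit operator bool() {
    Checked = true;
    return Value.has_value();
  }

  T &get() { return *Value; }
  const std::string &message() const { return Error; }
  bool wasChecked() const { return Checked; }

private:
  CheckedResult() = default;
  std::optional<T> Value;
  std::string Error;
  bool Checked = false;
};

// Hypothetical producer; stands in for a builder call that may fail.
CheckedResult<int> emitMapperSketch(bool Fail) {
  if (Fail)
    return CheckedResult<int>::failure("mapper emission failed");
  return CheckedResult<int>(42);
}

// Rough analogue of a "cannot fail" wrapper: the check runs in every build
// mode, and a failure terminates the process instead of being ignored.
template <typename T> T cantFailSketch(CheckedResult<T> R) {
  if (!R)
    std::abort();
  return std::move(R.get());
}

// The pattern under review: the only check lives inside assert(). With
// NDEBUG defined the expression is never evaluated, so the result is left
// unchecked and this function returns false.
bool checkedViaAssertOnly() {
  CheckedResult<int> R = emitMapperSketch(false);
  assert(R && "Unexpected error in emitMapperSketch");
  return R.wasChecked();
}
```

With assertions enabled `checkedViaAssertOnly()` reports that the result was checked; in a build with `NDEBUG` it does not, which is the build-mode dependence the review comment wants to avoid.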
[llvm-branch-commits] [flang] [mlir] release/20.x: Fixes for flang/mlir dependencies (PR #125837)
https://github.com/nikic updated https://github.com/llvm/llvm-project/pull/125837

>From 88f8956711f7c8d306d08fff8603d6b99e8302c1 Mon Sep 17 00:00:00 2001
From: Nikita Popov
Date: Tue, 4 Feb 2025 16:37:21 +0100
Subject: [PATCH 1/3] [mlir] Fix MLIRTestDialect dependency in MLIRTestIR

This is a test library which is not part of libMLIR, so it should use
normal LINK_LIBS instead of mlir_target_link_libraries.

This fixes an issue introduced in #123910 and follows up on the fix in
#125004, which added the library to DEPENDS, which is not sufficient.
---
 mlir/test/lib/IR/CMakeLists.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mlir/test/lib/IR/CMakeLists.txt b/mlir/test/lib/IR/CMakeLists.txt
index e5416da70d50080..71a96c7f92c0c7d 100644
--- a/mlir/test/lib/IR/CMakeLists.txt
+++ b/mlir/test/lib/IR/CMakeLists.txt
@@ -27,13 +27,15 @@ add_mlir_library(MLIRTestIR
   TestVisitorsGeneric.cpp
   EXCLUDE_FROM_LIBMLIR
+
+  LINK_LIBS PUBLIC
+  MLIRTestDialect
   )
 mlir_target_link_libraries(MLIRTestIR PUBLIC
   MLIRPass
   MLIRBytecodeReader
   MLIRBytecodeWriter
   MLIRFunctionInterfaces
-  MLIRTestDialect
   )
 target_include_directories(MLIRTestIR

>From dfa60a77e0bae875ea30340067bebea1c70b9d3d Mon Sep 17 00:00:00 2001
From: Nikita Popov
Date: Wed, 5 Feb 2025 09:48:23 +0100
Subject: [PATCH 2/3] [flang] Move FIRSupport dependency to correct place (#125697)

This library is provided by flang, not MLIR, so it should not be part of
MLIR_LIBS.

Fixes an issue introduced in
https://github.com/llvm/llvm-project/pull/120966.

(cherry picked from commit ee76bdac192ce86c5d13e4c712e0327aaefda45f)
---
 flang/lib/Optimizer/Analysis/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Optimizer/Analysis/CMakeLists.txt b/flang/lib/Optimizer/Analysis/CMakeLists.txt
index 6fe9c70f83765f1..c4dae898f8e5722 100644
--- a/flang/lib/Optimizer/Analysis/CMakeLists.txt
+++ b/flang/lib/Optimizer/Analysis/CMakeLists.txt
@@ -12,6 +12,7 @@ add_flang_library(FIRAnalysis
   LINK_LIBS
   FIRBuilder
   FIRDialect
+  FIRSupport
   HLFIRDialect
   MLIR_LIBS
@@ -19,5 +20,4 @@ add_flang_library(FIRAnalysis
   MLIRLLVMDialect
   MLIRMathTransforms
   MLIROpenMPDialect
-  FIRSupport
 )

>From 4c4ed5e2f5357d724e4c26d21ee3e840210b917f Mon Sep 17 00:00:00 2001
From: Nikita Popov
Date: Wed, 5 Feb 2025 11:58:44 +0100
Subject: [PATCH 3/3] [flang][cmake] Fix bcc dependencies (#125822)

The Fortran libraries are not part of MLIR, so they should use
target_link_libraries() rather than mlir_target_link_libraries().

This fixes an issue introduced in
https://github.com/llvm/llvm-project/pull/120966.

(cherry picked from commit f9af5c145f40480d46874b643ca2b1237e9fbb2a)
---
 flang/tools/bbc/CMakeLists.txt | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/flang/tools/bbc/CMakeLists.txt b/flang/tools/bbc/CMakeLists.txt
index 85aeb85e0c53093..97462be83ea4389 100644
--- a/flang/tools/bbc/CMakeLists.txt
+++ b/flang/tools/bbc/CMakeLists.txt
@@ -29,6 +29,11 @@ target_link_libraries(bbc PRIVATE
   flangFrontend
   flangPasses
   FlangOpenMPTransforms
+  FortranCommon
+  FortranParser
+  FortranEvaluate
+  FortranSemantics
+  FortranLower
 )
 mlir_target_link_libraries(bbc PRIVATE
@@ -36,9 +41,4 @@ mlir_target_link_libraries(bbc PRIVATE
   ${extension_libs}
   MLIRAffineToStandard
   MLIRSCFToControlFlow
-  FortranCommon
-  FortranParser
-  FortranEvaluate
-  FortranSemantics
-  FortranLower
 )
[llvm-branch-commits] [mlir] 88f8956 - [mlir] Fix MLIRTestDialect dependency in MLIRTestIR
Author: Nikita Popov
Date: 2025-02-11T14:59:41+01:00
New Revision: 88f8956711f7c8d306d08fff8603d6b99e8302c1

URL: https://github.com/llvm/llvm-project/commit/88f8956711f7c8d306d08fff8603d6b99e8302c1
DIFF: https://github.com/llvm/llvm-project/commit/88f8956711f7c8d306d08fff8603d6b99e8302c1.diff

LOG: [mlir] Fix MLIRTestDialect dependency in MLIRTestIR

This is a test library which is not part of libMLIR, so it should use
normal LINK_LIBS instead of mlir_target_link_libraries.

This fixes an issue introduced in #123910 and follows up on the fix in
#125004, which added the library to DEPENDS, which is not sufficient.

Added:

Modified:
    mlir/test/lib/IR/CMakeLists.txt

Removed:

diff --git a/mlir/test/lib/IR/CMakeLists.txt b/mlir/test/lib/IR/CMakeLists.txt
index e5416da70d500..71a96c7f92c0c 100644
--- a/mlir/test/lib/IR/CMakeLists.txt
+++ b/mlir/test/lib/IR/CMakeLists.txt
@@ -27,13 +27,15 @@ add_mlir_library(MLIRTestIR
   TestVisitorsGeneric.cpp
   EXCLUDE_FROM_LIBMLIR
+
+  LINK_LIBS PUBLIC
+  MLIRTestDialect
   )
 mlir_target_link_libraries(MLIRTestIR PUBLIC
   MLIRPass
   MLIRBytecodeReader
   MLIRBytecodeWriter
   MLIRFunctionInterfaces
-  MLIRTestDialect
   )
 target_include_directories(MLIRTestIR
[llvm-branch-commits] [clang] [libc] release/20.x: [Clang] Fix test after new argument was added (PR #125912)
jhuber6 wrote:

> @jhuber6 Can you take a look at these test failures.

Looks green now.

https://github.com/llvm/llvm-project/pull/125912
[llvm-branch-commits] [clang] [llvm] [HLSL][RootSignature] Implement Parsing of Descriptor Tables (PR #122982)
inbelic wrote:

Rebasing onto the lexer PR's API changes, which use `ConsumeToken`/`PeekNextToken` instead of pre-allocating the tokens.

https://github.com/llvm/llvm-project/pull/122982
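As a rough illustration of the difference between pre-allocating all tokens and pulling them on demand, here is a minimal sketch of a lazy lexer with a one-token lookahead. Only the method names `ConsumeToken` and `PeekNextToken` come from the comment above; the `Token` struct, the `LazyLexer` class and the tokenization rules are assumptions made up for this example and are not the API of the PR.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical token type, invented for this sketch.
struct Token {
  enum Kind { Identifier, Number, Punct, End };
  Kind K;
  std::string Text;
};

// Hypothetical pull-based lexer: tokens are produced only when requested,
// so no token vector is built up front. The source buffer must outlive it.
class LazyLexer {
public:
  explicit LazyLexer(std::string_view Src) : Buffer(Src) {}

  // Returns the next token and advances past it.
  Token ConsumeToken() {
    if (Lookahead) {
      Token T = std::move(*Lookahead);
      Lookahead.reset();
      return T;
    }
    return lexOne();
  }

  // Returns the next token without advancing; it is lexed once and cached.
  const Token &PeekNextToken() {
    if (!Lookahead)
      Lookahead = lexOne();
    return *Lookahead;
  }

private:
  static bool isSpace(char C) { return std::isspace(static_cast<unsigned char>(C)) != 0; }
  static bool isAlpha(char C) { return std::isalpha(static_cast<unsigned char>(C)) != 0 || C == '_'; }
  static bool isDigit(char C) { return std::isdigit(static_cast<unsigned char>(C)) != 0; }

  Token lexOne() {
    while (Pos < Buffer.size() && isSpace(Buffer[Pos]))
      ++Pos;
    if (Pos >= Buffer.size())
      return Token{Token::End, ""};

    std::size_t Start = Pos;
    if (isAlpha(Buffer[Pos])) {
      while (Pos < Buffer.size() && (isAlpha(Buffer[Pos]) || isDigit(Buffer[Pos])))
        ++Pos;
      return Token{Token::Identifier, std::string(Buffer.substr(Start, Pos - Start))};
    }
    if (isDigit(Buffer[Pos])) {
      while (Pos < Buffer.size() && isDigit(Buffer[Pos]))
        ++Pos;
      return Token{Token::Number, std::string(Buffer.substr(Start, Pos - Start))};
    }
    ++Pos; // Any other character becomes a single-character punctuation token.
    return Token{Token::Punct, std::string(1, Buffer[Start])};
  }

  std::string_view Buffer;
  std::size_t Pos = 0;
  std::optional<Token> Lookahead;
};
```

A parser built on such an interface would call `PeekNextToken` to decide which rule applies and `ConsumeToken` to commit, keeping at most one token buffered at a time.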
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
https://github.com/ritter-x2a created https://github.com/llvm/llvm-project/pull/126762

gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be removed without changes in the llvm directory. The target-invalid-cpu-note/amdgcn.c test is not included here since it tests a list of targets that is defined in llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

>From 3a165b2b1d718382d9ce2bb62679949684bc541c Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Tue, 11 Feb 2025 08:52:55 -0500
Subject: [PATCH] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631
---
 clang/include/clang/Basic/Cuda.h | 2 -
 clang/lib/Basic/Cuda.cpp | 2 -
 clang/lib/Basic/Targets/NVPTX.cpp | 2 -
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp | 2 -
 clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu | 2 +-
 clang/test/CodeGenOpenCL/amdgpu-features.cl | 4 -
 .../test/CodeGenOpenCL/builtins-amdgcn-fp8.cl | 2 +-
 ...cn-gfx940.cl => builtins-amdgcn-gfx942.cl} | 2 +-
 .../builtins-amdgcn-gfx950-err.cl | 2 +-
 .../builtins-amdgcn-gws-insts.cl | 2 +-
 .../CodeGenOpenCL/builtins-amdgcn-mfma.cl | 110 +-
 ...fx940.cl => builtins-fp-atomics-gfx942.cl} | 34 +++---
 clang/test/Driver/amdgpu-macros.cl | 2 -
 clang/test/Driver/amdgpu-mcpu.cl | 4 -
 clang/test/Driver/cuda-bad-arch.cu | 2 +-
 clang/test/Driver/hip-macros.hip | 10 +-
 .../test/Misc/target-invalid-cpu-note/nvptx.c | 2 -
 ... => builtins-amdgcn-error-gfx942-param.cl} | 2 +-
 .../builtins-amdgcn-error-gfx950.cl | 2 +-
 ...0-err.cl => builtins-amdgcn-gfx942-err.cl} | 14 +--
 20 files changed, 91 insertions(+), 113 deletions(-)
 rename clang/test/CodeGenOpenCL/{builtins-amdgcn-gfx940.cl => builtins-amdgcn-gfx942.cl} (98%)
 rename clang/test/CodeGenOpenCL/{builtins-fp-atomics-gfx940.cl => builtins-fp-atomics-gfx942.cl} (84%)
 rename clang/test/SemaOpenCL/{builtins-amdgcn-error-gfx940-param.cl => builtins-amdgcn-error-gfx942-param.cl} (99%)
 rename clang/test/SemaOpenCL/{builtins-amdgcn-gfx940-err.cl => builtins-amdgcn-gfx942-err.cl} (81%)

diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
     GFX(90a), // gfx90a
     GFX(90c), // gfx90c
     {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-    GFX(940), // gfx940
-    GFX(941), // gfx941
     GFX(942), // gfx942
     GFX(950), // gfx950
     {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
       case OffloadArch::GFX90a:
       case OffloadArch::GFX90c:
       case OffloadArch::GFX9_4_GENERIC:
-      case OffloadArch::GFX940:
-      case OffloadArch::GFX941:
       case OffloadArch::GFX942:
       case OffloadArch::GFX950:
       case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
       case OffloadArch::GFX90a:
       case OffloadArch::GFX90c:
       case OffloadArch::GFX9_4_GENERIC:
-      case OffloadArch::GFX940:
-      case OffloadArch::GFX941:
       case OffloadArch::GFX942:
       case OffloadArch::GFX950:
       case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.c
[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)
ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/126763?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#126763** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/126763?utm_source=stack-comment-view-in-graphite)
* **#126762**
* **#125836**
* **#125827**
* **#125826**
* **#125711**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/126763
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/126762?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#126763**
* **#126762** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/126762?utm_source=stack-comment-view-in-graphite)
* **#125836**
* **#125827**
* **#125826**
* **#125711**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add OMP Mapper field to MapInfoOp (PR #120994)
https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/120994

>From 57858d2e19897a72057464bd33311d2cd4d4f156 Mon Sep 17 00:00:00 2001
From: Akash Banerjee
Date: Mon, 23 Dec 2024 20:53:47 +
Subject: [PATCH 1/2] Add mapper field to mapInfoOp.

---
 flang/lib/Lower/OpenMP/Utils.cpp | 3 ++-
 flang/lib/Lower/OpenMP/Utils.h | 3 ++-
 flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp | 5 -
 flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp | 1 +
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 2 ++
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 2 +-
 mlir/test/Dialect/OpenMP/ops.mlir | 4 ++--
 7 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp
index 35722fa7d1b12..fa1975dac789b 100644
--- a/flang/lib/Lower/OpenMP/Utils.cpp
+++ b/flang/lib/Lower/OpenMP/Utils.cpp
@@ -125,7 +125,7 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
                 llvm::ArrayRef members, mlir::ArrayAttr membersIndex,
                 uint64_t mapType,
                 mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
-                bool partialMap) {
+                bool partialMap, mlir::FlatSymbolRefAttr mapperId) {
   if (auto boxTy = llvm::dyn_cast(baseAddr.getType())) {
     baseAddr = builder.create(loc, baseAddr);
     retTy = baseAddr.getType();
@@ -144,6 +144,7 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   mlir::omp::MapInfoOp op = builder.create(
       loc, retTy, baseAddr, varType, varPtrPtr, members, membersIndex, bounds,
       builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
+      mapperId,
       builder.getAttr(mapCaptureType),
       builder.getStringAttr(name), builder.getBoolAttr(partialMap));
   return op;
diff --git a/flang/lib/Lower/OpenMP/Utils.h b/flang/lib/Lower/OpenMP/Utils.h
index f2e378443e5f2..3943eb633b04e 100644
--- a/flang/lib/Lower/OpenMP/Utils.h
+++ b/flang/lib/Lower/OpenMP/Utils.h
@@ -116,7 +116,8 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
                 llvm::ArrayRef members, mlir::ArrayAttr membersIndex,
                 uint64_t mapType,
                 mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
-                bool partialMap = false);
+                bool partialMap = false,
+                mlir::FlatSymbolRefAttr mapperId = mlir::FlatSymbolRefAttr());
 void insertChildMapInfoIntoParent(
     Fortran::lower::AbstractConverter &converter,
diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
index e7c1d1d9d560f..beea7543e54b3 100644
--- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
@@ -184,6 +184,7 @@ class MapInfoFinalizationPass
         /*members=*/mlir::SmallVector{},
         /*membersIndex=*/mlir::ArrayAttr{}, bounds,
         builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
+        /*mapperId*/ mlir::FlatSymbolRefAttr(),
         builder.getAttr(
             mlir::omp::VariableCaptureKind::ByRef),
         /*name=*/builder.getStringAttr(""),
@@ -329,7 +330,8 @@ class MapInfoFinalizationPass
         builder.getIntegerAttr(
             builder.getIntegerType(64, false),
             getDescriptorMapType(op.getMapType().value_or(0), target)),
-        op.getMapCaptureTypeAttr(), op.getNameAttr(),
+        /*mapperId*/ mlir::FlatSymbolRefAttr(), op.getMapCaptureTypeAttr(),
+        op.getNameAttr(),
         /*partial_map=*/builder.getBoolAttr(false));
     op.replaceAllUsesWith(newDescParentMapOp.getResult());
     op->erase();
@@ -623,6 +625,7 @@ class MapInfoFinalizationPass
           /*members=*/mlir::ValueRange{},
           /*members_index=*/mlir::ArrayAttr{},
           /*bounds=*/bounds, op.getMapTypeAttr(),
+          /*mapperId*/ mlir::FlatSymbolRefAttr(),
           builder.getAttr(
               mlir::omp::VariableCaptureKind::ByRef),
           builder.getStringAttr(op.getNameAttr().strref() + "." +
diff --git a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
index 963ae863c1fc5..97ea463a3c495 100644
--- a/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapsForPrivatizedSymbols.cpp
@@ -91,6 +91,7 @@ class MapsForPrivatizedSymbolsPass
         /*bounds=*/ValueRange{},
         builder.getIntegerAttr(builder.getIntegerType(64, /*isSigned=*/false),
                                mapTypeTo),
+        /*mapperId*/ mlir::FlatSymbolRefAttr(),
         builder.getAttr(
             omp::VariableCaptureKind::ByRef),
         StringAttr(), builder.getBoolAttr(false));
diff --git a/mlir/include/mlir/
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
llvmbot wrote:

@llvm/pr-subscribers-clang

Author: Fabian Ritter (ritter-x2a)

Changes

gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be removed without changes in the llvm directory. The target-invalid-cpu-note/amdgcn.c test is not included here since it tests a list of targets that is defined in llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

---

Patch is 41.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126762.diff

20 Files Affected:

- (modified) clang/include/clang/Basic/Cuda.h (-2)
- (modified) clang/lib/Basic/Cuda.cpp (-2)
- (modified) clang/lib/Basic/Targets/NVPTX.cpp (-2)
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (-2)
- (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1)
- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (-4)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+1-1)
- (renamed) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx942.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gws-insts.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-mfma.cl (+55-55)
- (renamed) clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx942.cl (+17-17)
- (modified) clang/test/Driver/amdgpu-macros.cl (-2)
- (modified) clang/test/Driver/amdgpu-mcpu.cl (-4)
- (modified) clang/test/Driver/cuda-bad-arch.cu (+1-1)
- (modified) clang/test/Driver/hip-macros.hip (+4-6)
- (modified) clang/test/Misc/target-invalid-cpu-note/nvptx.c (-2)
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx942-param.cl (+1-1)
- (modified) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl (+1-1)
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-gfx942-err.cl (+7-7)

diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
   GFX90a,
   GFX90c,
   GFX9_4_GENERIC,
-  GFX940,
-  GFX941,
   GFX942,
   GFX950,
   GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
     GFX(90a), // gfx90a
     GFX(90c), // gfx90c
     {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-    GFX(940), // gfx940
-    GFX(941), // gfx941
     GFX(942), // gfx942
     GFX(950), // gfx950
     {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
       case OffloadArch::GFX90a:
       case OffloadArch::GFX90c:
       case OffloadArch::GFX9_4_GENERIC:
-      case OffloadArch::GFX940:
-      case OffloadArch::GFX941:
       case OffloadArch::GFX942:
       case OffloadArch::GFX950:
       case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
       case OffloadArch::GFX90a:
       case OffloadArch::GFX90c:
       case OffloadArch::GFX9_4_GENERIC:
-      case OffloadArch::GFX940:
-      case OffloadArch::GFX941:
       case OffloadArch::GFX942:
       case OffloadArch::GFX950:
       case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -11,7 +11,7 @@
 // RUN: -fnative-half-arguments-and-returns | FileCheck -check-prefix=SAFE %s
 // RUN: %clang_cc1 -x hip %s -O3 -S -o - -triple=amdgcn-amd-amdhsa \
-// RUN: -fcuda-is-device -target-cpu gfx940 -fnative-half-type \
+// RUN: -fcuda-is-device -target-cpu gfx942 -fnative-half-type \
 // RUN: -fnative-half-arguments-and-returns -munsafe-fp-atomics \
 // RUN: | FileCheck -check-prefix=UNSAFE %s
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 633f1dec5e370..d12dcead6fadf 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -29,8 +29,6 @@
 // RUN: %clang_cc1 -triple amdgcn
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
llvmbot wrote:

@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)

Full patch: https://github.com/llvm/llvm-project/pull/126762.diff
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
https://github.com/shiltian approved this pull request.

https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
llvmbot wrote:

@llvm/pr-subscribers-clang-driver

Author: Fabian Ritter (ritter-x2a)

Changes

gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be removed without changes in the llvm directory. The target-invalid-cpu-note/amdgcn.c test is not included here since it tests a list of targets that is defined in llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631

---

Patch is 41.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126762.diff

20 Files Affected:

- (modified) clang/include/clang/Basic/Cuda.h (-2)
- (modified) clang/lib/Basic/Cuda.cpp (-2)
- (modified) clang/lib/Basic/Targets/NVPTX.cpp (-2)
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (-2)
- (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+1-1)
- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (-4)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+1-1)
- (renamed) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx942.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx950-err.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gws-insts.cl (+1-1)
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-mfma.cl (+55-55)
- (renamed) clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx942.cl (+17-17)
- (modified) clang/test/Driver/amdgpu-macros.cl (-2)
- (modified) clang/test/Driver/amdgpu-mcpu.cl (-4)
- (modified) clang/test/Driver/cuda-bad-arch.cu (+1-1)
- (modified) clang/test/Driver/hip-macros.hip (+4-6)
- (modified) clang/test/Misc/target-invalid-cpu-note/nvptx.c (-2)
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx942-param.cl (+1-1)
- (modified) clang/test/SemaOpenCL/builtins-amdgcn-error-gfx950.cl (+1-1)
- (renamed) clang/test/SemaOpenCL/builtins-amdgcn-gfx942-err.cl (+7-7)

``diff
diff --git a/clang/include/clang/Basic/Cuda.h b/clang/include/clang/Basic/Cuda.h
index f33ba46233a7a..793cab1f4e84a 100644
--- a/clang/include/clang/Basic/Cuda.h
+++ b/clang/include/clang/Basic/Cuda.h
@@ -106,8 +106,6 @@ enum class OffloadArch {
 GFX90a,
 GFX90c,
 GFX9_4_GENERIC,
- GFX940,
- GFX941,
 GFX942,
 GFX950,
 GFX10_1_GENERIC,
diff --git a/clang/lib/Basic/Cuda.cpp b/clang/lib/Basic/Cuda.cpp
index 1bfec0b37c5ee..f45fb0eca3714 100644
--- a/clang/lib/Basic/Cuda.cpp
+++ b/clang/lib/Basic/Cuda.cpp
@@ -124,8 +124,6 @@ static const OffloadArchToStringMap arch_names[] = {
 GFX(90a), // gfx90a
 GFX(90c), // gfx90c
 {OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"},
-GFX(940), // gfx940
-GFX(941), // gfx941
 GFX(942), // gfx942
 GFX(950), // gfx950
 {OffloadArch::GFX10_1_GENERIC, "gfx10-1-generic", "compute_amdgcn"},
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 7d13c1f145440..547cf3dfa2be7 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -211,8 +211,6 @@ void NVPTXTargetInfo::getTargetDefines(const LangOptions &Opts,
 case OffloadArch::GFX90a:
 case OffloadArch::GFX90c:
 case OffloadArch::GFX9_4_GENERIC:
- case OffloadArch::GFX940:
- case OffloadArch::GFX941:
 case OffloadArch::GFX942:
 case OffloadArch::GFX950:
 case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index c13928f61a748..826ec4da8ea28 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -2302,8 +2302,6 @@ void CGOpenMPRuntimeGPU::processRequiresDirective(const OMPRequiresDecl *D) {
 case OffloadArch::GFX90a:
 case OffloadArch::GFX90c:
 case OffloadArch::GFX9_4_GENERIC:
- case OffloadArch::GFX940:
- case OffloadArch::GFX941:
 case OffloadArch::GFX942:
 case OffloadArch::GFX950:
 case OffloadArch::GFX10_1_GENERIC:
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 47fa3967fe237..37fca614c3111 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -11,7 +11,7 @@
 // RUN: -fnative-half-arguments-and-returns | FileCheck -check-prefix=SAFE %s
 // RUN: %clang_cc1 -x hip %s -O3 -S -o - -triple=amdgcn-amd-amdhsa \
-// RUN: -fcuda-is-device -target-cpu gfx940 -fnative-half-type \
+// RUN: -fcuda-is-device -target-cpu gfx942 -fnative-half-type \
 // RUN: -fnative-half-arguments-and-returns -munsafe-fp-atomics \
 // RUN: | FileCheck -check-prefix=UNSAFE %s
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 633f1dec5e370..d12dcead6fadf 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -29,8 +29,6 @@
 // RUN: %clang_cc1 -triple
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
https://github.com/ritter-x2a ready_for_review https://github.com/llvm/llvm-project/pull/126762 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
@@ -106,8 +106,6 @@ enum class OffloadArch {
 GFX90a,
 GFX90c,
 GFX9_4_GENERIC,
- GFX940,
- GFX941,

jhuber6 wrote:

Seems bizarre to just fully remove support when we still accept things like `gfx600` to this day. As far as I understand, these are basically just being replaced by `gfx942`. Would it be at all possible to do `GFX940 = GFX942`?

https://github.com/llvm/llvm-project/pull/126762
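The `GFX940 = GFX942` idea from the comment above can be illustrated with a toy enum. This is only a sketch under stated assumptions: `ToyArch` and `toyArchName` are hypothetical names made up for illustration, not clang's actual `OffloadArch` or any of its helpers.

```cpp
#include <cassert>
#include <string_view>

// Toy stand-in for an architecture enum; NOT clang's real OffloadArch.
enum class ToyArch {
  GFX90a,          // value 0
  GFX942,          // value 1
  GFX940 = GFX942, // alias: same enumerator value as GFX942
  GFX941 = GFX942, // alias: same enumerator value as GFX942
  GFX950,          // continues from the previous enumerator, so value 2
};

// Hypothetical helper: because aliases share a value, only the canonical
// enumerator appears in the switch (duplicate case values would not compile).
constexpr std::string_view toyArchName(ToyArch A) {
  switch (A) {
  case ToyArch::GFX90a:
    return "gfx90a";
  case ToyArch::GFX942:
    return "gfx942";
  case ToyArch::GFX950:
    return "gfx950";
  }
  return "unknown";
}
```

In this sketch existing `switch` statements need no extra cases, since an alias is indistinguishable from its target at the value level. Whether a string such as `gfx940` would still be accepted depends on a separate name table, which the sketch does not model.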
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
@@ -106,8 +106,6 @@ enum class OffloadArch {
 GFX90a,
 GFX90c,
 GFX9_4_GENERIC,
- GFX940,
- GFX941,

shiltian wrote:

I think the discussion yesterday decided to simply just remove them w/o more explanation and aliases.

https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)
@@ -1003,6 +1006,20 @@ void ClauseProcessor::processMapObjects(
 }
 }
+if (!mapperIdName.empty()) {

TIFitis wrote:

The `if` is inside a loop, and I want it to execute only on the first iteration, so this replacement won't help here.

https://github.com/llvm/llvm-project/pull/121001
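The "run the `if` only on the first iteration" pattern described above can be sketched outside of flang as follows. This is a simplified illustration, assuming plain strings in place of the MLIR symbol attributes; `MapResult`, `mapObjects`, and the `"mangled."` prefix are hypothetical stand-ins, not flang APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the per-object entries plus a counter showing
// how many times the name was resolved.
struct MapResult {
  std::vector<std::string> entries;
  int resolveCalls;
};

// mapperIdName is taken by value so the function can clear its own copy.
MapResult mapObjects(const std::vector<std::string> &objects,
                     std::string mapperIdName) {
  MapResult result{{}, 0};
  std::string mapperId; // declared outside the loop, so it persists
  for (const std::string &object : objects) {
    if (!mapperIdName.empty()) {
      // Stand-in for the name mangling and symbol lookup.
      mapperId = "mangled." + mapperIdName;
      ++result.resolveCalls;
      // Clearing the name makes the condition false on later iterations.
      mapperIdName.clear();
    }
    result.entries.push_back(object + ":" + mapperId);
  }
  return result;
}
```

The resolved id is computed once and then reused for every object, which is why a rewrite that re-evaluates the condition independently per iteration would change behavior.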
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)
@@ -0,0 +1,85 @@ +! This test checks lowering of OpenMP declare mapper Directive. + +! RUN: split-file %s %t +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-1.f90 -o - | FileCheck %t/omp-declare-mapper-1.f90 +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 + +!--- omp-declare-mapper-1.f90 +subroutine declare_mapper_1 + integer, parameter :: nvals = 250 + type my_type + integer :: num_vals + integer, allocatable :: values(:) + end type + + type my_type2 + type(my_type):: my_type_var + type(my_type):: temp + real, dimension(nvals) :: unmapped + real, dimension(nvals) :: arr + end type + type(my_type2):: t + real :: x, y(nvals) + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): + !CHECK:%[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) + !CHECK:%[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> + !CHECK:%[[VAL_3:.*]] = fir.load %[[VAL_2]] : !fir.ref>>> + !CHECK:%[[VAL_4:.*]] = fir.box_addr %[[VAL_3]] : (!fir.box>>) -> !fir.heap> + !CHECK:%[[VAL_5:.*]] = arith.constant 0 : index + !CHECK:%[[VAL_6:.*]]:3 = fir.box_dims %[[VAL_3]], %[[VAL_5]] : (!fir.box>>, index) -> (index, index, index) + !CHECK:%[[VAL_7:.*]] = arith.constant 0 : index + !CHECK:%[[VAL_8:.*]] = arith.constant 1 : index + !CHECK:%[[VAL_9:.*]] = arith.constant 1 : index + !CHECK:%[[VAL_10:.*]] = arith.subi %[[VAL_9]], %[[VAL_6]]#0 : index + !CHECK:%[[VAL_11:.*]] = hlfir.designate %[[VAL_1]]#0{"num_vals"} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref + !CHECK:%[[VAL_12:.*]] = fir.load %[[VAL_11]] : !fir.ref + !CHECK:%[[VAL_13:.*]] = 
fir.convert %[[VAL_12]] : (i32) -> i64 + !CHECK:%[[VAL_14:.*]] = fir.convert %[[VAL_13]] : (i64) -> index + !CHECK:%[[VAL_15:.*]] = arith.subi %[[VAL_14]], %[[VAL_6]]#0 : index + !CHECK:%[[VAL_16:.*]] = omp.map.bounds lower_bound(%[[VAL_10]] : index) upper_bound(%[[VAL_15]] : index) extent(%[[VAL_6]]#1 : index) stride(%[[VAL_8]] : index) start_idx(%[[VAL_6]]#0 : index) + !CHECK:%[[VAL_17:.*]] = arith.constant 1 : index + !CHECK:%[[VAL_18:.*]] = fir.coordinate_of %[[VAL_1]]#0, %[[VAL_17]] : (!fir.ref<[[MY_TYPE]]>, index) -> !fir.ref>>> + !CHECK:%[[VAL_19:.*]] = fir.box_offset %[[VAL_18]] base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> + !CHECK:%[[VAL_20:.*]] = omp.map.info var_ptr(%[[VAL_18]] : !fir.ref>>>, i32) var_ptr_ptr(%[[VAL_19]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[VAL_16]]) -> !fir.llvm_ptr>> {name = ""} + !CHECK:%[[VAL_21:.*]] = omp.map.info var_ptr(%[[VAL_18]] : !fir.ref>>>, !fir.box>>) map_clauses(to) capture(ByRef) -> !fir.ref>>> {name = "var%[[VAL_22:.*]](1:var%[[VAL_23:.*]])"} + !CHECK:%[[VAL_24:.*]] = omp.map.info var_ptr(%[[VAL_1]]#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) members(%[[VAL_21]], %[[VAL_20]] : [1], [1, 0] : !fir.ref>>>, !fir.llvm_ptr>>) -> !fir.ref<[[MY_TYPE]]> {name = "var"} + !CHECK:omp.declare_mapper.info map_entries(%[[VAL_24]], %[[VAL_21]], %[[VAL_20]] : !fir.ref<[[MY_TYPE]]>, !fir.ref>>>, !fir.llvm_ptr>>) + !CHECK: } + !$omp declare mapper (my_type :: var) map (var, var%values (1:var%num_vals)) +end subroutine declare_mapper_1 + +!--- omp-declare-mapper-2.f90 +subroutine declare_mapper_2 + integer, parameter :: nvals = 250 + type my_type + integer :: num_vals + integer, allocatable :: values(:) + end type + + type my_type2 + type(my_type):: my_type_var + type(my_type):: temp + real, dimension(nvals) :: unmapped + real, dimension(nvals) :: arr + end type + type(my_type2):: t + real :: x, y(nvals) + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_2my_mapper]] : 
[[MY_TYPE:!fir\.type<_QFdeclare_mapper_2Tmy_type2\{my_type_var:!fir\.type<_QFdeclare_mapper_2Tmy_type\{num_vals:i32,values:!fir\.box>>\}>,temp:!fir\.type<_QFdeclare_mapper_2Tmy_type\{num_vals:i32,values:!fir\.box>>\}>,unmapped:!fir\.array<250xf32>,arr:!fir\.array<250xf32>\}>]] { + !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): + !CHECK:%[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_2Ev"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
https://github.com/jhuber6 approved this pull request.

Okay, so I guess we can delete these because the cards that corresponded to these were never fully released as I understand it.

https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)
https://github.com/arsenm requested changes to this pull request.

Should just leave the subtarget feature name alone. It's not worth the trouble, and this will now start spewing warnings on old IR (due to the unnecessary target-features spam that clang should stop emitting). It really should have been named 94-insts, but I think it's best to leave it alone.

https://github.com/llvm/llvm-project/pull/126763
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)
@@ -1003,6 +1006,20 @@ void ClauseProcessor::processMapObjects(
 }
 }
+if (!mapperIdName.empty()) {

skatrak wrote:

Never mind, thank you for bringing to my attention that this is inside of a loop.

https://github.com/llvm/llvm-project/pull/121001
[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)
https://github.com/ritter-x2a ready_for_review https://github.com/llvm/llvm-project/pull/126763
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP custom mappers in map clause (PR #121001)
https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/121001 >From 107f59d06dcb0523e373682b5879bb79c824bb2f Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 23 Dec 2024 21:13:42 + Subject: [PATCH 1/5] Add flang lowering changes for mapper field in map clause. --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 32 + flang/lib/Lower/OpenMP/ClauseProcessor.h| 3 +- flang/test/Lower/OpenMP/Todo/map-mapper.f90 | 16 --- flang/test/Lower/OpenMP/map-mapper.f90 | 23 +++ 4 files changed, 52 insertions(+), 22 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/map-mapper.f90 create mode 100644 flang/test/Lower/OpenMP/map-mapper.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index febc6adcf9d6f..467a0dcebf2b8 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -969,8 +969,10 @@ void ClauseProcessor::processMapObjects( llvm::omp::OpenMPOffloadMappingFlags mapTypeBits, std::map &parentMemberIndices, llvm::SmallVectorImpl &mapVars, -llvm::SmallVectorImpl &mapSyms) const { +llvm::SmallVectorImpl &mapSyms, +std::string mapperIdName) const { fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::FlatSymbolRefAttr mapperId; for (const omp::Object &object : objects) { llvm::SmallVector bounds; @@ -1003,6 +1005,20 @@ void ClauseProcessor::processMapObjects( } } +if (!mapperIdName.empty()) { + if (mapperIdName == "default") { +auto &typeSpec = object.sym()->owner().IsDerivedType() + ? 
*object.sym()->owner().derivedTypeSpec() + : object.sym()->GetType()->derivedTypeSpec(); +mapperIdName = typeSpec.name().ToString() + ".default"; +mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + } + assert(converter.getMLIRSymbolTable()->lookup(mapperIdName) && + "mapper not found"); + mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), + mapperIdName); + mapperIdName.clear(); +} // Explicit map captures are captured ByRef by default, // optimisation passes may alter this to ByCopy or other capture // types to optimise @@ -1016,7 +1032,8 @@ void ClauseProcessor::processMapObjects( static_cast< std::underlying_type_t>( mapTypeBits), -mlir::omp::VariableCaptureKind::ByRef, baseOp.getType()); +mlir::omp::VariableCaptureKind::ByRef, baseOp.getType(), false, +mapperId); if (parentObj.has_value()) { parentMemberIndices[parentObj.value()].addChildIndexAndMapToParent( @@ -1047,6 +1064,7 @@ bool ClauseProcessor::processMap( const auto &[mapType, typeMods, mappers, iterator, objects] = clause.t; llvm::omp::OpenMPOffloadMappingFlags mapTypeBits = llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_NONE; +std::string mapperIdName; // If the map type is specified, then process it else Tofrom is the // default. 
Map::MapType type = mapType.value_or(Map::MapType::Tofrom); @@ -1090,13 +1108,17 @@ bool ClauseProcessor::processMap( "Support for iterator modifiers is not implemented yet"); } if (mappers) { - TODO(currentLocation, - "Support for mapper modifiers is not implemented yet"); + assert(mappers->size() == 1 && "more than one mapper"); + mapperIdName = mappers->front().v.id().symbol->name().ToString(); + if (mapperIdName != "default") +mapperIdName = converter.mangleName( +mapperIdName, mappers->front().v.id().symbol->owner()); } processMapObjects(stmtCtx, clauseLocation, std::get(clause.t), mapTypeBits, - parentMemberIndices, result.mapVars, *ptrMapSyms); + parentMemberIndices, result.mapVars, *ptrMapSyms, + mapperIdName); }; bool clauseFound = findRepeatableClause(process); diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index e05f66c766684..2b319e890a5ad 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -175,7 +175,8 @@ class ClauseProcessor { llvm::omp::OpenMPOffloadMappingFlags mapTypeBits, std::map &parentMemberIndices, llvm::SmallVectorImpl &mapVars, - llvm::SmallVectorImpl &mapSyms) const; + llvm::SmallVectorImpl &mapSyms, + std::string mapperIdName = "") const; lower::AbstractConverter &converter; semantics::SemanticsContext &semaCtx; diff --git a/flang/test/Lower/OpenMP/Todo/map-mapper.f90 b/flang/test/Lower/OpenMP/Todo/map-mapper.f90 deleted file mode 100644 index 9554ffd5fda7b..0 --- a/flang/test/Lowe
[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)
https://github.com/AmrDeveloper updated https://github.com/llvm/llvm-project/pull/126607 >From 8886b33981f73da04adadb3e02a740b8e376e042 Mon Sep 17 00:00:00 2001 From: AmrDeveloper Date: Mon, 10 Feb 2025 23:03:15 +0100 Subject: [PATCH 1/2] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 --- llvm/docs/ReleaseNotes.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index 44a0b17d6a07b..c92338345f1cb 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -460,6 +460,8 @@ Changes to the LLVM tools `--localize-symbol`, `--localize-symbols`, `--skip-symbol`, `--skip-symbols`. +* llvm-objcopy now prints the correct file path in the error message when the output file specified by --dump-section cannot be opened. + Changes to LLDB - >From 6f5ced04f9fb5769f367a26b306a3ca961324926 Mon Sep 17 00:00:00 2001 From: AmrDeveloper Date: Tue, 11 Feb 2025 17:53:03 +0100 Subject: [PATCH 2/2] Add quote around option name --- llvm/docs/ReleaseNotes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index c92338345f1cb..28908490b8f7c 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -460,7 +460,7 @@ Changes to the LLVM tools `--localize-symbol`, `--localize-symbols`, `--skip-symbol`, `--skip-symbols`. -* llvm-objcopy now prints the correct file path in the error message when the output file specified by --dump-section cannot be opened. +* llvm-objcopy now prints the correct file path in the error message when the output file specified by `--dump-section` cannot be opened. Changes to LLDB - ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)
@@ -460,6 +460,8 @@ Changes to the LLVM tools
 `--localize-symbol`, `--localize-symbols`, `--skip-symbol`, `--skip-symbols`.
+* llvm-objcopy now prints the correct file path in the error message when the output file specified by --dump-section cannot be opened.

AmrDeveloper wrote:

I changed it to `--dump-section` to match the other options in the same file.

https://github.com/llvm/llvm-project/pull/126607
[llvm-branch-commits] [clang] [flang] [llvm] [mlir] [MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (PR #124746)
@@ -3529,6 +3549,84 @@ static void genMapInfos(llvm::IRBuilderBase &builder,
 }
 }
+static llvm::Expected
+emitUserDefinedMapper(Operation *declMapperOp, llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation);
+
+static llvm::Expected
+getOrCreateUserDefinedMapperFunc(Operation *declMapperOp,
+ llvm::IRBuilderBase &builder,
+ LLVM::ModuleTranslation &moduleTranslation) {
+ static llvm::DenseMap userDefMapperMap;

TIFitis wrote:

Thanks for the suggestion, I've reworked this bit of code.

https://github.com/llvm/llvm-project/pull/124746
[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)
@@ -106,8 +106,6 @@ enum class OffloadArch {
 GFX90a,
 GFX90c,
 GFX9_4_GENERIC,
- GFX940,
- GFX941,

jhuber6 wrote:

So `--offload-arch=gfx940` will be a hard error after working at least since clang 16? That sounds very silly.

https://github.com/llvm/llvm-project/pull/126762
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Add conversion support from FIR to LLVM Dialect for OMP DeclareMapper (PR #121005)
https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/121005 >From 0cba204faac851d186470b74aab3601a987e0f2d Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 23 Dec 2024 21:50:03 + Subject: [PATCH 1/2] Add OpenMP to LLVM dialect conversion support for DeclareMapperOp. --- .../Fir/convert-to-llvm-openmp-and-fir.fir| 27 +-- .../Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp | 48 +++ .../OpenMPToLLVM/convert-to-llvmir.mlir | 13 + 3 files changed, 74 insertions(+), 14 deletions(-) diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir index 8e4e1fe824d9f..82f2aea3ad983 100644 --- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir +++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir @@ -936,9 +936,9 @@ func.func @omp_map_info_descriptor_type_conversion(%arg0 : !fir.ref>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} // CHECK: %[[DESC_MAP:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(always, delete) capture(ByRef) members(%[[MEMBER_MAP]] : [0] : !llvm.ptr) -> !llvm.ptr {name = ""} %2 = omp.map.info var_ptr(%arg0 : !fir.ref>>, !fir.box>) map_clauses(always, delete) capture(ByRef) members(%1 : [0] : !fir.llvm_ptr>) -> !fir.ref>> {name = ""} - // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr) + // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr) omp.target_exit_data map_entries(%2 : !fir.ref>>) - return + return } // - @@ -956,8 +956,8 @@ func.func @omp_map_info_derived_type_explicit_member_conversion(%arg0 : !fir.ref %3 = fir.field_index real, !fir.type<_QFderived_type{real:f32,array:!fir.array<10xi32>,int:i32}> %4 = fir.coordinate_of %arg0, %3 : (!fir.ref,int:i32}>>, !fir.field) -> !fir.ref // CHECK: %[[MAP_MEMBER_2:.*]] = omp.map.info var_ptr(%[[GEP_2]] : !llvm.ptr, f32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "dtype%real"} - %5 = omp.map.info 
var_ptr(%4 : !fir.ref, f32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "dtype%real"} - // CHECK: %[[MAP_PARENT:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<"_QFderived_type", (f32, array<10 x i32>, i32)>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER_1]], %[[MAP_MEMBER_2]] : [2], [0] : !llvm.ptr, !llvm.ptr) -> !llvm.ptr {name = "dtype", partial_map = true} + %5 = omp.map.info var_ptr(%4 : !fir.ref, f32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "dtype%real"} + // CHECK: %[[MAP_PARENT:.*]] = omp.map.info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<"_QFderived_type", (f32, array<10 x i32>, i32)>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER_1]], %[[MAP_MEMBER_2]] : [2], [0] : !llvm.ptr, !llvm.ptr) -> !llvm.ptr {name = "dtype", partial_map = true} %6 = omp.map.info var_ptr(%arg0 : !fir.ref,int:i32}>>, !fir.type<_QFderived_type{real:f32,array:!fir.array<10xi32>,int:i32}>) map_clauses(tofrom) capture(ByRef) members(%2, %5 : [2], [0] : !fir.ref, !fir.ref) -> !fir.ref,int:i32}>> {name = "dtype", partial_map = true} // CHECK: omp.target map_entries(%[[MAP_MEMBER_1]] -> %[[ARG_1:.*]], %[[MAP_MEMBER_2]] -> %[[ARG_2:.*]], %[[MAP_PARENT]] -> %[[ARG_3:.*]] : !llvm.ptr, !llvm.ptr, !llvm.ptr) { omp.target map_entries(%2 -> %arg1, %5 -> %arg2, %6 -> %arg3 : !fir.ref, !fir.ref, !fir.ref,int:i32}>>) { @@ -1275,3 +1275,22 @@ func.func @map_nested_dtype_alloca_mem2(%arg0 : !fir.ref { +omp.declare_mapper @my_mapper : !fir.type<_QFdeclare_mapperTmy_type{data:i32}> { +// CHECK: ^bb0(%[[VAL_0:.*]]: !llvm.ptr): +^bb0(%0: !fir.ref>): +// CHECK: %[[VAL_1:.*]] = llvm.mlir.constant(0 : i32) : i32 + %1 = fir.field_index data, !fir.type<_QFdeclare_mapperTmy_type{data:i32}> +// CHECK: %[[VAL_2:.*]] = llvm.getelementptr %[[VAL_0]][0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFdeclare_mapperTmy_type", (i32)> + %2 = fir.coordinate_of %0, %1 : (!fir.ref>, !fir.field) -> !fir.ref +// CHECK: %[[VAL_3:.*]] = omp.map.info 
var_ptr(%[[VAL_2]] : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "var%[[VAL_4:.*]]"} + %3 = omp.map.info var_ptr(%2 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "var%data"} +// CHECK: %[[VAL_5:.*]] = omp.map.info var_ptr(%[[VAL_0]] : !llvm.ptr, !llvm.struct<"_QFdeclare_mapperTmy_type", (i32)>) map_clauses(tofrom) capture(ByRef) members(%[[VAL_3]] : [0] : !llvm.ptr) -> !llvm.ptr {name = "var", partial_map = true} + %4 = omp.map.info var_ptr(%0 : !fir.ref>, !fir.type<_QFdeclare_mapperTmy_type{data:i32}>) map_clauses(tofrom) capture(ByRef) members(%3 : [0] : !fir.ref) -> !fir.ref> {name = "var", partial_map = true} +// CHECK: omp.declare_mapper_info map_entries(%[[VAL_5]], %[[VAL_3]] : !llvm.ptr, !llvm.ptr) + omp.declare_mappe
[llvm-branch-commits] [llvm] [AMDGPU] Remove dead function metadata after amdgpu-lower-kernel-arguments (PR #126147)
@@ -1,7 +1,10 @@
-; RUN: not --crash opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -passes='amdgpu-attributor,function(amdgpu-lower-kernel-arguments)' -amdgpu-kernarg-preload-count=16 -S < %s 2>&1 | FileCheck %s
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -passes='amdgpu-attributor,function(amdgpu-lower-kernel-arguments)' -amdgpu-kernarg-preload-count=16 -S < %s 2>&1 \
+; RUN: | FileCheck -implicit-check-not='declare {{.*}} !dbg' %s

arsenm wrote:

This is still a bit aggressive for check-not, and I'm not sure it supports regex. Can you simplify to just check-not=declare and explicitly check the few declares that are expected?

https://github.com/llvm/llvm-project/pull/126147
[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)
https://github.com/mydeveloperday approved this pull request. https://github.com/llvm/llvm-project/pull/126479
[llvm-branch-commits] [llvm] release/20.x: [VPlan] Only skip expansion for SCEVUnknown if it isn't an instruction. (#125235) (PR #126718)
https://github.com/fhahn approved this pull request.

LGTM, thanks, although I am surprised the bot requested a review from me. Maybe it should have asked @nikic as well, who reviewed the original PR?

https://github.com/llvm/llvm-project/pull/126718
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegAllocPriorityAdvisor analysis to NPM (PR #118462)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/118462
[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.3-compute \
+// RUN: -fnative-half-type -emit-llvm -disable-llvm-passes -o - %s | FileCheck %s
+
+// CHECK: %"struct.__cblayout_$Globals" = type { float, float, %struct.__cblayout_S }
+// CHECK: %struct.__cblayout_S = type { float }
+
+// CHECK-DAG: @"$Globals.cb" = external constant target("dx.CBuffer", %"struct.__cblayout_$Globals")
+// CHECK-DAG: @a = external addrspace(2) global float
+// CHECK-DAG: @g = external addrspace(2) global float
+// CHECK-DAG: @h = external addrspace(2) global %struct.__cblayout_S
+
+struct EmptyStruct {
+};
+
+struct S {
+ RWBuffer buf;
+ EmptyStruct es;
+ float ea[0];
+ float b;
+};
+
+float a;
+RWBuffer b;
+EmptyStruct c;
+float d[0];
+RWBuffer e[2];
+groupshared float f;
+float g;
+S h;
+
+RWBuffer Buf;
+
+[numthreads(4,1,1)]
+void main() {
+ Buf[0] = a;
+}
+
+// CHECK: !hlsl.cblayouts = !{![[S_LAYOUT:.*]], ![[CB_LAYOUT:.*]]}
+// CHECK: !hlsl.cbs = !{![[CB:.*]]}
+
+// CHECK: ![[S_LAYOUT]] = !{!"struct.__cblayout_S", i32 4, i32 0}
+// CHECK: ![[CB_LAYOUT]] = !{!"struct.__cblayout_$Globals", i32 20, i32 0, i32 4, i32 16}
+// CHECK: ![[CB]] = !{ptr @"$Globals.cb", ptr addrspace(2) @a, ptr addrspace(2) @g, ptr addrspace(2) @h}

spall wrote:

newline

https://github.com/llvm/llvm-project/pull/125807
[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)
@@ -5753,6 +5765,30 @@ void HLSLBufferDecl::addLayoutStruct(CXXRecordDecl *LS) {
 addDecl(LS);
 }
+void HLSLBufferDecl::addDefaultBufferDecl(Decl *D) {
+ assert(isImplicit() &&
+ "default decls can only be added to the implicit/default constant "
+ "buffer $Globals");
+ DefaultBufferDecls.push_back(D);
+}
+
+HLSLBufferDecl::buffer_decl_iterator
+HLSLBufferDecl::buffer_decls_begin() const {
+ return buffer_decl_iterator(llvm::iterator_range(DefaultBufferDecls.begin(),
+ DefaultBufferDecls.end()),
+ decl_range(decls_begin(), decls_end()));
+}
+
+HLSLBufferDecl::buffer_decl_iterator HLSLBufferDecl::buffer_decls_end() const {
+ return buffer_decl_iterator(
+ llvm::iterator_range(DefaultBufferDecls.end(), DefaultBufferDecls.end()),

spall wrote:

Is this supposed to say end, end?

https://github.com/llvm/llvm-project/pull/125807
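One possible reading of the `end, end` range questioned above: when an iterator over concatenated ranges stores a (current, end) pair per sub-range, its past-the-end value is the one in which every sub-range is the empty range (end, end). The sketch below shows that idea with a toy iterator over two `std::vector<int>`s. It is an assumption-laden illustration, not the `buffer_decl_iterator` from the patch; `ToyConcatIter` and `sumConcat` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Toy iterator that walks two int vectors back to back. Each half is kept
// as a (current, end) pair, so the past-the-end iterator is the one whose
// halves are both the empty range (end, end).
class ToyConcatIter {
  using It = std::vector<int>::const_iterator;
  It cur1_, end1_, cur2_, end2_;

public:
  ToyConcatIter(It b1, It e1, It b2, It e2)
      : cur1_(b1), end1_(e1), cur2_(b2), end2_(e2) {}

  // Read from the first half until it is exhausted, then from the second.
  int operator*() const { return cur1_ != end1_ ? *cur1_ : *cur2_; }

  ToyConcatIter &operator++() {
    if (cur1_ != end1_)
      ++cur1_;
    else
      ++cur2_;
    return *this;
  }

  bool operator!=(const ToyConcatIter &other) const {
    return cur1_ != other.cur1_ || cur2_ != other.cur2_;
  }
};

// Sums the elements of a followed by b using the toy iterator.
int sumConcat(const std::vector<int> &a, const std::vector<int> &b) {
  ToyConcatIter it(a.begin(), a.end(), b.begin(), b.end());
  ToyConcatIter end(a.end(), a.end(), b.end(), b.end()); // (end, end) halves
  int sum = 0;
  for (; it != end; ++it)
    sum += *it;
  return sum;
}
```

Under this design the "begin" iterator reaches exactly the (end, end) state after consuming both halves, which is what makes the comparison against `end` terminate.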
[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)
@@ -5072,6 +5080,20 @@ class HLSLBufferDecl final : public NamedDecl, public DeclContext {
 return static_cast(const_cast(DC));
 }
+ // Iterator for the buffer decls. Concatenates the list of decls parented

spall wrote:

I just want to clarify what this comment says. Are the child decls of this HLSLBufferDecl concatenated with the list of default buffer decls? Does the order matter?

https://github.com/llvm/llvm-project/pull/125807
[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)
@@ -159,11 +159,16 @@ class SemaHLSL : public SemaBase {
 // List of all resource bindings
 ResourceBindings Bindings;
+ // default constant buffer $Globals
+ HLSLBufferDecl *DefaultCBuffer;
+
 private:
 void collectResourcesOnVarDecl(VarDecl *D);
 void collectResourcesOnUserRecordDecl(const VarDecl *VD,
 const RecordType *RT);
 void processExplicitBindingsOnDecl(VarDecl *D);
+
+ void diagnoseAvailabilityViolations(TranslationUnitDecl *TU);

spall wrote:

Why do you want this to be private now?

https://github.com/llvm/llvm-project/pull/125807
[llvm-branch-commits] [clang] [HLSL] Implement default constant buffer `$Globals` (PR #125807)
@@ -286,10 +286,7 @@ void CGHLSLRuntime::emitBufferGlobalsAndMetadata(const HLSLBufferDecl *BufDecl,
 .str( && "layout type does not match the converted element type");
-// there might be resources inside the used defined structs
-if (VDTy->isStructureType() && VDTy->isHLSLIntangibleType())
- // FIXME: handle resources in cbuffer structs
- llvm_unreachable("resources in cbuffer are not supported yet");
+// FIXME: handle resources in cbuffer user-defined structs

spall wrote:

Is this future work, not a reminder for this PR?

https://github.com/llvm/llvm-project/pull/125807
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
llvmbot wrote: @llvm/pr-subscribers-offload Author: None (llvmbot) Changes Backport baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0 Requested by: @jhuber6 --- Full diff: https://github.com/llvm/llvm-project/pull/126795.diff 2 Files Affected: - (modified) offload/plugins-nextgen/common/include/RPC.h (+9-3) - (modified) offload/plugins-nextgen/common/src/RPC.cpp (+4-1) ``diff diff --git a/offload/plugins-nextgen/common/include/RPC.h b/offload/plugins-nextgen/common/include/RPC.h index 42fca4aa4aebc..08556f15a76bf 100644 --- a/offload/plugins-nextgen/common/include/RPC.h +++ b/offload/plugins-nextgen/common/include/RPC.h @@ -72,6 +72,9 @@ struct RPCServerTy { /// Array of associated devices. These must be alive as long as the server is. std::unique_ptr Devices; + /// Mutex that guards accesses to the buffers and device array. + std::mutex BufferMutex{}; + /// A helper class for running the user thread that handles the RPC interface. /// Because we only need to check the RPC server while any kernels are /// working, we track submission / completion events to allow the thread to @@ -90,6 +93,9 @@ struct RPCServerTy { std::condition_variable CV; std::mutex Mutex; +/// A reference to the main server's mutex. +std::mutex &BufferMutex; + /// A reference to all the RPC interfaces that the server is handling. llvm::ArrayRef Buffers; @@ -98,9 +104,9 @@ struct RPCServerTy { /// Initialize the worker thread to run in the background. 
ServerThread(void *Buffers[], plugin::GenericDeviceTy *Devices[], - size_t Length) -: Running(false), NumUsers(0), CV(), Mutex(), Buffers(Buffers, Length), - Devices(Devices, Length) {} + size_t Length, std::mutex &BufferMutex) +: Running(false), NumUsers(0), CV(), Mutex(), BufferMutex(BufferMutex), + Buffers(Buffers, Length), Devices(Devices, Length) {} ~ServerThread() { assert(!Running && "Thread not shut down explicitly\n"); } diff --git a/offload/plugins-nextgen/common/src/RPC.cpp b/offload/plugins-nextgen/common/src/RPC.cpp index e6750a540b391..eb305736d6264 100644 --- a/offload/plugins-nextgen/common/src/RPC.cpp +++ b/offload/plugins-nextgen/common/src/RPC.cpp @@ -131,6 +131,7 @@ void RPCServerTy::ServerThread::run() { Lock.unlock(); while (NumUsers.load(std::memory_order_relaxed) > 0 && Running.load(std::memory_order_relaxed)) { + std::lock_guard Lock(BufferMutex); for (const auto &[Buffer, Device] : llvm::zip_equal(Buffers, Devices)) { if (!Buffer || !Device) continue; @@ -149,7 +150,7 @@ RPCServerTy::RPCServerTy(plugin::GenericPluginTy &Plugin) Devices(std::make_unique( Plugin.getNumDevices())), Thread(new ServerThread(Buffers.get(), Devices.get(), - Plugin.getNumDevices())) {} + Plugin.getNumDevices(), BufferMutex)) {} llvm::Error RPCServerTy::startThread() { Thread->startThread(); @@ -190,6 +191,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy &Device, if (auto Err = Device.dataSubmit(ClientGlobal.getPtr(), &client, sizeof(rpc::Client), nullptr)) return Err; + std::lock_guard Lock(BufferMutex); Buffers[Device.getDeviceId()] = RPCBuffer; Devices[Device.getDeviceId()] = &Device; @@ -197,6 +199,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy &Device, } Error RPCServerTy::deinitDevice(plugin::GenericDeviceTy &Device) { + std::lock_guard Lock(BufferMutex); Device.free(Buffers[Device.getDeviceId()], TARGET_ALLOC_HOST); Buffers[Device.getDeviceId()] = nullptr; Devices[Device.getDeviceId()] = nullptr; `` 
https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
https://github.com/jplehr approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/126795 Backport baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0 Requested by: @jhuber6 >From c03f46f2f0eac60ee407a6c645cfdb62e97fa77b Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 11 Feb 2025 14:57:31 -0600 Subject: [PATCH] [Offload] Properly guard modifications to the RPC device array (#126790) Summary: If the user deallocates an RPC device this can sometimes fail if the RPC server is still running. This will happen if the modification happens while the server is still checking it. This patch adds a mutex to guard modifications to it. (cherry picked from commit baf7a3c1e561ff7e3f7da2261ce1012c4f2ba1c0) --- offload/plugins-nextgen/common/include/RPC.h | 12 +--- offload/plugins-nextgen/common/src/RPC.cpp | 5 - 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/offload/plugins-nextgen/common/include/RPC.h b/offload/plugins-nextgen/common/include/RPC.h index 42fca4aa4aebc..08556f15a76bf 100644 --- a/offload/plugins-nextgen/common/include/RPC.h +++ b/offload/plugins-nextgen/common/include/RPC.h @@ -72,6 +72,9 @@ struct RPCServerTy { /// Array of associated devices. These must be alive as long as the server is. std::unique_ptr Devices; + /// Mutex that guards accesses to the buffers and device array. + std::mutex BufferMutex{}; + /// A helper class for running the user thread that handles the RPC interface. /// Because we only need to check the RPC server while any kernels are /// working, we track submission / completion events to allow the thread to @@ -90,6 +93,9 @@ struct RPCServerTy { std::condition_variable CV; std::mutex Mutex; +/// A reference to the main server's mutex. +std::mutex &BufferMutex; + /// A reference to all the RPC interfaces that the server is handling. llvm::ArrayRef Buffers; @@ -98,9 +104,9 @@ struct RPCServerTy { /// Initialize the worker thread to run in the background. 
ServerThread(void *Buffers[], plugin::GenericDeviceTy *Devices[], - size_t Length) -: Running(false), NumUsers(0), CV(), Mutex(), Buffers(Buffers, Length), - Devices(Devices, Length) {} + size_t Length, std::mutex &BufferMutex) +: Running(false), NumUsers(0), CV(), Mutex(), BufferMutex(BufferMutex), + Buffers(Buffers, Length), Devices(Devices, Length) {} ~ServerThread() { assert(!Running && "Thread not shut down explicitly\n"); } diff --git a/offload/plugins-nextgen/common/src/RPC.cpp b/offload/plugins-nextgen/common/src/RPC.cpp index e6750a540b391..eb305736d6264 100644 --- a/offload/plugins-nextgen/common/src/RPC.cpp +++ b/offload/plugins-nextgen/common/src/RPC.cpp @@ -131,6 +131,7 @@ void RPCServerTy::ServerThread::run() { Lock.unlock(); while (NumUsers.load(std::memory_order_relaxed) > 0 && Running.load(std::memory_order_relaxed)) { + std::lock_guard Lock(BufferMutex); for (const auto &[Buffer, Device] : llvm::zip_equal(Buffers, Devices)) { if (!Buffer || !Device) continue; @@ -149,7 +150,7 @@ RPCServerTy::RPCServerTy(plugin::GenericPluginTy &Plugin) Devices(std::make_unique( Plugin.getNumDevices())), Thread(new ServerThread(Buffers.get(), Devices.get(), - Plugin.getNumDevices())) {} + Plugin.getNumDevices(), BufferMutex)) {} llvm::Error RPCServerTy::startThread() { Thread->startThread(); @@ -190,6 +191,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy &Device, if (auto Err = Device.dataSubmit(ClientGlobal.getPtr(), &client, sizeof(rpc::Client), nullptr)) return Err; + std::lock_guard Lock(BufferMutex); Buffers[Device.getDeviceId()] = RPCBuffer; Devices[Device.getDeviceId()] = &Device; @@ -197,6 +199,7 @@ Error RPCServerTy::initDevice(plugin::GenericDeviceTy &Device, } Error RPCServerTy::deinitDevice(plugin::GenericDeviceTy &Device) { + std::lock_guard Lock(BufferMutex); Device.free(Buffers[Device.getDeviceId()], TARGET_ALLOC_HOST); Buffers[Device.getDeviceId()] = nullptr; Devices[Device.getDeviceId()] = nullptr; ___ llvm-branch-commits mailing list 
llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
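The locking pattern the patch summary describes can be sketched in isolation. This is a minimal illustration, not the offload plugin code: `SlotTable`, `set`, `clear`, and `countLive` are hypothetical names, and a `std::vector<void *>` stands in for the buffer and device arrays. The point it shows is that the polling pass and every modification take the same mutex, so a slot cannot be cleared in the middle of a pass.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the server's per-device slots. This is not
// RPCServerTy; it only mirrors the "one mutex guards the arrays" idea.
class SlotTable {
public:
  explicit SlotTable(std::size_t N) : Slots(N, nullptr) {}

  // Writer side: publish a slot (compare initDevice in the patch).
  void set(std::size_t Id, void *Buffer) {
    std::lock_guard<std::mutex> Lock(Mutex);
    Slots[Id] = Buffer;
  }

  // Writer side: retire a slot (compare deinitDevice in the patch).
  void clear(std::size_t Id) {
    std::lock_guard<std::mutex> Lock(Mutex);
    Slots[Id] = nullptr;
  }

  // Reader side: one polling pass over all slots, skipping empty ones.
  // The lock is held for the whole pass, so no slot changes mid-pass.
  std::size_t countLive() {
    std::lock_guard<std::mutex> Lock(Mutex);
    std::size_t Live = 0;
    for (void *Slot : Slots)
      if (Slot)
        ++Live;
    return Live;
  }

private:
  std::mutex Mutex;
  std::vector<void *> Slots;
};
```

In the patch itself the worker thread holds a reference to the server's mutex rather than owning its own; the sketch folds both sides into one class only to keep it short.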
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
llvmbot wrote: @jplehr What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] release/20.x: [Offload] Properly guard modifications to the RPC device array (#126790) (PR #126795)
github-actions[bot] wrote: @jhuber6 (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/126795
[llvm-branch-commits] [llvm] [llvm] Extend CallSiteInfo with TypeId (PR #87574)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87574 >From 1d7ee612e408ee7e64e984eb08e6d7089a435d09 Mon Sep 17 00:00:00 2001 From: Necip Fazil Yildiran Date: Sun, 2 Feb 2025 00:58:49 + Subject: [PATCH] Simplify MIR test. Created using spr 1.3.6-beta.1 --- .../CodeGen/MIR/X86/call-site-info-typeid.mir | 21 ++- 1 file changed, 6 insertions(+), 15 deletions(-) diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir index 5ab797bfcc18f..a99ee50a608fb 100644 --- a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir +++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir @@ -8,11 +8,6 @@ # CHECK-NEXT: 123456789 } --- | - ; ModuleID = 'test.ll' - source_filename = "test.ll" - target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" - target triple = "x86_64-unknown-linux-gnu" - define dso_local void @foo(i8 signext %a) { entry: ret void @@ -21,10 +16,10 @@ define dso_local i32 @main() { entry: %retval = alloca i32, align 4 -%fp = alloca void (i8)*, align 8 -store i32 0, i32* %retval, align 4 -store void (i8)* @foo, void (i8)** %fp, align 8 -%0 = load void (i8)*, void (i8)** %fp, align 8 +%fp = alloca ptr, align 8 +store i32 0, ptr %retval, align 4 +store ptr @foo, ptr %fp, align 8 +%0 = load ptr, ptr %fp, align 8 call void %0(i8 signext 97) ret i32 0 } @@ -42,12 +37,8 @@ body: | name:main tracksRegLiveness: true stack: - - { id: 0, name: retval, type: default, offset: 0, size: 4, alignment: 4, - stack-id: default, callee-saved-register: '', callee-saved-restored: true, - debug-info-variable: '', debug-info-expression: '', debug-info-location: '' } - - { id: 1, name: fp, type: default, offset: 0, size: 8, alignment: 8, - stack-id: default, callee-saved-register: '', callee-saved-restored: true, - debug-info-variable: '', debug-info-expression: '', debug-info-location: '' } + - { id: 0, name: retval, size: 4, alignment: 4 } + - { id: 1, name: 
fp, size: 8, alignment: 8 } callSites: - { bb: 0, offset: 6, fwdArgRegs: [], typeId: 123456789 } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126767 >From 923d35bcf76529ba3afe736d160cb6be12b63e24 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Tue, 11 Feb 2025 01:52:13 -0500 Subject: [PATCH] Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) Fixes #126490 (cherry picked from commit 90192e8872cc90b4d292b180a49babf72d17e579) --- clang/lib/Sema/SemaInit.cpp | 4 +++- clang/test/SemaCXX/uninitialized.cpp | 34 +++- 2 files changed, 36 insertions(+), 2 deletions(-) diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp index 450edcb52ae15..37796758960cd 100644 --- a/clang/lib/Sema/SemaInit.cpp +++ b/clang/lib/Sema/SemaInit.cpp @@ -4576,7 +4576,9 @@ static void TryConstructorInitialization(Sema &S, if (!IsListInit && (Kind.getKind() == InitializationKind::IK_Default || Kind.getKind() == InitializationKind::IK_Direct) && -DestRecordDecl != nullptr && DestRecordDecl->isAggregate() && +DestRecordDecl != nullptr && +!(CtorDecl->isCopyOrMoveConstructor() && CtorDecl->isImplicit()) && +DestRecordDecl->isAggregate() && DestRecordDecl->hasUninitializedExplicitInitFields()) { S.Diag(Kind.getLocation(), diag::warn_field_requires_explicit_init) << /* Var-in-Record */ 1 << DestRecordDecl; diff --git a/clang/test/SemaCXX/uninitialized.cpp b/clang/test/SemaCXX/uninitialized.cpp index 7578b288d7b3f..4af2c998f082e 100644 --- a/clang/test/SemaCXX/uninitialized.cpp +++ b/clang/test/SemaCXX/uninitialized.cpp @@ -1542,9 +1542,15 @@ void aggregate() { }; }; + struct CopyAndMove { +CopyAndMove() = default; +CopyAndMove(const CopyAndMove &) {} +CopyAndMove(CopyAndMove &&) {} + }; struct Embed { int embed1; // #FIELD_EMBED1 int embed2 [[clang::require_explicit_initialization]]; // #FIELD_EMBED2 +CopyAndMove force_separate_move_ctor; }; struct EmbedDerived : Embed {}; struct F { @@ -1582,7 +1588,33 @@ void aggregate() { F("___"), F("") }; - (void)ctors; + + struct MoveOrCopy { +Embed e; 
+EmbedDerived ed; +F f; +// no-error +MoveOrCopy(const MoveOrCopy &c) : e(c.e), ed(c.ed), f(c.f) {} +// no-error +MoveOrCopy(MoveOrCopy &&c) +: e(std::move(c.e)), ed(std::move(c.ed)), f(std::move(c.f)) {} + }; + F copy1(ctors[0]); // no-error + (void)copy1; + F move1(std::move(ctors[0])); // no-error + (void)move1; + F copy2{ctors[0]}; // no-error + (void)copy2; + F move2{std::move(ctors[0])}; // no-error + (void)move2; + F copy3 = ctors[0]; // no-error + (void)copy3; + F move3 = std::move(ctors[0]); // no-error + (void)move3; + F copy4 = {ctors[0]}; // no-error + (void)copy4; + F move4 = {std::move(ctors[0])}; // no-error + (void)move4; S::foo(S{1, 2, 3, 4}); S::foo(S{.s1 = 100, .s4 = 100}); ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126767
[llvm-branch-commits] [clang] 923d35b - Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553)
Author: higher-performance Date: 2025-02-11T13:22:17-08:00 New Revision: 923d35bcf76529ba3afe736d160cb6be12b63e24 URL: https://github.com/llvm/llvm-project/commit/923d35bcf76529ba3afe736d160cb6be12b63e24 DIFF: https://github.com/llvm/llvm-project/commit/923d35bcf76529ba3afe736d160cb6be12b63e24.diff LOG: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) Fixes #126490 (cherry picked from commit 90192e8872cc90b4d292b180a49babf72d17e579) Added: Modified: clang/lib/Sema/SemaInit.cpp clang/test/SemaCXX/uninitialized.cpp Removed: diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp index 450edcb52ae15..37796758960cd 100644 --- a/clang/lib/Sema/SemaInit.cpp +++ b/clang/lib/Sema/SemaInit.cpp @@ -4576,7 +4576,9 @@ static void TryConstructorInitialization(Sema &S, if (!IsListInit && (Kind.getKind() == InitializationKind::IK_Default || Kind.getKind() == InitializationKind::IK_Direct) && -DestRecordDecl != nullptr && DestRecordDecl->isAggregate() && +DestRecordDecl != nullptr && +!(CtorDecl->isCopyOrMoveConstructor() && CtorDecl->isImplicit()) && +DestRecordDecl->isAggregate() && DestRecordDecl->hasUninitializedExplicitInitFields()) { S.Diag(Kind.getLocation(), diag::warn_field_requires_explicit_init) << /* Var-in-Record */ 1 << DestRecordDecl; diff --git a/clang/test/SemaCXX/uninitialized.cpp b/clang/test/SemaCXX/uninitialized.cpp index 7578b288d7b3f..4af2c998f082e 100644 --- a/clang/test/SemaCXX/uninitialized.cpp +++ b/clang/test/SemaCXX/uninitialized.cpp @@ -1542,9 +1542,15 @@ void aggregate() { }; }; + struct CopyAndMove { +CopyAndMove() = default; +CopyAndMove(const CopyAndMove &) {} +CopyAndMove(CopyAndMove &&) {} + }; struct Embed { int embed1; // #FIELD_EMBED1 int embed2 [[clang::require_explicit_initialization]]; // #FIELD_EMBED2 +CopyAndMove force_separate_move_ctor; }; struct EmbedDerived : Embed {}; struct F { @@ -1582,7 +1588,33 @@ void aggregate() { F("___"), F("") }; - (void)ctors; + + struct 
MoveOrCopy { +Embed e; +EmbedDerived ed; +F f; +// no-error +MoveOrCopy(const MoveOrCopy &c) : e(c.e), ed(c.ed), f(c.f) {} +// no-error +MoveOrCopy(MoveOrCopy &&c) +: e(std::move(c.e)), ed(std::move(c.ed)), f(std::move(c.f)) {} + }; + F copy1(ctors[0]); // no-error + (void)copy1; + F move1(std::move(ctors[0])); // no-error + (void)move1; + F copy2{ctors[0]}; // no-error + (void)copy2; + F move2{std::move(ctors[0])}; // no-error + (void)move2; + F copy3 = ctors[0]; // no-error + (void)copy3; + F move3 = std::move(ctors[0]); // no-error + (void)move3; + F copy4 = {ctors[0]}; // no-error + (void)copy4; + F move4 = {std::move(ctors[0])}; // no-error + (void)move4; S::foo(S{1, 2, 3, 4}); S::foo(S{.s1 = 100, .s4 = 100}); ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: Fix false positive of [[clang::require_explicit_initialization]] on copy/move constructors (#126553) (PR #126767)
github-actions[bot] wrote: @higher-performance (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/126767
[llvm-branch-commits] [llvm] [llvm] Add option to emit `callgraph` section (PR #87574)
https://github.com/Prabhuk edited https://github.com/llvm/llvm-project/pull/87574
[llvm-branch-commits] [llvm] Add option to emit call graph section (PR #87572)
https://github.com/Prabhuk closed https://github.com/llvm/llvm-project/pull/87572
[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126685
[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126685 >From ac97cff5a3684be98f4863191f0006cdf0fa89b4 Mon Sep 17 00:00:00 2001 From: Chuanqi Xu Date: Tue, 11 Feb 2025 14:08:47 +0800 Subject: [PATCH] [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope Close https://github.com/llvm/llvm-project/issues/126373 Although the root problems should be we shouldn't place the friend declaration to the incorrect module, let's avoid bleeding the edge by stoping diagnosing declarations not in file scope. (cherry picked from commit 569e94f8f1c3e6998860e2b2ff577870433bdac9) --- clang/lib/Serialization/ASTReaderDecl.cpp | 7 + clang/test/Modules/pr126373.cppm | 34 +++ 2 files changed, 41 insertions(+) create mode 100644 clang/test/Modules/pr126373.cppm diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp index 1aa94d5a22abe..8fbb0a8d3edd8 100644 --- a/clang/lib/Serialization/ASTReaderDecl.cpp +++ b/clang/lib/Serialization/ASTReaderDecl.cpp @@ -3751,6 +3751,13 @@ void ASTDeclReader::checkMultipleDefinitionInNamedModules(ASTReader &Reader, if (D->getFriendObjectKind() || Previous->getFriendObjectKind()) return; + // Skip diagnosing in-class declarations. 
+ if (!Previous->getLexicalDeclContext() + ->getNonTransparentContext() + ->isFileContext() || + !D->getLexicalDeclContext()->getNonTransparentContext()->isFileContext()) +return; + Module *M = Previous->getOwningModule(); if (!M) return; diff --git a/clang/test/Modules/pr126373.cppm b/clang/test/Modules/pr126373.cppm new file mode 100644 index 0..f176a587b51ce --- /dev/null +++ b/clang/test/Modules/pr126373.cppm @@ -0,0 +1,34 @@ +// RUN: rm -rf %t +// RUN: mkdir -p %t +// RUN: split-file %s %t +// +// RUN: %clang_cc1 -std=c++20 %t/module1.cppm -emit-module-interface -o %t/module1.pcm +// RUN: %clang_cc1 -std=c++20 -fmodule-file=module1=%t/module1.pcm %t/module2.cppm \ +// RUN: -emit-module-interface -o %t/module2.pcm +// RUN: %clang_cc1 -std=c++20 %t/module2.pcm -fmodule-file=module1=%t/module1.pcm \ +// RUN: -emit-llvm -o - | FileCheck %t/module2.cppm + +//--- test.h +template +struct Test { + template + friend class Test; +}; + +//--- module1.cppm +module; +#include "test.h" +export module module1; +export void f1(Test) {} + +//--- module2.cppm +module; +#include "test.h" +export module module2; +import module1; +export void f2(Test) {} + +extern "C" void func() {} + +// Fine enough to check the IR is emitted correctly. +// CHECK: define{{.*}}@func ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/20.x: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope (PR #126685)
github-actions[bot] wrote: @ChuanqiXu9 (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/126685
[llvm-branch-commits] [llvm] [llvm] Extract and propagate indirect call type id (PR #87575)
@@ -3631,6 +3631,12 @@ bool X86FastISel::fastLowerCall(CallLoweringInfo &CLI) {
   CLI.NumResultRegs = RVLocs.size();
   CLI.Call = MIB;

+  // Add call site info for call graph section.
+  if (TM.Options.EmitCallGraphSection && CB && CB->isIndirectCall()) {
+    MachineFunction::CallSiteInfo CSInfo(*CB);
+    MF->addCallSiteInfo(CLI.Call, std::move(CSInfo));

Prabhuk wrote:

The only other location that doesn't use std::move is in `llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp`, where the temporary returned from `DAG->getCallSiteInfo(Node)` is passed directly to `addCallSiteInfo`. Adding std::move there would prevent copy elision.

https://github.com/llvm/llvm-project/pull/87575
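The general C++ rule behind this comment can be checked with a small counting type. This is a sketch under stated assumptions: `Info`, `addInfo`, and `makeInfo` are hypothetical stand-ins, not the LLVM API, and the sink is assumed to take its argument by value. If the real parameter is an rvalue reference, a temporary binds directly in both spellings and no extra move constructor runs.

```cpp
#include <cassert>
#include <utility>

// Global counters so the effect of each call can be observed.
int Copies = 0;
int Moves = 0;

// Hypothetical payload type; it only records how it was constructed.
struct Info {
  Info() = default;
  Info(const Info &) { ++Copies; }
  Info(Info &&) noexcept { ++Moves; }
};

// Hypothetical sink that takes its argument by value (an assumption,
// not the signature of MachineFunction::addCallSiteInfo).
void addInfo(Info I) { (void)I; }

// Returns a temporary (prvalue), like a getter that builds its result.
Info makeInfo() { return Info(); }
```

Under C++17, `addInfo(makeInfo())` constructs the parameter in place and leaves both counters at zero, whereas `addInfo(std::move(makeInfo()))` forces one move construction. For a named local, `addInfo(std::move(Local))` is still the right spelling, since it replaces a copy with a move. Compilers may flag the second form with a pessimizing-move warning.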
[llvm-branch-commits] [clang] f33b128 - [AVX10.2] Fix wrong intrinsic names after rename (#126390)
Author: Mikołaj Piróg Date: 2025-02-11T14:09:54-08:00 New Revision: f33b128b3dc147a973cef55222549345b3201ad5 URL: https://github.com/llvm/llvm-project/commit/f33b128b3dc147a973cef55222549345b3201ad5 DIFF: https://github.com/llvm/llvm-project/commit/f33b128b3dc147a973cef55222549345b3201ad5.diff LOG: [AVX10.2] Fix wrong intrinsic names after rename (#126390) In my previous PR (#123656) to update the names of AVX10.2 intrinsics and mnemonics, I have erroneously deleted `_ph` from few intrinsics. This PR corrects this. (cherry picked from commit 161cfc6f39bef8994eb944687033ebd3570196e8) Added: Modified: clang/lib/Headers/avx10_2_512convertintrin.h clang/lib/Headers/avx10_2convertintrin.h clang/test/CodeGen/X86/avx10_2_512convert-builtins.c clang/test/CodeGen/X86/avx10_2convert-builtins.c Removed: diff --git a/clang/lib/Headers/avx10_2_512convertintrin.h b/clang/lib/Headers/avx10_2_512convertintrin.h index 0b5fca5cda522..516ccc68672d6 100644 --- a/clang/lib/Headers/avx10_2_512convertintrin.h +++ b/clang/lib/Headers/avx10_2_512convertintrin.h @@ -213,19 +213,19 @@ _mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, __m512h __B) { (__v64qi)(__m512i)_mm512_setzero_si512()); } -static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8(__m256i __A) { +static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8_ph(__m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, (__v32hf)(__m512h)_mm512_undefined_ph(), (__mmask32)-1); } static __inline__ __m512h __DEFAULT_FN_ATTRS512 -_mm512_mask_cvthf8(__m512h __W, __mmask32 __U, __m256i __A) { +_mm512_mask_cvthf8_ph(__m512h __W, __mmask32 __U, __m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, (__v32hf)(__m512h)__W, (__mmask32)__U); } static __inline__ __m512h __DEFAULT_FN_ATTRS512 -_mm512_maskz_cvthf8(__mmask32 __U, __m256i __A) { +_mm512_maskz_cvthf8_ph(__mmask32 __U, __m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, 
(__v32hf)(__m512h)_mm512_setzero_ph(), (__mmask32)__U); } diff --git a/clang/lib/Headers/avx10_2convertintrin.h b/clang/lib/Headers/avx10_2convertintrin.h index 79d9def2207b8..07722090c30ee 100644 --- a/clang/lib/Headers/avx10_2convertintrin.h +++ b/clang/lib/Headers/avx10_2convertintrin.h @@ -381,37 +381,36 @@ _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { (__v32qi)(__m256i)_mm256_setzero_si256()); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8(__m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8_ph(__m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)_mm_undefined_ph(), (__mmask8)-1); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_mask_cvthf8(__m128h __W, -__mmask8 __U, -__m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 +_mm_mask_cvthf8_ph(__m128h __W, __mmask8 __U, __m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)__W, (__mmask8)__U); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_maskz_cvthf8(__mmask8 __U, - __m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 +_mm_maskz_cvthf8_ph(__mmask8 __U, __m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)_mm_setzero_ph(), (__mmask8)__U); } -static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8(__m128i __A) { +static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8_ph(__m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)_mm256_undefined_ph(), (__mmask16)-1); } static __inline__ __m256h __DEFAULT_FN_ATTRS256 -_mm256_mask_cvthf8(__m256h __W, __mmask16 __U, __m128i __A) { +_mm256_mask_cvthf8_ph(__m256h __W, __mmask16 __U, __m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)__W, (__mmask16)__U); } static __inline__ __m256h __DEFAULT_FN_ATTRS256 -_mm256_maskz_cvthf8(__mmask16 __U, 
__m128i __A) { +_mm256_maskz_cvthf8_ph(__mmask16 __U, __m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)_mm256_setzero_ph(), (__mmask16)__U); } diff --git a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c index 22503c640a727..dcf7bbc005a7c 100644 --- a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c +++ b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c @@ -201,22 +201,22 @@ __m512i test_mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, __m512
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126687 >From f33b128b3dc147a973cef55222549345b3201ad5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Miko=C5=82aj=20Pir=C3=B3g?= Date: Mon, 10 Feb 2025 05:48:02 +0100 Subject: [PATCH] [AVX10.2] Fix wrong intrinsic names after rename (#126390) In my previous PR (#123656) to update the names of AVX10.2 intrinsics and mnemonics, I have erroneously deleted `_ph` from few intrinsics. This PR corrects this. (cherry picked from commit 161cfc6f39bef8994eb944687033ebd3570196e8) --- clang/lib/Headers/avx10_2_512convertintrin.h | 6 ++-- clang/lib/Headers/avx10_2convertintrin.h | 17 + .../CodeGen/X86/avx10_2_512convert-builtins.c | 18 +- .../CodeGen/X86/avx10_2convert-builtins.c | 36 +-- 4 files changed, 38 insertions(+), 39 deletions(-) diff --git a/clang/lib/Headers/avx10_2_512convertintrin.h b/clang/lib/Headers/avx10_2_512convertintrin.h index 0b5fca5cda522..516ccc68672d6 100644 --- a/clang/lib/Headers/avx10_2_512convertintrin.h +++ b/clang/lib/Headers/avx10_2_512convertintrin.h @@ -213,19 +213,19 @@ _mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, __m512h __B) { (__v64qi)(__m512i)_mm512_setzero_si512()); } -static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8(__m256i __A) { +static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_cvthf8_ph(__m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, (__v32hf)(__m512h)_mm512_undefined_ph(), (__mmask32)-1); } static __inline__ __m512h __DEFAULT_FN_ATTRS512 -_mm512_mask_cvthf8(__m512h __W, __mmask32 __U, __m256i __A) { +_mm512_mask_cvthf8_ph(__m512h __W, __mmask32 __U, __m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, (__v32hf)(__m512h)__W, (__mmask32)__U); } static __inline__ __m512h __DEFAULT_FN_ATTRS512 -_mm512_maskz_cvthf8(__mmask32 __U, __m256i __A) { +_mm512_maskz_cvthf8_ph(__mmask32 __U, __m256i __A) { return (__m512h)__builtin_ia32_vcvthf8_2ph512_mask( (__v32qi)__A, 
(__v32hf)(__m512h)_mm512_setzero_ph(), (__mmask32)__U); } diff --git a/clang/lib/Headers/avx10_2convertintrin.h b/clang/lib/Headers/avx10_2convertintrin.h index 79d9def2207b8..07722090c30ee 100644 --- a/clang/lib/Headers/avx10_2convertintrin.h +++ b/clang/lib/Headers/avx10_2convertintrin.h @@ -381,37 +381,36 @@ _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { (__v32qi)(__m256i)_mm256_setzero_si256()); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8(__m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_cvthf8_ph(__m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)_mm_undefined_ph(), (__mmask8)-1); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_mask_cvthf8(__m128h __W, -__mmask8 __U, -__m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 +_mm_mask_cvthf8_ph(__m128h __W, __mmask8 __U, __m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)__W, (__mmask8)__U); } -static __inline__ __m128h __DEFAULT_FN_ATTRS128 _mm_maskz_cvthf8(__mmask8 __U, - __m128i __A) { +static __inline__ __m128h __DEFAULT_FN_ATTRS128 +_mm_maskz_cvthf8_ph(__mmask8 __U, __m128i __A) { return (__m128h)__builtin_ia32_vcvthf8_2ph128_mask( (__v16qi)__A, (__v8hf)(__m128h)_mm_setzero_ph(), (__mmask8)__U); } -static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8(__m128i __A) { +static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_cvthf8_ph(__m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)_mm256_undefined_ph(), (__mmask16)-1); } static __inline__ __m256h __DEFAULT_FN_ATTRS256 -_mm256_mask_cvthf8(__m256h __W, __mmask16 __U, __m128i __A) { +_mm256_mask_cvthf8_ph(__m256h __W, __mmask16 __U, __m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)__W, (__mmask16)__U); } static __inline__ __m256h __DEFAULT_FN_ATTRS256 -_mm256_maskz_cvthf8(__mmask16 __U, 
__m128i __A) { +_mm256_maskz_cvthf8_ph(__mmask16 __U, __m128i __A) { return (__m256h)__builtin_ia32_vcvthf8_2ph256_mask( (__v16qi)__A, (__v16hf)(__m256h)_mm256_setzero_ph(), (__mmask16)__U); } diff --git a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c index 22503c640a727..dcf7bbc005a7c 100644 --- a/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c +++ b/clang/test/CodeGen/X86/avx10_2_512convert-builtins.c @@ -201,22 +201,22 @@ __m512i test_mm512_maskz_cvts2ph_hf8(__mmask64 __U, __m512h __A, __m512h __B) { return _mm512_maskz_cvts2ph_hf8(__U, __A, __B); } -__m
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)
github-actions[bot] wrote: @phoebewang (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/12 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126492 >From 9bbf3a98b793f8fc6269a20a026ca6fe029a1790 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Fri, 7 Feb 2025 12:41:06 +0100 Subject: [PATCH 1/2] [IndVars] Add test for #126012 (NFC) (cherry picked from commit ae08969a2068dd327fbf4d0f606550574fbb9e45) --- .../Transforms/IndVarSimplify/pr126012.ll | 49 +++ 1 file changed, 49 insertions(+) create mode 100644 llvm/test/Transforms/IndVarSimplify/pr126012.ll diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll new file mode 100644 index 0..725ea89b8e651 --- /dev/null +++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll @@ -0,0 +1,49 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt -S -passes=indvars < %s | FileCheck %s + +; FIXME: This is a miscompile. +define i32 @test() { +; CHECK-LABEL: define i32 @test() { +; CHECK-NEXT: [[ENTRY:.*]]: +; CHECK-NEXT:br label %[[FOR_PREHEADER:.*]] +; CHECK: [[FOR_PREHEADER]]: +; CHECK-NEXT:[[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ] +; CHECK-NEXT:[[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ] +; CHECK-NEXT:[[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0 +; CHECK-NEXT:br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]] +; CHECK: [[FOR_END]]: +; CHECK-NEXT:[[EXT:%.*]] = zext i1 true to i32 +; CHECK-NEXT:br label %[[FOR_INC]] +; CHECK: [[FOR_INC]]: +; CHECK-NEXT:[[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ] +; CHECK-NEXT:[[INC]] = add nuw nsw i32 [[INDVAR3]], 1 +; CHECK-NEXT:[[EXITCOND:%.*]] = icmp eq i32 [[INDVAR3]], 2 +; CHECK-NEXT:br i1 [[EXITCOND]], label %[[FOR_EXIT:.*]], label %[[FOR_PREHEADER]] +; CHECK: [[FOR_EXIT]]: +; CHECK-NEXT:[[INDVAR1_LCSSA:%.*]] = phi i32 [ [[INDVAR1]], %[[FOR_INC]] ] +; CHECK-NEXT:ret i32 [[INDVAR1_LCSSA]] +; +entry: + br label %for.preheader + +for.preheader: + 
%indvar1 = phi i32 [ 0, %entry ], [ %phi, %for.inc ] + %indvar2 = phi i32 [ 1, %entry ], [ %indvar3, %for.inc ] + %indvar3 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] + %cond1 = icmp eq i32 %indvar3, 0 + br i1 %cond1, label %for.inc, label %for.end + +for.end: + %cmp = icmp sgt i32 %indvar2, 0 + %ext = zext i1 %cmp to i32 + br label %for.inc + +for.inc: + %phi = phi i32 [ %ext, %for.end ], [ 0, %for.preheader ] + %inc = add i32 %indvar3, 1 + %exitcond = icmp eq i32 %indvar3, 2 + br i1 %exitcond, label %for.exit, label %for.preheader + +for.exit: + ret i32 %indvar1 +} >From af970cd8753c37e7fcf66b6211f2a2d1e261325c Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Mon, 10 Feb 2025 10:07:21 +0100 Subject: [PATCH 2/2] [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) The code already guards against values coming from a previous iteration using properlyDominates(). However, addrecs are considered to properly dominate the loop they are defined in. Handle this special case separately, by checking for expressions that have computable loop evolution (this should cover cases like a zext of an addrec as well). I considered changing the definition of properlyDominates() instead, but decided against it. The current definition is useful in other contexts, e.g. when deciding whether an expression is safe to expand in a given block. Fixes https://github.com/llvm/llvm-project/issues/126012. (cherry picked from commit 7aed53eb1982113e825534f0f66d0a0e46e7a5ed) --- llvm/lib/Analysis/ScalarEvolution.cpp | 6 ++ llvm/test/Transforms/IndVarSimplify/pr126012.ll | 10 +++--- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp index 2ce40877b523e..c71202c8dd58e 100644 --- a/llvm/lib/Analysis/ScalarEvolution.cpp +++ b/llvm/lib/Analysis/ScalarEvolution.cpp @@ -12402,6 +12402,12 @@ bool ScalarEvolution::isImpliedViaMerge(CmpPredicate Pred, const SCEV *LHS, // iteration of a loop. 
if (!properlyDominates(L, LBB)) return false; + // Addrecs are considered to properly dominate their loop, so are missed + // by the previous check. Discard any values that have computable + // evolution in this loop. + if (auto *Loop = LI.getLoopFor(LBB)) +if (hasComputableLoopEvolution(L, Loop)) + return false; if (!ProvedEasily(L, RHS)) return false; } diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll index 725ea89b8e651..5189fe020dd3b 100644 --- a/llvm/test/Transforms/IndVarSimplify/pr126012.ll +++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll @@ -1,18 +1,22 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 ; RUN: opt
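The loop in pr126012.ll is small enough to trace by hand, which shows why the folded compare in the first patch's CHECK lines is a miscompile. What follows is a minimal C++ sketch of that trace, not code from the patch. `run_pr126012` and its `fold_compare` flag are hypothetical names; the flag models the `zext i1 true` in the FIXME'd output, assuming that reading of the CHECK lines is correct.

```cpp
#include <cstdint>

// Hand translation of @test from pr126012.ll (illustrative only).
// fold_compare == true models the miscompiled output, where
// `icmp sgt i32 %indvar2, 0` has been replaced by the constant `true`.
std::int32_t run_pr126012(bool fold_compare) {
  std::int32_t indvar1 = 0; // phi [ 0, %entry ], [ %phi, %for.inc ]
  std::int32_t indvar2 = 1; // phi [ 1, %entry ], [ %indvar3, %for.inc ]
  std::int32_t indvar3 = 0; // phi [ 0, %entry ], [ %inc, %for.inc ]
  for (;;) {
    std::int32_t phi = 0;   // value when arriving from %for.preheader
    if (indvar3 != 0) {     // %cond1 is false, so control goes to for.end
      bool cmp = fold_compare ? true : (indvar2 > 0);
      phi = cmp ? 1 : 0;    // %ext = zext i1 %cmp to i32
    }
    std::int32_t inc = indvar3 + 1;
    if (indvar3 == 2)       // %exitcond
      return indvar1;       // for.exit: ret i32 %indvar1
    // Back edge: all three phis take their values from this iteration.
    indvar1 = phi;
    indvar2 = indvar3;
    indvar3 = inc;
  }
}
```

With `fold_compare` set to false the function returns 0, because `%indvar2` is 0 on the second iteration and the compare is false there. With the compare folded to true it returns 1. `%indvar2` carries the previous iteration's `%indvar3`, so its value on entry says nothing about later iterations.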
[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126479

>From bc87f9b80946dfe651d953c2fb4967ea32277a34 Mon Sep 17 00:00:00 2001
From: Owen Pan
Date: Sat, 8 Feb 2025 23:22:33 -0800
Subject: [PATCH] [clang-format] Handle C-style cast of member function pointer type (#126340)

Fixes #125012.

(cherry picked from commit 8d373ceaec1f1b27c9e682cfaf71aae19ea48d98)
---
 clang/lib/Format/TokenAnnotator.cpp           | 7 +--
 clang/unittests/Format/TokenAnnotatorTest.cpp | 6 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index a172df5291ae6..4246ade6e19be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -477,8 +477,9 @@ class AnnotatingParser {
     FormatToken *PossibleObjCForInToken = nullptr;
     while (CurrentToken) {
       const auto &Prev = *CurrentToken->Previous;
+      const auto *PrevPrev = Prev.Previous;
       if (Prev.is(TT_PointerOrReference) &&
-          Prev.Previous->isOneOf(tok::l_paren, tok::coloncolon)) {
+          PrevPrev->isOneOf(tok::l_paren, tok::coloncolon)) {
         ProbablyFunctionType = true;
       }
       if (CurrentToken->is(tok::comma))
@@ -486,8 +487,10 @@ class AnnotatingParser {
       if (Prev.is(TT_BinaryOperator))
         Contexts.back().IsExpression = true;
       if (CurrentToken->is(tok::r_paren)) {
-        if (Prev.is(TT_PointerOrReference) && Prev.Previous == &OpeningParen)
+        if (Prev.is(TT_PointerOrReference) &&
+            (PrevPrev == &OpeningParen || PrevPrev->is(tok::coloncolon))) {
           MightBeFunctionType = true;
+        }
         if (OpeningParen.isNot(TT_CppCastLParen) && MightBeFunctionType &&
             ProbablyFunctionType && CurrentToken->Next &&
             (CurrentToken->Next->is(tok::l_paren) ||
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index fc77e277947c5..2147a1b950dd1 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -845,6 +845,12 @@ TEST_F(TokenAnnotatorTest, UnderstandsCasts) {
   EXPECT_TOKEN(Tokens[14], tok::r_paren, TT_CastRParen);
   EXPECT_TOKEN(Tokens[15], tok::amp, TT_UnaryOperator);
 
+  Tokens = annotate("return (Foo (Bar::*)())&Bar::foo;");
+  ASSERT_EQ(Tokens.size(), 17u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::l_paren, TT_FunctionTypeLParen);
+  EXPECT_TOKEN(Tokens[10], tok::r_paren, TT_CastRParen);
+  EXPECT_TOKEN(Tokens[11], tok::amp, TT_UnaryOperator);
+
   auto Style = getLLVMStyle();
   Style.TypeNames.push_back("Foo");
   Tokens = annotate("#define FOO(bar) foo((Foo)&bar)", Style);
[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126479
[llvm-branch-commits] [clang] bc87f9b - [clang-format] Handle C-style cast of member function pointer type (#126340)
Author: Owen Pan
Date: 2025-02-11T14:20:03-08:00
New Revision: bc87f9b80946dfe651d953c2fb4967ea32277a34

URL: https://github.com/llvm/llvm-project/commit/bc87f9b80946dfe651d953c2fb4967ea32277a34
DIFF: https://github.com/llvm/llvm-project/commit/bc87f9b80946dfe651d953c2fb4967ea32277a34.diff

LOG: [clang-format] Handle C-style cast of member function pointer type (#126340)

Fixes #125012.

(cherry picked from commit 8d373ceaec1f1b27c9e682cfaf71aae19ea48d98)

Added:

Modified:
    clang/lib/Format/TokenAnnotator.cpp
    clang/unittests/Format/TokenAnnotatorTest.cpp

Removed:

diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index a172df5291ae6..4246ade6e19be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -477,8 +477,9 @@ class AnnotatingParser {
     FormatToken *PossibleObjCForInToken = nullptr;
     while (CurrentToken) {
       const auto &Prev = *CurrentToken->Previous;
+      const auto *PrevPrev = Prev.Previous;
       if (Prev.is(TT_PointerOrReference) &&
-          Prev.Previous->isOneOf(tok::l_paren, tok::coloncolon)) {
+          PrevPrev->isOneOf(tok::l_paren, tok::coloncolon)) {
         ProbablyFunctionType = true;
       }
       if (CurrentToken->is(tok::comma))
@@ -486,8 +487,10 @@ class AnnotatingParser {
       if (Prev.is(TT_BinaryOperator))
         Contexts.back().IsExpression = true;
       if (CurrentToken->is(tok::r_paren)) {
-        if (Prev.is(TT_PointerOrReference) && Prev.Previous == &OpeningParen)
+        if (Prev.is(TT_PointerOrReference) &&
+            (PrevPrev == &OpeningParen || PrevPrev->is(tok::coloncolon))) {
           MightBeFunctionType = true;
+        }
         if (OpeningParen.isNot(TT_CppCastLParen) && MightBeFunctionType &&
             ProbablyFunctionType && CurrentToken->Next &&
             (CurrentToken->Next->is(tok::l_paren) ||
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index fc77e277947c5..2147a1b950dd1 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -845,6 +845,12 @@ TEST_F(TokenAnnotatorTest, UnderstandsCasts) {
   EXPECT_TOKEN(Tokens[14], tok::r_paren, TT_CastRParen);
   EXPECT_TOKEN(Tokens[15], tok::amp, TT_UnaryOperator);
 
+  Tokens = annotate("return (Foo (Bar::*)())&Bar::foo;");
+  ASSERT_EQ(Tokens.size(), 17u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::l_paren, TT_FunctionTypeLParen);
+  EXPECT_TOKEN(Tokens[10], tok::r_paren, TT_CastRParen);
+  EXPECT_TOKEN(Tokens[11], tok::amp, TT_UnaryOperator);
+
   auto Style = getLLVMStyle();
   Style.TypeNames.push_back("Foo");
   Tokens = annotate("#define FOO(bar) foo((Foo)&bar)", Style);
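For readers unfamiliar with the expression used in the new unit test, `(Foo (Bar::*)())&Bar::foo` is a C-style cast of `&Bar::foo` to a pointer-to-member-function type. The sketch below shows one situation where such a cast is useful, namely choosing between overloads. It is illustrative only: `Foo`, `Bar`, `call_nullary`, and `call_unary` are hypothetical definitions, not code from the patch or from clang-format.

```cpp
// Hypothetical types; only the shape of the cast matches the unit test.
struct Foo {
  int value;
};

struct Bar {
  Foo foo() { return Foo{1}; }      // zero-argument overload
  Foo foo(int x) { return Foo{x}; } // one-argument overload
};

Foo call_nullary(Bar &b) {
  // The target type `Foo (Bar::*)()` picks the zero-argument overload.
  auto pmf = (Foo (Bar::*)())&Bar::foo;
  return (b.*pmf)();
}

Foo call_unary(Bar &b, int x) {
  // The same pattern with a different parameter list picks the other one.
  auto pmf = (Foo (Bar::*)(int))&Bar::foo;
  return (b.*pmf)(x);
}
```

The test expects the first `(` after `Foo` to be annotated as `TT_FunctionTypeLParen`, the closing `)` of the cast as `TT_CastRParen`, and the following `&` as a unary operator.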
[llvm-branch-commits] [clang] release/20.x: [clang-format] Handle C-style cast of member function pointer type (#126340) (PR #126479)
github-actions[bot] wrote: @owenca (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/126479
[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126607
[llvm-branch-commits] [llvm] d43a971 - release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (#126607)
Author: Amr Hesham
Date: 2025-02-11T14:22:01-08:00
New Revision: d43a97163c43d3cfbfc7c11287aea2233bc7ffb4

URL: https://github.com/llvm/llvm-project/commit/d43a97163c43d3cfbfc7c11287aea2233bc7ffb4
DIFF: https://github.com/llvm/llvm-project/commit/d43a97163c43d3cfbfc7c11287aea2233bc7ffb4.diff

LOG: release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (#126607)

Add release note for llvm-objcopy fixing prints wrong path when dump-section output path doesn't exist in #125345

Added:

Modified:
    llvm/docs/ReleaseNotes.md

Removed:

diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 44a0b17d6a07b..28908490b8f7c 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -460,6 +460,8 @@ Changes to the LLVM tools
   `--localize-symbol`, `--localize-symbols`,
   `--skip-symbol`, `--skip-symbols`.
+* llvm-objcopy now prints the correct file path in the error message when the output file specified by `--dump-section` cannot be opened.
+
 Changes to LLDB
  -
[llvm-branch-commits] [clang] ac97cff - [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope
Author: Chuanqi Xu
Date: 2025-02-11T14:05:14-08:00
New Revision: ac97cff5a3684be98f4863191f0006cdf0fa89b4

URL: https://github.com/llvm/llvm-project/commit/ac97cff5a3684be98f4863191f0006cdf0fa89b4
DIFF: https://github.com/llvm/llvm-project/commit/ac97cff5a3684be98f4863191f0006cdf0fa89b4.diff

LOG: [C++20] [Modules] Don't diagnose duplicated declarations in different modules which is not in file scope

Close https://github.com/llvm/llvm-project/issues/126373

Although the root problem is that we shouldn't place the friend declaration in the incorrect module, let's avoid bleeding the edge by no longer diagnosing declarations that are not in file scope.

(cherry picked from commit 569e94f8f1c3e6998860e2b2ff577870433bdac9)

Added:
    clang/test/Modules/pr126373.cppm

Modified:
    clang/lib/Serialization/ASTReaderDecl.cpp

Removed:

diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp
index 1aa94d5a22abe..8fbb0a8d3edd8 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -3751,6 +3751,13 @@ void ASTDeclReader::checkMultipleDefinitionInNamedModules(ASTReader &Reader,
   if (D->getFriendObjectKind() || Previous->getFriendObjectKind())
     return;
 
+  // Skip diagnosing in-class declarations.
+  if (!Previous->getLexicalDeclContext()
+           ->getNonTransparentContext()
+           ->isFileContext() ||
+      !D->getLexicalDeclContext()->getNonTransparentContext()->isFileContext())
+    return;
+
   Module *M = Previous->getOwningModule();
   if (!M)
     return;
diff --git a/clang/test/Modules/pr126373.cppm b/clang/test/Modules/pr126373.cppm
new file mode 100644
index 0..f176a587b51ce
--- /dev/null
+++ b/clang/test/Modules/pr126373.cppm
@@ -0,0 +1,34 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/module1.cppm -emit-module-interface -o %t/module1.pcm
+// RUN: %clang_cc1 -std=c++20 -fmodule-file=module1=%t/module1.pcm %t/module2.cppm \
+// RUN: -emit-module-interface -o %t/module2.pcm
+// RUN: %clang_cc1 -std=c++20 %t/module2.pcm -fmodule-file=module1=%t/module1.pcm \
+// RUN: -emit-llvm -o - | FileCheck %t/module2.cppm
+
+//--- test.h
+template
+struct Test {
+ template
+ friend class Test;
+};
+
+//--- module1.cppm
+module;
+#include "test.h"
+export module module1;
+export void f1(Test) {}
+
+//--- module2.cppm
+module;
+#include "test.h"
+export module module2;
+import module1;
+export void f2(Test) {}
+
+extern "C" void func() {}
+
+// Fine enough to check the IR is emitted correctly.
+// CHECK: define{{.*}}@func
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/12 >From 1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f Mon Sep 17 00:00:00 2001 From: Mikołaj Piróg Date: Tue, 11 Feb 2025 06:13:36 +0100 Subject: [PATCH] [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) Found during work on #120927. This caused the compiler to silently ignore half of the mask in the specific intrinsics. (cherry picked from commit af522c5dd3a38cc5e11e8e62009d7dbe2cde2d86) --- clang/lib/Headers/avx10_2convertintrin.h | 16 clang/test/CodeGen/X86/avx10_2convert-builtins.c | 16 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/clang/lib/Headers/avx10_2convertintrin.h b/clang/lib/Headers/avx10_2convertintrin.h index c67a5b890f195..79d9def2207b8 100644 --- a/clang/lib/Headers/avx10_2convertintrin.h +++ b/clang/lib/Headers/avx10_2convertintrin.h @@ -260,13 +260,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_bf8(__m256h __A, static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvt2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -297,13 +297,13 @@ _mm256_cvts2ph_bf8(__m256h __A, __m256h __B) { static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvts2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, 
__B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvts2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -334,13 +334,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_hf8(__m256h __A, static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvt2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvt2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -371,13 +371,13 @@ _mm256_cvts2ph_hf8(__m256h __A, __m256h __B) { static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvts2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } diff --git a/clang/test/CodeGen/X86/avx10_2convert-builtins.c b/clang/test/CodeGen/X86/avx10_2convert-builtins.c index efd9a31c40875..e5e6f867e119e 100644 --- a/clang/test/CodeGen/X86/avx10_2convert-builtins.c +++ 
b/clang/test/CodeGen/X86/avx10_2convert-builtins.c @@ -231,7 +231,7 @@ __m256i test_mm256_cvt2ph_bf8(__m256h __A, __m256h __B) { return _mm256_cvt2ph_bf8(__A, __B); } -__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m256h __B) { +__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { // CHECK-LABEL: @test_mm256_mask_cvt2ph_bf8( // CHECK: call <32 x i8> @llvm.x86.avx10.vcvt2ph2bf8256( // CHECK: select <32 x i1> %{{.*}}, <32 x i8> %{{.*}}, <32 x i8> %{{.*}} @@ -239,7 +239,7 @@ __m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m2 return _mm256_mask_cvt2ph_bf8(__W, __U, __A, __B); } -__m256i test_mm256_maskz_cvt2ph_bf8(__mmask16 __U, __m256h __A, __m25
[llvm-branch-commits] [clang] 1c36697 - [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627)
Author: Mikołaj Piróg Date: 2025-02-11T14:07:33-08:00 New Revision: 1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f URL: https://github.com/llvm/llvm-project/commit/1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f DIFF: https://github.com/llvm/llvm-project/commit/1c36697fbb554b49b00bd2e9bd842ffcb73d9a0f.diff LOG: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) Found during work on #120927. This caused the compiler to silently ignore half of the mask in the specific intrinsics. (cherry picked from commit af522c5dd3a38cc5e11e8e62009d7dbe2cde2d86) Added: Modified: clang/lib/Headers/avx10_2convertintrin.h clang/test/CodeGen/X86/avx10_2convert-builtins.c Removed: diff --git a/clang/lib/Headers/avx10_2convertintrin.h b/clang/lib/Headers/avx10_2convertintrin.h index c67a5b890f195..79d9def2207b8 100644 --- a/clang/lib/Headers/avx10_2convertintrin.h +++ b/clang/lib/Headers/avx10_2convertintrin.h @@ -260,13 +260,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_bf8(__m256h __A, static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvt2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_bf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -297,13 +297,13 @@ _mm256_cvts2ph_bf8(__m256h __A, __m256h __B) { static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvts2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, 
(__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvts2ph_bf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_bf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -334,13 +334,13 @@ static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_cvt2ph_hf8(__m256h __A, static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvt2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvt2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvt2ph_hf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } @@ -371,13 +371,13 @@ _mm256_cvts2ph_hf8(__m256h __A, __m256h __B) { static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_mask_cvts2ph_hf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W); + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)__W); } static __inline__ __m256i __DEFAULT_FN_ATTRS256 _mm256_maskz_cvts2ph_hf8(__mmask32 __U, __m256h __A, __m256h __B) { return (__m256i)__builtin_ia32_selectb_256( - (__mmask16)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), + (__mmask32)__U, (__v32qi)_mm256_cvts2ph_hf8(__A, __B), (__v32qi)(__m256i)_mm256_setzero_si256()); } diff --git a/clang/test/CodeGen/X86/avx10_2convert-builtins.c b/clang/test/CodeGen/X86/avx10_2convert-builtins.c index efd9a31c40875..e5e6f867e119e 100644 --- 
a/clang/test/CodeGen/X86/avx10_2convert-builtins.c +++ b/clang/test/CodeGen/X86/avx10_2convert-builtins.c @@ -231,7 +231,7 @@ __m256i test_mm256_cvt2ph_bf8(__m256h __A, __m256h __B) { return _mm256_cvt2ph_bf8(__A, __B); } -__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m256h __B) { +__m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask32 __U, __m256h __A, __m256h __B) { // CHECK-LABEL: @test_mm256_mask_cvt2ph_bf8( // CHECK: call <32 x i8> @llvm.x86.avx10.vcvt2ph2bf8256( // CHECK: select <32 x i1> %{{.*}}, <32 x i8> %{{.*}}, <32 x i8> %{{.*}} @@ -239,7 +239,7 @@ __m256i test_mm256_mask_cvt2ph_bf8(__m256i __W, __mmask16 __U, __m256h __A, __m2 return _mm256_mask_cvt2ph_bf8(__W, __U, __A, __B); }
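The effect of the `(__mmask16)` cast on a 32-lane select can be shown with plain integers. The sketch below is only an illustration and does not use the real intrinsics: `mask16`, `mask32`, `count_lanes`, `lanes_with_old_cast`, and `lanes_with_fixed_cast` are hypothetical names, and the 16-bit and 32-bit widths are assumed from the typedef names.

```cpp
#include <cstdint>

// Stand-ins for the mask typedefs; the widths are an assumption.
using mask16 = std::uint16_t;
using mask32 = std::uint32_t;

// Number of byte lanes a 32-lane select would take from its first operand.
int count_lanes(mask32 mask) {
  int n = 0;
  for (; mask != 0; mask >>= 1)
    n += static_cast<int>(mask & 1u);
  return n;
}

// Before the fix: the 32-bit mask is narrowed to 16 bits first, so bits
// 16..31 are discarded and the upper 16 lanes are never selected.
int lanes_with_old_cast(mask32 u) {
  return count_lanes(static_cast<mask16>(u));
}

// After the fix: all 32 mask bits reach the select.
int lanes_with_fixed_cast(mask32 u) {
  return count_lanes(static_cast<mask32>(u));
}
```

A mask of `0xFFFF0000` selects no lanes through the old cast and 16 lanes through the fixed one. The narrowing is a legal implicit-width conversion, so nothing in the old header would have been diagnosed.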
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627) (PR #126666)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/12
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)
github-actions[bot] wrote: @phoebewang (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/126687
[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126687
[llvm-branch-commits] [llvm] 94c1a8e - [DSE] Don't use initializes on byval argument (#126259)
Author: Nikita Popov
Date: 2025-02-11T14:12:42-08:00
New Revision: 94c1a8ea1bfea3cbd7191783c85b2cead642f653

URL: https://github.com/llvm/llvm-project/commit/94c1a8ea1bfea3cbd7191783c85b2cead642f653
DIFF: https://github.com/llvm/llvm-project/commit/94c1a8ea1bfea3cbd7191783c85b2cead642f653.diff

LOG: [DSE] Don't use initializes on byval argument (#126259)

There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
  refers to the original caller memory (or having both attributes is
  made a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
  byval refers to the callee copy. This matches the semantics of readonly
  on byval. This is slightly more powerful, for example, we could do a
  backend optimization where byval + initializes will allocate the full
  size of byval on the stack but not copy over the parts covered by
  initializes.

I went with the second variant here, skipping byval + initializes in
DSE (FunctionAttrs already doesn't propagate initializes past byval).
I'm open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.

(cherry picked from commit 2d31a12dbe2339d20844ede70cbb54dbaf4ceea9)

Added:

Modified:
    llvm/docs/LangRef.rst
    llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
    llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll

Removed:

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index d004ced9dff14..e002195cb7ed5 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1725,6 +1725,10 @@ Currently, only the following parameter attributes are defined:
     and negative values are allowed in case the argument points partway into
     an allocation. An empty list is not allowed.
 
+    On a ``byval`` argument, ``initializes`` refers to the given parts of the
+    callee copy being overwritten. A ``byval`` callee can never initialize the
+    original caller memory passed to the ``byval`` argument.
+
 ``dead_on_unwind``
     At a high level, this attribute indicates that the pointer argument is dead
     if the call unwinds, in the sense that the caller will not depend on the

diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 13f3de07c3c44..0fdc3354753b1 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -2281,7 +2281,9 @@ DSEState::getInitializesArgMemLoc(const Instruction *I) {
   for (unsigned Idx = 0, Count = CB->arg_size(); Idx < Count; ++Idx) {
     ConstantRangeList Inits;
     Attribute InitializesAttr = CB->getParamAttr(Idx, Attribute::Initializes);
-    if (InitializesAttr.isValid())
+    // initializes on byval arguments refers to the callee copy, not the
+    // original memory the caller passed in.
+    if (InitializesAttr.isValid() && !CB->isByValArgument(Idx))
       Inits = InitializesAttr.getValueAsConstantRangeList();
 
     Value *CurArg = CB->getArgOperand(Idx);

diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index e590c5bf4004a..5f8ab56c22754 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -338,3 +338,17 @@ define i16 @global_var_alias() {
   ret i16 %l
 }
 
+declare void @byval_fn(ptr byval(i32) initializes((0, 4)) %am)
+
+define void @test_byval() {
+; CHECK-LABEL: @test_byval(
+; CHECK-NEXT:    [[A:%.*]] = alloca i32, align 4
+; CHECK-NEXT:    store i32 0, ptr [[A]], align 4
+; CHECK-NEXT:    call void @byval_fn(ptr [[A]])
+; CHECK-NEXT:    ret void
+;
+  %a = alloca i32
+  store i32 0, ptr %a
+  call void @byval_fn(ptr %a)
+  ret void
+}
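As a rough illustration of the semantics chosen in the commit above, here is a plain C++ analogy rather than LLVM code. Passing a struct by value stands in for `byval`, and a callee that overwrites its whole parameter stands in for `initializes((0, 4))`. The names `Payload`, `overwriteCopy` and `callerValueAfterCall` are made up for this sketch, and the analogy deliberately ignores how DSE itself is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the i32 that is passed via byval(i32).
struct Payload {
  int32_t value;
};

// Plays the role of @byval_fn: it receives its own copy and overwrites all
// four bytes of that copy, which is what initializes((0, 4)) describes.
int32_t overwriteCopy(Payload copy) {
  copy.value = 42;
  return copy.value;
}

// Plays the role of the caller: store, call, then inspect its own memory.
int32_t callerValueAfterCall() {
  Payload a;
  a.value = 0;       // the store that must not be treated as overwritten
  overwriteCopy(a);  // writes only the callee copy
  return a.value;    // the caller's memory is unchanged by the call
}
```

Under these assumptions the write inside the callee never reaches the caller's object, so an optimization that treated the call as overwriting the caller's memory would be reasoning about the wrong storage.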
[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126493
[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126493 >From 94c1a8ea1bfea3cbd7191783c85b2cead642f653 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Mon, 10 Feb 2025 10:34:03 +0100 Subject: [PATCH] [DSE] Don't use initializes on byval argument (#126259) There are two ways we can fix this problem, depending on how the semantics of byval and initializes should interact: * Don't infer initializes on byval arguments. initializes on byval refers to the original caller memory (or having both attributes is made a verifier error). * Infer initializes on byval, but don't use it in DSE. initializes on byval refers to the callee copy. This matches the semantics of readonly on byval. This is slightly more powerful, for example, we could do a backend optimization where byval + initializes will allocate the full size of byval on the stack but not copy over the parts covered by initializes. I went with the second variant here, skipping byval + initializes in DSE (FunctionAttrs already doesn't propagate initializes past byval). I'm open to going in the other direction though. Fixes https://github.com/llvm/llvm-project/issues/126181. (cherry picked from commit 2d31a12dbe2339d20844ede70cbb54dbaf4ceea9) --- llvm/docs/LangRef.rst | 4 .../lib/Transforms/Scalar/DeadStoreElimination.cpp | 4 +++- .../DeadStoreElimination/inter-procedural.ll | 14 ++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index d004ced9dff14..e002195cb7ed5 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -1725,6 +1725,10 @@ Currently, only the following parameter attributes are defined: and negative values are allowed in case the argument points partway into an allocation. An empty list is not allowed. +On a ``byval`` argument, ``initializes`` refers to the given parts of the +callee copy being overwritten. A ``byval`` callee can never initialize the +original caller memory passed to the ``byval`` argument. 
+ ``dead_on_unwind`` At a high level, this attribute indicates that the pointer argument is dead if the call unwinds, in the sense that the caller will not depend on the diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp index 13f3de07c3c44..0fdc3354753b1 100644 --- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp +++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp @@ -2281,7 +2281,9 @@ DSEState::getInitializesArgMemLoc(const Instruction *I) { for (unsigned Idx = 0, Count = CB->arg_size(); Idx < Count; ++Idx) { ConstantRangeList Inits; Attribute InitializesAttr = CB->getParamAttr(Idx, Attribute::Initializes); -if (InitializesAttr.isValid()) +// initializes on byval arguments refers to the callee copy, not the +// original memory the caller passed in. +if (InitializesAttr.isValid() && !CB->isByValArgument(Idx)) Inits = InitializesAttr.getValueAsConstantRangeList(); Value *CurArg = CB->getArgOperand(Idx); diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll index e590c5bf4004a..5f8ab56c22754 100644 --- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll +++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll @@ -338,3 +338,17 @@ define i16 @global_var_alias() { ret i16 %l } +declare void @byval_fn(ptr byval(i32) initializes((0, 4)) %am) + +define void @test_byval() { +; CHECK-LABEL: @test_byval( +; CHECK-NEXT:[[A:%.*]] = alloca i32, align 4 +; CHECK-NEXT:store i32 0, ptr [[A]], align 4 +; CHECK-NEXT:call void @byval_fn(ptr [[A]]) +; CHECK-NEXT:ret void +; + %a = alloca i32 + store i32 0, ptr %a + call void @byval_fn(ptr %a) + ret void +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [DSE] Don't use initializes on byval argument (#126259) (PR #126493)
github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126493
[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)
github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126496
[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)
github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/126492
[llvm-branch-commits] [llvm] af970cd - [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)
Author: Nikita Popov
Date: 2025-02-11T14:14:21-08:00
New Revision: af970cd8753c37e7fcf66b6211f2a2d1e261325c

URL: https://github.com/llvm/llvm-project/commit/af970cd8753c37e7fcf66b6211f2a2d1e261325c
DIFF: https://github.com/llvm/llvm-project/commit/af970cd8753c37e7fcf66b6211f2a2d1e261325c.diff

LOG: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)

The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.

Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).

I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other contexts,
e.g. when deciding whether an expression is safe to expand in a given
block.

Fixes https://github.com/llvm/llvm-project/issues/126012.

(cherry picked from commit 7aed53eb1982113e825534f0f66d0a0e46e7a5ed)

Added:

Modified:
    llvm/lib/Analysis/ScalarEvolution.cpp
    llvm/test/Transforms/IndVarSimplify/pr126012.ll

Removed:

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 2ce40877b523e..c71202c8dd58e 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -12402,6 +12402,12 @@ bool ScalarEvolution::isImpliedViaMerge(CmpPredicate Pred, const SCEV *LHS,
       // iteration of a loop.
       if (!properlyDominates(L, LBB))
         return false;
+      // Addrecs are considered to properly dominate their loop, so are missed
+      // by the previous check. Discard any values that have computable
+      // evolution in this loop.
+      if (auto *Loop = LI.getLoopFor(LBB))
+        if (hasComputableLoopEvolution(L, Loop))
+          return false;
       if (!ProvedEasily(L, RHS))
         return false;
     }

diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
index 725ea89b8e651..5189fe020dd3b 100644
--- a/llvm/test/Transforms/IndVarSimplify/pr126012.ll
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -1,18 +1,22 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -S -passes=indvars < %s | FileCheck %s
 
-; FIXME: This is a miscompile.
+; Do not infer that %cmp is true. The %indvar3 input of %indvar2 comes from
+; a previous iteration, so we should not compare it to a value from the current
+; iteration.
 define i32 @test() {
 ; CHECK-LABEL: define i32 @test() {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    br label %[[FOR_PREHEADER:.*]]
 ; CHECK:       [[FOR_PREHEADER]]:
 ; CHECK-NEXT:    [[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ]
-; CHECK-NEXT:    [[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:    [[INDVAR2:%.*]] = phi i32 [ 1, %[[ENTRY]] ], [ [[INDVAR3:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:    [[INDVAR3]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
 ; CHECK-NEXT:    [[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0
 ; CHECK-NEXT:    br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
 ; CHECK:       [[FOR_END]]:
-; CHECK-NEXT:    [[EXT:%.*]] = zext i1 true to i32
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i32 [[INDVAR2]], 0
+; CHECK-NEXT:    [[EXT:%.*]] = zext i1 [[CMP]] to i32
 ; CHECK-NEXT:    br label %[[FOR_INC]]
 ; CHECK:       [[FOR_INC]]:
 ; CHECK-NEXT:    [[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ]
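To see why folding `%cmp` to `true` is a miscompile, it can help to step through the `@test` function from pr126012.ll by hand. The following is a hand-written C++ model of that loop, not anything produced by LLVM. The function name `runLoop` and the `foldCmpToTrue` switch are assumptions of this sketch, with the switch imitating the incorrect simplification.

```cpp
#include <cassert>
#include <cstdint>

// Models @test from pr126012.ll. When foldCmpToTrue is set, %cmp is replaced
// by the constant true, imitating the incorrect fold described above.
int32_t runLoop(bool foldCmpToTrue) {
  // Phi values on entry to for.preheader from the entry block.
  int32_t indvar1 = 0, indvar2 = 1, indvar3 = 0;
  for (;;) {
    int32_t phi = 0;  // value on the for.preheader -> for.inc edge
    if (indvar3 != 0) {
      // for.end: %cmp compares %indvar2, which carries the previous
      // iteration's %indvar3.
      bool cmp = foldCmpToTrue ? true : (indvar2 > 0);
      phi = cmp ? 1 : 0;  // %ext
    }
    int32_t inc = indvar3 + 1;
    if (indvar3 == 2)
      return indvar1;  // for.exit
    // Back edge: all phis read values from the iteration that just finished.
    indvar1 = phi;
    indvar2 = indvar3;
    indvar3 = inc;
  }
}
```

In the second iteration `indvar2` holds 0, so the comparison is false there. The faithful model returns 0, while the variant with the comparison folded to `true` returns 1.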
[llvm-branch-commits] [llvm] release/20.x: [ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236) (PR #126492)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126492
[llvm-branch-commits] [llvm] 9bbf3a9 - [IndVars] Add test for #126012 (NFC)
Author: Nikita Popov
Date: 2025-02-11T14:14:21-08:00
New Revision: 9bbf3a98b793f8fc6269a20a026ca6fe029a1790

URL: https://github.com/llvm/llvm-project/commit/9bbf3a98b793f8fc6269a20a026ca6fe029a1790
DIFF: https://github.com/llvm/llvm-project/commit/9bbf3a98b793f8fc6269a20a026ca6fe029a1790.diff

LOG: [IndVars] Add test for #126012 (NFC)

(cherry picked from commit ae08969a2068dd327fbf4d0f606550574fbb9e45)

Added:
    llvm/test/Transforms/IndVarSimplify/pr126012.ll

Modified:

Removed:

diff --git a/llvm/test/Transforms/IndVarSimplify/pr126012.ll b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
new file mode 100644
index 0..725ea89b8e651
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/pr126012.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=indvars < %s | FileCheck %s
+
+; FIXME: This is a miscompile.
+define i32 @test() {
+; CHECK-LABEL: define i32 @test() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br label %[[FOR_PREHEADER:.*]]
+; CHECK:       [[FOR_PREHEADER]]:
+; CHECK-NEXT:    [[INDVAR1:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[PHI:%.*]], %[[FOR_INC:.*]] ]
+; CHECK-NEXT:    [[INDVAR3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:    [[COND1:%.*]] = icmp eq i32 [[INDVAR3]], 0
+; CHECK-NEXT:    br i1 [[COND1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
+; CHECK:       [[FOR_END]]:
+; CHECK-NEXT:    [[EXT:%.*]] = zext i1 true to i32
+; CHECK-NEXT:    br label %[[FOR_INC]]
+; CHECK:       [[FOR_INC]]:
+; CHECK-NEXT:    [[PHI]] = phi i32 [ [[EXT]], %[[FOR_END]] ], [ 0, %[[FOR_PREHEADER]] ]
+; CHECK-NEXT:    [[INC]] = add nuw nsw i32 [[INDVAR3]], 1
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i32 [[INDVAR3]], 2
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[FOR_EXIT:.*]], label %[[FOR_PREHEADER]]
+; CHECK:       [[FOR_EXIT]]:
+; CHECK-NEXT:    [[INDVAR1_LCSSA:%.*]] = phi i32 [ [[INDVAR1]], %[[FOR_INC]] ]
+; CHECK-NEXT:    ret i32 [[INDVAR1_LCSSA]]
+;
+entry:
+  br label %for.preheader
+
+for.preheader:
+  %indvar1 = phi i32 [ 0, %entry ], [ %phi, %for.inc ]
+  %indvar2 = phi i32 [ 1, %entry ], [ %indvar3, %for.inc ]
+  %indvar3 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
+  %cond1 = icmp eq i32 %indvar3, 0
+  br i1 %cond1, label %for.inc, label %for.end
+
+for.end:
+  %cmp = icmp sgt i32 %indvar2, 0
+  %ext = zext i1 %cmp to i32
+  br label %for.inc
+
+for.inc:
+  %phi = phi i32 [ %ext, %for.end ], [ 0, %for.preheader ]
+  %inc = add i32 %indvar3, 1
+  %exitcond = icmp eq i32 %indvar3, 2
+  br i1 %exitcond, label %for.exit, label %for.preheader
+
+for.exit:
+  ret i32 %indvar1
+}
[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/126496 >From a89e04e7f0caa28d607e38099b905063b47a88fb Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Mon, 3 Feb 2025 17:37:07 +0100 Subject: [PATCH 1/2] [ValueTracking] Add additional tests for computeKnownBits on GEPs (NFC) These demonstrate miscompiles in the existing code. (cherry picked from commit 3dc1ef1650c8389a6f195a474781cf2281208bed) --- llvm/unittests/Analysis/ValueTrackingTest.cpp | 35 +++ 1 file changed, 35 insertions(+) diff --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp index ee44aac45594d..39865fa195cf7 100644 --- a/llvm/unittests/Analysis/ValueTrackingTest.cpp +++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp @@ -2679,6 +2679,41 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsAbsoluteSymbol) { EXPECT_EQ(0u, Known_0_256_Align8.countMinTrailingOnes()); } +TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) { + // FIXME: The index should be extended before multiplying with the scale. + parseAssembly(R"( +target datalayout = "p:16:16:16" + +define void @test(i16 %arg) { + %and = and i16 %arg, u0x8000 + %base = inttoptr i16 %and to ptr + %A = getelementptr i32, ptr %base, i8 80 + ret void +} +)"); + KnownBits Known = computeKnownBits(A, M->getDataLayout()); + EXPECT_EQ(~64 & 0x7fff, Known.Zero); + EXPECT_EQ(64, Known.One); +} + +TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) { + // FIXME: GEP should only affect the index width. 
+ parseAssembly(R"( +target datalayout = "p:16:16:16:8" + +define void @test(i16 %arg) { + %and = and i16 %arg, u0x8000 + %or = or i16 %and, u0x00ff + %base = inttoptr i16 %or to ptr + %A = getelementptr i8, ptr %base, i8 1 + ret void +} +)"); + KnownBits Known = computeKnownBits(A, M->getDataLayout()); + EXPECT_EQ(0x7eff, Known.Zero); + EXPECT_EQ(0x100, Known.One); +} + TEST_F(ValueTrackingTest, HaveNoCommonBitsSet) { { // Check for an inverted mask: (X & ~M) op (Y & M). >From 5777d5df62a659e165b4df74aefae29ae01d2509 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Tue, 4 Feb 2025 14:29:58 +0100 Subject: [PATCH 2/2] [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) For GEPs, we have three bit widths involved: The pointer bit width, the index bit width, and the bit width of the GEP operands. The correct behavior here is: * We need to sextOrTrunc the GEP operand to the index width *before* multiplying by the scale. * If the index width and pointer width differ, GEP only ever modifies the low bits. Adds should not overflow into the high bits. I'm testing this via unit tests because it's a bit tricky to test in IR with InstCombine canonicalization getting in the way. (cherry picked from commit 3bd11b502c1846afa5e1257c94b7a70566e34686) --- llvm/lib/Analysis/ValueTracking.cpp | 66 ++- llvm/unittests/Analysis/ValueTrackingTest.cpp | 12 ++-- 2 files changed, 42 insertions(+), 36 deletions(-) diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp index b63a0a07f7de2..8a674914641a8 100644 --- a/llvm/lib/Analysis/ValueTracking.cpp +++ b/llvm/lib/Analysis/ValueTracking.cpp @@ -1445,7 +1445,22 @@ static void computeKnownBitsFromOperator(const Operator *I, computeKnownBits(I->getOperand(0), Known, Depth + 1, Q); // Accumulate the constant indices in a separate variable // to minimize the number of calls to computeForAddSub. 
-APInt AccConstIndices(BitWidth, 0, /*IsSigned*/ true); +unsigned IndexWidth = Q.DL.getIndexTypeSizeInBits(I->getType()); +APInt AccConstIndices(IndexWidth, 0); + +auto AddIndexToKnown = [&](KnownBits IndexBits) { + if (IndexWidth == BitWidth) { +// Note that inbounds does *not* guarantee nsw for the addition, as only +// the offset is signed, while the base address is unsigned. +Known = KnownBits::add(Known, IndexBits); + } else { +// If the index width is smaller than the pointer width, only add the +// value to the low bits. +assert(IndexWidth < BitWidth && + "Index width can't be larger than pointer width"); +Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0); + } +}; gep_type_iterator GTI = gep_type_begin(I); for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) { @@ -1483,43 +1498,34 @@ static void computeKnownBitsFromOperator(const Operator *I, break; } - unsigned IndexBitWidth = Index->getType()->getScalarSizeInBits(); - KnownBits IndexBits(IndexBitWidth); - computeKnownBits(Index, IndexBits, Depth + 1, Q); - TypeSize IndexTypeSize = GTI.getSequentialElementStride(Q.DL); - uint64_t TypeSizeInBytes = IndexTypeSize.getKnownMinValue(); - KnownBits ScalingFa
[llvm-branch-commits] [llvm] a89e04e - [ValueTracking] Add additional tests for computeKnownBits on GEPs (NFC)
Author: Nikita Popov Date: 2025-02-11T14:15:45-08:00 New Revision: a89e04e7f0caa28d607e38099b905063b47a88fb URL: https://github.com/llvm/llvm-project/commit/a89e04e7f0caa28d607e38099b905063b47a88fb DIFF: https://github.com/llvm/llvm-project/commit/a89e04e7f0caa28d607e38099b905063b47a88fb.diff LOG: [ValueTracking] Add additional tests for computeKnownBits on GEPs (NFC) These demonstrate miscompiles in the existing code. (cherry picked from commit 3dc1ef1650c8389a6f195a474781cf2281208bed) Added: Modified: llvm/unittests/Analysis/ValueTrackingTest.cpp Removed: diff --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp index ee44aac45594d..39865fa195cf7 100644 --- a/llvm/unittests/Analysis/ValueTrackingTest.cpp +++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp @@ -2679,6 +2679,41 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsAbsoluteSymbol) { EXPECT_EQ(0u, Known_0_256_Align8.countMinTrailingOnes()); } +TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) { + // FIXME: The index should be extended before multiplying with the scale. + parseAssembly(R"( +target datalayout = "p:16:16:16" + +define void @test(i16 %arg) { + %and = and i16 %arg, u0x8000 + %base = inttoptr i16 %and to ptr + %A = getelementptr i32, ptr %base, i8 80 + ret void +} +)"); + KnownBits Known = computeKnownBits(A, M->getDataLayout()); + EXPECT_EQ(~64 & 0x7fff, Known.Zero); + EXPECT_EQ(64, Known.One); +} + +TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) { + // FIXME: GEP should only affect the index width. 
+ parseAssembly(R"( +target datalayout = "p:16:16:16:8" + +define void @test(i16 %arg) { + %and = and i16 %arg, u0x8000 + %or = or i16 %and, u0x00ff + %base = inttoptr i16 %or to ptr + %A = getelementptr i8, ptr %base, i8 1 + ret void +} +)"); + KnownBits Known = computeKnownBits(A, M->getDataLayout()); + EXPECT_EQ(0x7eff, Known.Zero); + EXPECT_EQ(0x100, Known.One); +} + TEST_F(ValueTrackingTest, HaveNoCommonBitsSet) { { // Check for an inverted mask: (X & ~M) op (Y & M). ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 5777d5d - [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532)
Author: Nikita Popov
Date: 2025-02-11T14:15:45-08:00
New Revision: 5777d5df62a659e165b4df74aefae29ae01d2509

URL: https://github.com/llvm/llvm-project/commit/5777d5df62a659e165b4df74aefae29ae01d2509
DIFF: https://github.com/llvm/llvm-project/commit/5777d5df62a659e165b4df74aefae29ae01d2509.diff

LOG: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532)

For GEPs, we have three bit widths involved: The pointer bit width, the
index bit width, and the bit width of the GEP operands.

The correct behavior here is:
* We need to sextOrTrunc the GEP operand to the index width *before*
  multiplying by the scale.
* If the index width and pointer width differ, GEP only ever modifies
  the low bits. Adds should not overflow into the high bits.

I'm testing this via unit tests because it's a bit tricky to test in IR
with InstCombine canonicalization getting in the way.

(cherry picked from commit 3bd11b502c1846afa5e1257c94b7a70566e34686)

Added:

Modified:
    llvm/lib/Analysis/ValueTracking.cpp
    llvm/unittests/Analysis/ValueTrackingTest.cpp

Removed:

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index b63a0a07f7de2..8a674914641a8 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1445,7 +1445,22 @@ static void computeKnownBitsFromOperator(const Operator *I,
     computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
     // Accumulate the constant indices in a separate variable
     // to minimize the number of calls to computeForAddSub.
-    APInt AccConstIndices(BitWidth, 0, /*IsSigned*/ true);
+    unsigned IndexWidth = Q.DL.getIndexTypeSizeInBits(I->getType());
+    APInt AccConstIndices(IndexWidth, 0);
+
+    auto AddIndexToKnown = [&](KnownBits IndexBits) {
+      if (IndexWidth == BitWidth) {
+        // Note that inbounds does *not* guarantee nsw for the addition, as only
+        // the offset is signed, while the base address is unsigned.
+        Known = KnownBits::add(Known, IndexBits);
+      } else {
+        // If the index width is smaller than the pointer width, only add the
+        // value to the low bits.
+        assert(IndexWidth < BitWidth &&
+               "Index width can't be larger than pointer width");
+        Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
+      }
+    };
 
     gep_type_iterator GTI = gep_type_begin(I);
     for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) {
@@ -1483,43 +1498,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
         break;
       }
 
-      unsigned IndexBitWidth = Index->getType()->getScalarSizeInBits();
-      KnownBits IndexBits(IndexBitWidth);
-      computeKnownBits(Index, IndexBits, Depth + 1, Q);
-      TypeSize IndexTypeSize = GTI.getSequentialElementStride(Q.DL);
-      uint64_t TypeSizeInBytes = IndexTypeSize.getKnownMinValue();
-      KnownBits ScalingFactor(IndexBitWidth);
+      TypeSize Stride = GTI.getSequentialElementStride(Q.DL);
+      uint64_t StrideInBytes = Stride.getKnownMinValue();
+      if (!Stride.isScalable()) {
+        // Fast path for constant offset.
+        if (auto *CI = dyn_cast<ConstantInt>(Index)) {
+          AccConstIndices +=
+              CI->getValue().sextOrTrunc(IndexWidth) * StrideInBytes;
+          continue;
+        }
+      }
+
+      KnownBits IndexBits =
+          computeKnownBits(Index, Depth + 1, Q).sextOrTrunc(IndexWidth);
+      KnownBits ScalingFactor(IndexWidth);
       // Multiply by current sizeof type.
       // &A[i] == A + i * sizeof(*A[i]).
-      if (IndexTypeSize.isScalable()) {
+      if (Stride.isScalable()) {
         // For scalable types the only thing we know about sizeof is
         // that this is a multiple of the minimum size.
-        ScalingFactor.Zero.setLowBits(llvm::countr_zero(TypeSizeInBytes));
-      } else if (IndexBits.isConstant()) {
-        APInt IndexConst = IndexBits.getConstant();
-        APInt ScalingFactor(IndexBitWidth, TypeSizeInBytes);
-        IndexConst *= ScalingFactor;
-        AccConstIndices += IndexConst.sextOrTrunc(BitWidth);
-        continue;
+        ScalingFactor.Zero.setLowBits(llvm::countr_zero(StrideInBytes));
       } else {
         ScalingFactor =
-            KnownBits::makeConstant(APInt(IndexBitWidth, TypeSizeInBytes));
+            KnownBits::makeConstant(APInt(IndexWidth, StrideInBytes));
       }
       IndexBits = KnownBits::mul(IndexBits, ScalingFactor);
-
-      // If the offsets have a different width from the pointer, according
-      // to the language reference we need to sign-extend or truncate them
-      // to the width of the pointer.
-      IndexBits = IndexBits.sextOrTrunc(BitWidth);
-
-      // Note that inbounds does *not* guarantee nsw for the addition, as only
-      // the offset is signed, while the base address is unsigned.
-      Known = KnownBits:
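The two rules in the commit message above (extend the operand to the index width before scaling, and let the addition touch only the low index-width bits) can be checked on concrete values. The sketch below is standalone C++ arithmetic on plain integers, not the KnownBits-based code from ValueTracking.cpp. The helper names `lowMask`, `sextOrTruncBits` and `gepAddress` are invented for this example, and widths are assumed to be between 1 and 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `bits` bits set (bits is assumed to be in 1..64).
static uint64_t lowMask(unsigned bits) {
  return bits >= 64 ? ~0ULL : ((1ULL << bits) - 1);
}

// Sign-extend or truncate the low `from` bits of v to `to` bits.
static uint64_t sextOrTruncBits(uint64_t v, unsigned from, unsigned to) {
  v &= lowMask(from);
  if (to > from && ((v >> (from - 1)) & 1))
    v |= ~lowMask(from);  // replicate the sign bit upwards
  return v & lowMask(to);
}

// Concrete single-index GEP address for the given widths:
//  1. convert the operand to the index width *before* multiplying by the stride;
//  2. add the offset only within the low indexWidth bits of the pointer.
uint64_t gepAddress(uint64_t base, uint64_t index, unsigned operandWidth,
                    uint64_t stride, unsigned indexWidth, unsigned ptrWidth) {
  uint64_t idx = sextOrTruncBits(index, operandWidth, indexWidth);
  uint64_t offset = (idx * stride) & lowMask(indexWidth);
  uint64_t low = ((base & lowMask(indexWidth)) + offset) & lowMask(indexWidth);
  uint64_t high = base & lowMask(ptrWidth) & ~lowMask(indexWidth);
  return high | low;
}
```

With the inputs of the first unit test (16-bit pointers and indices, `i32` elements, `i8 80` index), `gepAddress(0x8000, 80, 8, 4, 16, 16)` gives `0x8140`: the offset is 320, not the 64 that results from multiplying in 8 bits. With the inputs of the second test (16-bit pointers, 8-bit index width), `gepAddress(0x80ff, 1, 8, 1, 8, 16)` gives `0x8000`, because the carry out of the low byte is discarded instead of spilling into bit 8.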
[llvm-branch-commits] [llvm] release/20.x: [ValueTracking] Fix bit width handling in computeKnownBits() for GEPs (#125532) (PR #126496)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/126496
[llvm-branch-commits] [mlir] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR (PR #125836)
https://github.com/kuhar commented:

Since this essentially breaks the logic for gfx940 and gfx941, should we assert in code like `Chipset` that these are not used, so that they are not silently miscompiled?

https://github.com/llvm/llvm-project/pull/125836
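One way to picture the concern raised in this review comment: the patch turns checks of the form `chipset >= kGfx940` into `chipset >= kGfx942`, so a gfx940 or gfx941 target would quietly fall on the "older than gfx942" side of every such check. The sketch below uses a made-up `ChipsetSketch` struct with lexicographic ordering. It is not MLIR's actual `Chipset` class, and `isRemovedGfx94x` is a hypothetical helper showing where an assertion or diagnostic could hook in.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for a chipset version triple; not MLIR's real class.
struct ChipsetSketch {
  unsigned majorVersion;
  unsigned minorVersion;
  unsigned steppingVersion;
};

// Lexicographic "at least" comparison, as used for feature gating.
bool atLeast(const ChipsetSketch &a, const ChipsetSketch &b) {
  return std::tie(a.majorVersion, a.minorVersion, a.steppingVersion) >=
         std::tie(b.majorVersion, b.minorVersion, b.steppingVersion);
}

// Hypothetical guard: true for gfx940 and gfx941, which the PR stops supporting.
bool isRemovedGfx94x(const ChipsetSketch &c) {
  return c.majorVersion == 9 && c.minorVersion == 4 && c.steppingVersion < 2;
}
```

Under these assumptions, `atLeast({9, 4, 0}, {9, 4, 2})` is false, so gfx940 would silently take the pre-gfx942 code paths, whereas a check such as `isRemovedGfx94x` would let the compiler reject the target explicitly.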
[llvm-branch-commits] [mlir] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR (PR #125836)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/125836 >From ebef8a82c9265ecea31795d726af402a96b89430 Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Wed, 5 Feb 2025 05:50:12 -0500 Subject: [PATCH 1/2] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base. For SWDEV-512631 --- mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td | 2 +- mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td | 8 +++ .../AMDGPUToROCDL/AMDGPUToROCDL.cpp | 22 +-- .../ArithToAMDGPU/ArithToAMDGPU.cpp | 2 +- .../AMDGPU/Transforms/EmulateAtomics.cpp | 8 +-- .../AMDGPUToROCDL/8-bit-floats.mlir | 2 +- mlir/test/Conversion/AMDGPUToROCDL/mfma.mlir | 2 +- .../ArithToAMDGPU/8-bit-float-saturation.mlir | 2 +- .../ArithToAMDGPU/8-bit-floats.mlir | 2 +- .../Dialect/AMDGPU/AMDGPUUtilsTest.cpp| 20 +++-- 10 files changed, 30 insertions(+), 40 deletions(-) diff --git a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td index 69745addfd748ec..24f541587cba88a 100644 --- a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td +++ b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td @@ -602,7 +602,7 @@ def AMDGPU_MFMAOp : order (that is, v[0] will go to arg[7:0], v[1] to arg[15:8] and so on). The negateA, negateB, and negateC flags are only supported for double-precision -operations on gfx940+. +operations on gfx942+. 
}]; let assemblyFormat = [{ $sourceA `*` $sourceB `+` $destC diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td index 7efa4ffa2aa6fe0..77401bd6de4bd56 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td @@ -348,11 +348,11 @@ def ROCDL_mfma_f32_16x16x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x4bf16.1k"> def ROCDL_mfma_f32_4x4x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.4x4x4bf16.1k">; def ROCDL_mfma_f32_32x32x8bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x8bf16.1k">; def ROCDL_mfma_f32_16x16x16bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x16bf16.1k">; -// Note: in gfx940, unlike in gfx90a, the f64 xdlops use the "blgp" argument as a -// NEG bitfield. See IntrinsicsAMDGPU.td for more info. +// Note: in gfx942, unlike in gfx90a, the f64 xdlops use the "blgp" argument as +// a NEG bitfield. See IntrinsicsAMDGPU.td for more info. def ROCDL_mfma_f64_16x16x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.16x16x4f64">; def ROCDL_mfma_f64_4x4x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.4x4x4f64">; -// New in gfx940. +// New in gfx942. 
def ROCDL_mfma_i32_16x16x32_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.16x16x32.i8">; def ROCDL_mfma_i32_32x32x16_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.32x32x16.i8">; def ROCDL_mfma_f32_16x16x8_xf32 : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x8.xf32">; @@ -375,7 +375,7 @@ def ROCDL_mfma_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.f16">; def ROCDL_mfma_scale_f32_16x16x128_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.16x16x128.f8f6f4", [0,1]>; def ROCDL_mfma_scale_f32_32x32x64_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.32x32x64.f8f6f4", [0,1]>; -// 2:4 Sparsity ops (GFX940) +// 2:4 Sparsity ops (GFX942) def ROCDL_smfmac_f32_16x16x32_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.f16">; def ROCDL_smfmac_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.32x32x16.f16">; def ROCDL_smfmac_f32_16x16x32_bf16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.bf16">; diff --git a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp index 9fb51f0bc1f1ea7..18fd0fc3f038139 100644 --- a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp +++ b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp @@ -80,7 +80,7 @@ namespace { // Define commonly used chipsets versions for convenience. 
constexpr Chipset kGfx908 = Chipset(9, 0, 8); constexpr Chipset kGfx90a = Chipset(9, 0, 0xa); -constexpr Chipset kGfx940 = Chipset(9, 4, 0); +constexpr Chipset kGfx942 = Chipset(9, 4, 2); /// Define lowering patterns for raw buffer ops template @@ -483,7 +483,7 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma, destElem = destType.getElementType(); if (sourceElem.isF32() && destElem.isF32()) { -if (mfma.getReducePrecision() && chipset >= kGfx940) { +if (mfma.getReducePrecision() && chipset >= kGfx942) { if (m == 32 && n == 32 && k == 4 && b == 1) return ROCDL::mfma_f32_32x32x4_xf32::getOperationName(); if (m == 16 && n == 16 && k == 8 && b == 1) @@ -551,9 +551,9 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma, return ROCDL::mfma_i32_32x32x8i8::getOperationName(); if (m == 16 && n == 16 && k == 16 && b == 1) return ROCDL::mfma_i32_16x16x16i8::getOperationName(); -if (m == 32 && n == 32 && k == 16 && b == 1 && chipset >= k
[llvm-branch-commits] [flang] [AMDGPU] Add missing gfx architectures to AddFlangOffloadRuntime.cmake (PR #125827)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/125827

>From 175aff53a41aebabdadfd93296d8b8bc22683197 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 5 Feb 2025 04:45:26 -0500
Subject: [PATCH] [AMDGPU] Add missing gfx architectures to AddFlangOffloadRuntime.cmake

---
 flang/cmake/modules/AddFlangOffloadRuntime.cmake | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/flang/cmake/modules/AddFlangOffloadRuntime.cmake b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
index f1f6eb57c5d6cf3..eb0e964559ed566 100644
--- a/flang/cmake/modules/AddFlangOffloadRuntime.cmake
+++ b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
@@ -98,10 +98,10 @@ macro(enable_omp_offload_compilation files)
 
   set(all_amdgpu_architectures
     "gfx700;gfx701;gfx801;gfx803;gfx900;gfx902;gfx906"
-    "gfx908;gfx90a;gfx90c;gfx942;gfx1010;gfx1030"
+    "gfx908;gfx90a;gfx90c;gfx942;gfx950;gfx1010;gfx1030"
     "gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036"
     "gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151"
-    "gfx1152;gfx1153"
+    "gfx1152;gfx1153;gfx1200;gfx1201"
     )
   set(all_nvptx_architectures
     "sm_35;sm_37;sm_50;sm_52;sm_53;sm_60;sm_61;sm_62"
[llvm-branch-commits] [mlir] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR (PR #125836)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/125836

>From ebef8a82c9265ecea31795d726af402a96b89430 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 5 Feb 2025 05:50:12 -0500
Subject: [PATCH 1/2] [AMDGPU][MLIR] Replace gfx940 and gfx941 with gfx942 in MLIR

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

For SWDEV-512631
---
 mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td | 2 +-
 mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td | 8 +++
 .../AMDGPUToROCDL/AMDGPUToROCDL.cpp | 22 +--
 .../ArithToAMDGPU/ArithToAMDGPU.cpp | 2 +-
 .../AMDGPU/Transforms/EmulateAtomics.cpp | 8 +--
 .../AMDGPUToROCDL/8-bit-floats.mlir | 2 +-
 mlir/test/Conversion/AMDGPUToROCDL/mfma.mlir | 2 +-
 .../ArithToAMDGPU/8-bit-float-saturation.mlir | 2 +-
 .../ArithToAMDGPU/8-bit-floats.mlir | 2 +-
 .../Dialect/AMDGPU/AMDGPUUtilsTest.cpp | 20 +++--
 10 files changed, 30 insertions(+), 40 deletions(-)

diff --git a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
index 69745addfd748ec..24f541587cba88a 100644
--- a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+++ b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
@@ -602,7 +602,7 @@ def AMDGPU_MFMAOp :
     order (that is, v[0] will go to arg[7:0], v[1] to arg[15:8] and so on).
 
     The negateA, negateB, and negateC flags are only supported for double-precision
-    operations on gfx940+.
+    operations on gfx942+.
   }];
   let assemblyFormat = [{
     $sourceA `*` $sourceB `+` $destC
diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
index 7efa4ffa2aa6fe0..77401bd6de4bd56 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
@@ -348,11 +348,11 @@ def ROCDL_mfma_f32_16x16x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x4bf16.1k">
 def ROCDL_mfma_f32_4x4x4bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.4x4x4bf16.1k">;
 def ROCDL_mfma_f32_32x32x8bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x8bf16.1k">;
 def ROCDL_mfma_f32_16x16x16bf16_1k : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x16bf16.1k">;
-// Note: in gfx940, unlike in gfx90a, the f64 xdlops use the "blgp" argument as a
-// NEG bitfield. See IntrinsicsAMDGPU.td for more info.
+// Note: in gfx942, unlike in gfx90a, the f64 xdlops use the "blgp" argument as
+// a NEG bitfield. See IntrinsicsAMDGPU.td for more info.
 def ROCDL_mfma_f64_16x16x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.16x16x4f64">;
 def ROCDL_mfma_f64_4x4x4f64 : ROCDL_Mfma_IntrOp<"mfma.f64.4x4x4f64">;
-// New in gfx940.
+// New in gfx942.
 def ROCDL_mfma_i32_16x16x32_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.16x16x32.i8">;
 def ROCDL_mfma_i32_32x32x16_i8 : ROCDL_Mfma_IntrOp<"mfma.i32.32x32x16.i8">;
 def ROCDL_mfma_f32_16x16x8_xf32 : ROCDL_Mfma_IntrOp<"mfma.f32.16x16x8.xf32">;
@@ -375,7 +375,7 @@ def ROCDL_mfma_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.f16">;
 def ROCDL_mfma_scale_f32_16x16x128_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.16x16x128.f8f6f4", [0,1]>;
 def ROCDL_mfma_scale_f32_32x32x64_f8f6f4 : ROCDL_Mfma_OO_IntrOp<"mfma.scale.f32.32x32x64.f8f6f4", [0,1]>;
 
-// 2:4 Sparsity ops (GFX940)
+// 2:4 Sparsity ops (GFX942)
 def ROCDL_smfmac_f32_16x16x32_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.f16">;
 def ROCDL_smfmac_f32_32x32x16_f16 : ROCDL_Mfma_IntrOp<"smfmac.f32.32x32x16.f16">;
 def ROCDL_smfmac_f32_16x16x32_bf16 : ROCDL_Mfma_IntrOp<"smfmac.f32.16x16x32.bf16">;
diff --git a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
index 9fb51f0bc1f1ea7..18fd0fc3f038139 100644
--- a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+++ b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
@@ -80,7 +80,7 @@ namespace {
 // Define commonly used chipsets versions for convenience.
 constexpr Chipset kGfx908 = Chipset(9, 0, 8);
 constexpr Chipset kGfx90a = Chipset(9, 0, 0xa);
-constexpr Chipset kGfx940 = Chipset(9, 4, 0);
+constexpr Chipset kGfx942 = Chipset(9, 4, 2);
 
 /// Define lowering patterns for raw buffer ops
 template
@@ -483,7 +483,7 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma,
     destElem = destType.getElementType();
 
   if (sourceElem.isF32() && destElem.isF32()) {
-    if (mfma.getReducePrecision() && chipset >= kGfx940) {
+    if (mfma.getReducePrecision() && chipset >= kGfx942) {
       if (m == 32 && n == 32 && k == 4 && b == 1)
         return ROCDL::mfma_f32_32x32x4_xf32::getOperationName();
       if (m == 16 && n == 16 && k == 8 && b == 1)
@@ -551,9 +551,9 @@ static std::optional mfmaOpToIntrinsic(MFMAOp mfma,
       return ROCDL::mfma_i32_32x32x8i8::getOperationName();
     if (m == 16 && n == 16 && k == 16 && b == 1)
       return ROCDL::mfma_i32_16x16x16i8::getOperationName();
-    if (m == 32 && n == 32 && k == 16 && b == 1 && chipset >= k
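A note on what the `chipset >= kGfx942` checks in this patch mean in practice: the sketch below assumes that chipsets compare lexicographically on (major, minor, stepping), so raising the threshold from `Chipset(9, 4, 0)` to `Chipset(9, 4, 2)` excludes gfx940 and gfx941. It is written in Python rather than C++ for brevity; `Chipset`, `GFX942`, and `supports_xf32` are hypothetical stand-ins, not the MLIR classes.

```python
from typing import NamedTuple


class Chipset(NamedTuple):
    """Hypothetical stand-in for the C++ Chipset(major, minor, stepping)."""
    major: int
    minor: int
    stepping: int


# Constants mirroring the ones in the patch (names are illustrative).
GFX908 = Chipset(9, 0, 8)
GFX90A = Chipset(9, 0, 0xA)
GFX942 = Chipset(9, 4, 2)


def supports_xf32(chipset: Chipset, reduce_precision: bool) -> bool:
    # Same shape as `mfma.getReducePrecision() && chipset >= kGfx942`.
    # NamedTuple inherits tuple comparison, which is lexicographic.
    return reduce_precision and chipset >= GFX942


if __name__ == "__main__":
    for chip in (GFX90A, Chipset(9, 4, 0), GFX942):
        print(chip, supports_xf32(chip, reduce_precision=True))
```

Under that ordering assumption, a target described as (9, 4, 0) no longer takes the xf32 path once the constant is changed.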
[llvm-branch-commits] [flang] [libc] [libclc] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in offload and libclc (PR #125826)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/125826

>From f2a70b996f1e7b0c71af2e77f5fb6d11266ec82e Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 5 Feb 2025 04:19:00 -0500
Subject: [PATCH] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in offload and libclc

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

For SWDEV-512631 and SWDEV-512633
---
 flang/cmake/modules/AddFlangOffloadRuntime.cmake | 2 +-
 libc/docs/gpu/using.rst | 2 +-
 libclc/CMakeLists.txt | 2 +-
 offload/plugins-nextgen/amdgpu/src/rtl.cpp | 6 --
 offload/test/lit.cfg | 4 +---
 5 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/flang/cmake/modules/AddFlangOffloadRuntime.cmake b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
index 8e4f47d18535dcb..f1f6eb57c5d6cf3 100644
--- a/flang/cmake/modules/AddFlangOffloadRuntime.cmake
+++ b/flang/cmake/modules/AddFlangOffloadRuntime.cmake
@@ -98,7 +98,7 @@ macro(enable_omp_offload_compilation files)
 
   set(all_amdgpu_architectures
     "gfx700;gfx701;gfx801;gfx803;gfx900;gfx902;gfx906"
-    "gfx908;gfx90a;gfx90c;gfx940;gfx1010;gfx1030"
+    "gfx908;gfx90a;gfx90c;gfx942;gfx1010;gfx1030"
     "gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036"
     "gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151"
     "gfx1152;gfx1153"
diff --git a/libc/docs/gpu/using.rst b/libc/docs/gpu/using.rst
index 1c1f9c9bfb0c696..f17f6287be31349 100644
--- a/libc/docs/gpu/using.rst
+++ b/libc/docs/gpu/using.rst
@@ -44,7 +44,7 @@ this shouldn't be necessary.
 
   $> clang openmp.c -fopenmp --offload-arch=gfx90a -Xoffload-linker -lc
   $> clang cuda.cu --offload-arch=sm_80 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
-  $> clang hip.hip --offload-arch=gfx940 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
+  $> clang hip.hip --offload-arch=gfx942 --offload-new-driver -fgpu-rdc -Xoffload-linker -lc
 
 This will automatically link in the needed function definitions if they were
 required by the user's application. Normally using the ``-fgpu-rdc`` option
diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index b28da904ef68e15..a5f9c47f099080f 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -211,7 +211,7 @@ set( cayman_aliases aruba )
 set( tahiti_aliases pitcairn verde oland hainan bonaire kabini kaveri hawaii
   mullins tonga tongapro iceland carrizo fiji stoney polaris10 polaris11
   gfx602 gfx705 gfx805
-  gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90a gfx90c gfx940 gfx941 gfx942
+  gfx900 gfx902 gfx904 gfx906 gfx908 gfx909 gfx90a gfx90c gfx942
   gfx1010 gfx1011 gfx1012 gfx1013
   gfx1030 gfx1031 gfx1032 gfx1033 gfx1034 gfx1035 gfx1036
   gfx1100 gfx1101 gfx1102 gfx1103
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index 92184ba796dbd83..e83d38a14f77f67 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -2854,12 +2854,6 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
   Error checkIfAPU() {
     // TODO: replace with ROCr API once it becomes available.
     llvm::StringRef StrGfxName(ComputeUnitKind);
-    IsAPU = llvm::StringSwitch<bool>(StrGfxName)
-                .Case("gfx940", true)
-                .Default(false);
-    if (IsAPU)
-      return Plugin::success();
-
     bool MayBeAPU = llvm::StringSwitch<bool>(StrGfxName)
                         .Case("gfx942", true)
                         .Default(false);
diff --git a/offload/test/lit.cfg b/offload/test/lit.cfg
index 658ae5f9653ba90..fe28418d9c1b1a3 100644
--- a/offload/test/lit.cfg
+++ b/offload/test/lit.cfg
@@ -132,12 +132,10 @@ elif config.libomptarget_current_target.startswith('amdgcn'):
     # amdgpu_test_arch contains a list of AMD GPUs in the system
     # only check the first one assuming that we will run the test on it.
     if not (config.amdgpu_test_arch.startswith("gfx90a") or
-            config.amdgpu_test_arch.startswith("gfx940") or
             config.amdgpu_test_arch.startswith("gfx942")):
         supports_unified_shared_memory = False
     # check if AMD architecture is an APU:
-    if (config.amdgpu_test_arch.startswith("gfx940") or
-        (config.amdgpu_test_arch.startswith("gfx942") and
+    if ((config.amdgpu_test_arch.startswith("gfx942") and
          evaluate_bool_env(config.environment['IS_APU']))):
         supports_apu = True
     if supports_unified_shared_memory:
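For readers following the `lit.cfg` hunk, here is a minimal standalone sketch of the feature detection as it reads after the patch. `detect_features` is a hypothetical wrapper, and `evaluate_bool_env` is a stand-in whose accepted spellings are an assumption; the real helper lives elsewhere in `lit.cfg`. Unlike the original, the sketch uses `.get()` so that a missing `IS_APU` is treated as false instead of raising.

```python
from typing import Mapping, Tuple


def evaluate_bool_env(value: str) -> bool:
    # Hypothetical stand-in for the lit.cfg helper of the same name;
    # the accepted spellings here are an assumption.
    return value.strip().lower() in ("1", "true", "on", "yes")


def detect_features(test_arch: str, environment: Mapping[str, str]) -> Tuple[bool, bool]:
    """Return (supports_unified_shared_memory, supports_apu)."""
    # After the patch only gfx90a and gfx942 keep unified shared memory.
    supports_usm = test_arch.startswith(("gfx90a", "gfx942"))
    # gfx940 used to count as an APU unconditionally; now only gfx942
    # qualifies, and only when IS_APU is set in the environment.
    supports_apu = test_arch.startswith("gfx942") and evaluate_bool_env(
        environment.get("IS_APU", "0")
    )
    return supports_usm, supports_apu


if __name__ == "__main__":
    print(detect_features("gfx942", {"IS_APU": "1"}))
    print(detect_features("gfx940", {"IS_APU": "1"}))
```

With these assumptions, a `gfx940` test architecture ends up with neither feature, which matches the intent of dropping it from both checks.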
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
kadircet wrote:

What about landing this one and cherry-picking it into the 20 release? Then we can land #126689 and revert this one, giving us the next release cycle to vet the underlying change.

https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
https://github.com/kadircet approved this pull request.

https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
https://github.com/kadircet closed https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
kadircet wrote:

Going to land this one, since the review of #126689 may take longer and this should unblock our releases. I am happy to revert afterwards.

https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [clang] c7cfe02 - Revert "[clang][HeuristicResolver] Additional hardening against an infinite l…"
Author: kadir çetinkaya
Date: 2025-02-11T09:13:27+01:00
New Revision: c7cfe02fd5be25ff65b42bd21cb16fa27933af64

URL: https://github.com/llvm/llvm-project/commit/c7cfe02fd5be25ff65b42bd21cb16fa27933af64
DIFF: https://github.com/llvm/llvm-project/commit/c7cfe02fd5be25ff65b42bd21cb16fa27933af64.diff

LOG: Revert "[clang][HeuristicResolver] Additional hardening against an infinite l…"

This reverts commit 780894689ff741c761457eec1c925679309336a3.

Added:

Modified:
    clang/lib/Sema/HeuristicResolver.cpp

Removed:

diff --git a/clang/lib/Sema/HeuristicResolver.cpp b/clang/lib/Sema/HeuristicResolver.cpp
index adce403412f689..3cbf33dcdced38 100644
--- a/clang/lib/Sema/HeuristicResolver.cpp
+++ b/clang/lib/Sema/HeuristicResolver.cpp
@@ -258,11 +258,7 @@ QualType HeuristicResolverImpl::simplifyType(QualType Type, const Expr *E,
     }
     return T;
   };
-  // As an additional protection against infinite loops, bound the number of
-  // simplification steps.
-  size_t StepCount = 0;
-  const size_t MaxSteps = 64;
-  while (!Current.Type.isNull() && StepCount++ < MaxSteps) {
+  while (!Current.Type.isNull()) {
     TypeExprPair New = SimplifyOneStep(Current);
     if (New.Type == Current.Type)
       break;
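The hardening that this commit reverts is a step bound on a fixed-point loop: keep simplifying until the value stops changing, but give up after a fixed number of steps so that a cycle cannot hang the caller. A minimal sketch of that pattern follows, in Python rather than C++; `simplify_to_fixed_point` and `step` are hypothetical names, and only the limit of 64 comes from the diff.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def simplify_to_fixed_point(
    start: Optional[T],
    step: Callable[[T], Optional[T]],
    max_steps: int = 64,  # same bound as MaxSteps in the reverted hunk
) -> Optional[T]:
    """Apply `step` until the value stops changing, becomes None, or the
    step budget runs out. Hypothetical analogue of the loop in simplifyType()."""
    current = start
    steps = 0
    while current is not None and steps < max_steps:
        steps += 1
        new = step(current)
        if new == current:
            break  # reached a fixed point
        current = new
    return current


if __name__ == "__main__":
    # Converges: 40 -> 20 -> 10 -> 5 -> 2 -> 1 -> 1
    print(simplify_to_fixed_point(40, lambda x: x // 2 if x > 1 else x))
    # Cycles between 0 and 1 forever; the bound makes the call return anyway.
    print(simplify_to_fixed_point(0, lambda x: 1 - x))
```

The bound trades completeness for termination: a legitimately long simplification chain would be cut off at the limit, which is presumably why it was described as additional hardening and not as the underlying fix.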
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
kadircet wrote:

Oops, I wasn't looking at the target branch :( Let me cherry-pick this into main.

https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [clang] [clang][HeuristicResolver] Additional hardening against an infinite loop in simplifyType() (PR #126690)
HighCommander4 wrote:

> and cherry picking it into the 20 release

The regressing change, which introduced the `simplifyType()` function, isn't present on the llvm 20 branch (it landed a bit after the branch cut). So there should not be a need to cherry-pick the fix onto the llvm 20 branch either.

https://github.com/llvm/llvm-project/pull/126690
[llvm-branch-commits] [llvm] release/20.x: [llvm-objcopy][ReleaseNotes] Fix prints wrong path when dump-section output path doesn't exist #125345 (PR #126607)
https://github.com/jh7370 approved this pull request.

LGTM, with @MaskRay's suggestion.

https://github.com/llvm/llvm-project/pull/126607