date:20251211

[llvm-branch-commits] [llvm] backport: [RISCV] Sources of vmerge shouldn't overlap V0 (#170070) (PR #170604)

2025-12-11 Thread Philip Reames via llvm-branch-commits

preames wrote:

> I also want to emphasize that this will be the last 21.x release, so any 
> problems that arise from this change will NOT be fixed until 22.x.

I would lean towards leaving this unfixed in 21.x as 22.x should be released 
relatively soon, and the risk to the last point release (the one more folks 
than usual might rely on) is somewhat high.  

https://github.com/llvm/llvm-project/pull/170604
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Matt Arsenault via llvm-branch-commits



@@ -5553,6 +5553,37 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -0 for -0 inputs.
+  Known.knownNot(fcNegInf | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, InterestedClasses,
+                      KnownSrc, Q, Depth + 1);
+
+  if (KnownSrc.isKnownNever(fcSNan))
+    Known.knownNot(fcSNan);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+    Known.knownNot(fcNan);
+
+  Type *EltTy = II->getType()->getScalarType();
+
+  // f32 denormal always flushed.
+  if (EltTy->isFloatTy())
+    Known.knownNot(fcPosSubnormal);

arsenm wrote:

There was just no documentation before; the instruction never changed.
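
As a host-side sanity check of the sign cases discussed around this hunk, the sketch below evaluates 1/sqrt(x) with ordinary IEEE-754 float arithmetic. It is only a model: `modelRsq` and `isNegFinite` are hypothetical helper names that do not exist in LLVM, and the AMDGPU instruction may differ in details such as denormal flushing.

```cpp
#include <cmath>
#include <limits>

// Hypothetical host-side model of a reciprocal square root using plain
// IEEE-754 arithmetic. This is NOT the AMDGPU instruction; it only
// illustrates which sign/class combinations the result can take.
inline float modelRsq(float x) { return 1.0f / std::sqrt(x); }

// True if v is a negative zero, negative subnormal, or negative normal,
// i.e. negative and finite. NaN and infinities return false.
inline bool isNegFinite(float v) {
  return std::isfinite(v) && std::signbit(v);
}
```

Under this model a -0 input gives -inf, a +0 input gives +inf, a +inf input gives +0, and any input ordered less than zero gives NaN, so the result is never negative and finite.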

https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [llvm] [WIP][CodeGen][DebugInfo][RISCV] Support scalable offsets in CFI (PR #170607)

2025-12-11 Thread Philip Reames via llvm-branch-commits


https://github.com/preames commented:

I went looking for how we handle scalable offsets in the current vector frame 
setup and CSR handling, and found appendScalableVectorExpression and 
createDefCFAExpression in RISCVFrameLowering.cpp.  Looking at the diff, I just 
noticed that there's an NFC buried in here -- we're replacing that logic with 
the new CFA Op being defined.  

My overall question is: why? Why is the existing scheme not adequate for the 
new purpose?

https://github.com/llvm/llvm-project/pull/170607

[llvm-branch-commits] [llvm] [WIP][CodeGen][DebugInfo][RISCV] Support scalable offsets in CFI (PR #170607)

2025-12-11 Thread Philip Reames via llvm-branch-commits


https://github.com/preames edited 
https://github.com/llvm/llvm-project/pull/170607

[llvm-branch-commits] [llvm] [GlobalISel][AArch64] Added support for sli/sri intrinsics (PR #171448)

2025-12-11 Thread David Green via llvm-branch-commits


https://github.com/davemgreen approved this pull request.

Thanks - LGTM

https://github.com/llvm/llvm-project/pull/171448

[llvm-branch-commits] [Clang] Invoke pass plugin preCodeGenCallback (PR #171872)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Alexis Engelke (aengelke)


Changes

Single-use AddEmitPasses is inlined into RunCodegenPipeline for clarity
in comparing the parameters to the plugin and the parameters passed to
addPassesToEmitFile.


---
Full diff: https://github.com/llvm/llvm-project/pull/171872.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/BackendUtil.cpp (+35-50) 
- (added) clang/test/CodeGen/codegen-plugins.c (+14) 


``diff
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index b39e303d13994..a5449d6c42e36 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -170,12 +170,6 @@ class EmitAssemblyHelper {
   /// the requested target.
   void CreateTargetMachine(bool MustCreateTM);
 
-  /// Add passes necessary to emit assembly or LLVM IR.
-  ///
-  /// \return True on success.
-  bool AddEmitPasses(legacy::PassManager &CodeGenPasses, BackendAction Action,
- raw_pwrite_stream &OS, raw_pwrite_stream *DwoOS);
-
   std::unique_ptr openOutputFile(StringRef Path) {
 std::error_code EC;
 auto F = std::make_unique(Path, EC,
@@ -647,33 +641,6 @@ void EmitAssemblyHelper::CreateTargetMachine(bool MustCreateTM) {
 TM->setLargeDataThreshold(CodeGenOpts.LargeDataThreshold);
 }
 
-bool EmitAssemblyHelper::AddEmitPasses(legacy::PassManager &CodeGenPasses,
-   BackendAction Action,
-   raw_pwrite_stream &OS,
-   raw_pwrite_stream *DwoOS) {
-  // Add LibraryInfo.
-  std::unique_ptr TLII(
-  llvm::driver::createTLII(TargetTriple, CodeGenOpts.getVecLib()));
-  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(*TLII));
-
-  const llvm::TargetOptions &Options = TM->Options;
-  CodeGenPasses.add(new RuntimeLibraryInfoWrapper(
-  TargetTriple, Options.ExceptionModel, Options.FloatABIType,
-  Options.EABIVersion, Options.MCOptions.ABIName, Options.VecLib));
-
-  // Normal mode, emit a .s or .o file by running the code generator. Note,
-  // this also adds codegenerator level optimization passes.
-  CodeGenFileType CGFT = getCodeGenFileType(Action);
-
-  if (TM->addPassesToEmitFile(CodeGenPasses, OS, DwoOS, CGFT,
-  /*DisableVerify=*/!CodeGenOpts.VerifyModule)) {
-Diags.Report(diag::err_fe_unable_to_interface_with_target);
-return false;
-  }
-
-  return true;
-}
-
 static OptimizationLevel mapToLevel(const CodeGenOptions &Opts) {
   switch (Opts.OptimizationLevel) {
   default:
@@ -1258,29 +1225,47 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
 void EmitAssemblyHelper::RunCodegenPipeline(
 BackendAction Action, std::unique_ptr &OS,
 std::unique_ptr &DwoOS) {
+  if (!actionRequiresCodeGen(Action))
+return;
+
+  // Normal mode, emit a .s or .o file by running the code generator. Note,
+  // this also adds codegenerator level optimization passes.
+  CodeGenFileType CGFT = getCodeGenFileType(Action);
+
+  // Invoke pre-codegen callback from plugin, which might want to take over the
+  // entire code generation itself.
+  for (auto &Plugin : CodeGenOpts.PassPlugins) {
+if (Plugin.invokePreCodeGenCallback(*TheModule, *TM, CGFT, *OS))
+  return;
+  }
+
   // We still use the legacy PM to run the codegen pipeline since the new PM
   // does not work with the codegen pipeline.
   // FIXME: make the new PM work with the codegen pipeline.
   legacy::PassManager CodeGenPasses;
 
-  // Append any output we need to the pass manager.
-  switch (Action) {
-  case Backend_EmitAssembly:
-  case Backend_EmitMCNull:
-  case Backend_EmitObj:
-CodeGenPasses.add(
-createTargetTransformInfoWrapperPass(getTargetIRAnalysis()));
-if (!CodeGenOpts.SplitDwarfOutput.empty()) {
-  DwoOS = openOutputFile(CodeGenOpts.SplitDwarfOutput);
-  if (!DwoOS)
-return;
-}
-if (!AddEmitPasses(CodeGenPasses, Action, *OS,
-   DwoOS ? &DwoOS->os() : nullptr))
-  // FIXME: Should we handle this error differently?
+  CodeGenPasses.add(
+  createTargetTransformInfoWrapperPass(getTargetIRAnalysis()));
+  // Add LibraryInfo.
+  std::unique_ptr TLII(
+  llvm::driver::createTLII(TargetTriple, CodeGenOpts.getVecLib()));
+  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(*TLII));
+
+  const llvm::TargetOptions &Options = TM->Options;
+  CodeGenPasses.add(new RuntimeLibraryInfoWrapper(
+  TargetTriple, Options.ExceptionModel, Options.FloatABIType,
+  Options.EABIVersion, Options.MCOptions.ABIName, Options.VecLib));
+
+  if (!CodeGenOpts.SplitDwarfOutput.empty()) {
+DwoOS = openOutputFile(CodeGenOpts.SplitDwarfOutput);
+if (!DwoOS)
   return;
-break;
-  default:
+  }
+
+  if (TM->addPassesToEmitFile(CodeGenPasses, *OS,
+  DwoOS ? &DwoOS->os() : nullptr, CGFT,
+  /*DisableVerify=*/!CodeGenOpts.VerifyModule))

[llvm-branch-commits] [Clang] Invoke pass plugin preCodeGenCallback (PR #171872)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits


https://github.com/aengelke created 
https://github.com/llvm/llvm-project/pull/171872

Single-use AddEmitPasses is inlined into RunCodegenPipeline for clarity
in comparing the parameters to the plugin and the parameters passed to
addPassesToEmitFile.
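
The control flow that the RunCodegenPipeline hunk introduces (each plugin is asked in turn, and the first pre-codegen callback that returns true takes over code generation) can be modeled in isolation. This is a minimal sketch under stated assumptions: `PreCodeGenHook` and `runCodegen` are hypothetical names, and a `std::string` stands in for the output stream. It is not the Clang or LLVM plugin API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a plugin's pre-codegen callback. It returns
// true when the plugin has handled code generation itself.
using PreCodeGenHook = std::function<bool(std::string &out)>;

// Returns true if the default pipeline ran, false if a plugin took over.
inline bool runCodegen(const std::vector<PreCodeGenHook> &hooks,
                       std::string &out) {
  for (const auto &hook : hooks) {
    if (hook(out))
      return false; // A plugin claimed codegen; skip the default pipeline.
  }
  out += "default-codegen"; // Placeholder for the regular emit passes.
  return true;
}
```

In this model, later hooks are never consulted once one hook returns true, which mirrors the early `return` inside the loop in the hunk.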




[llvm-branch-commits] [clang] [llvm] [LLVM] Add plugin hook for back-ends (PR #170846)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits


aengelke wrote:

I put up #171872 for the Clang part, which now includes tests and in turn is 
based on #171868 to make plugins reasonably testable.

If there're no further comments or objections on the LLVM part, I'm going to 
merge this early next week.

https://github.com/llvm/llvm-project/pull/170846

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/171837

>From d269be7799a55afd78d9eecbc76055344a94b0b3 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Thu, 11 Dec 2025 14:47:59 +0100
Subject: [PATCH] ValueTracking: Handle amdgcn.rsq intrinsic in
 computeKnownFPClass

We have other target intrinsics already in ValueTracking functions,
and no access to TTI.
---
 llvm/lib/Analysis/ValueTracking.cpp   |  42 +++
 .../Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll | 112 +-
 2 files changed, 98 insertions(+), 56 deletions(-)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e98d13486d023..947e98d2d0c2a 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5553,6 +5553,48 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -inf for -0 inputs.
+  Known.knownNot(fcNegZero | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, InterestedClasses,
+                      KnownSrc, Q, Depth + 1);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+    Known.knownNot(fcNan);
+  else if (KnownSrc.isKnownNever(fcSNan))
+    Known.knownNot(fcSNan);
+
+  // -inf -> -0
+  if (KnownSrc.isKnownNeverNegInfinity())
+    Known.knownNot(fcNegZero);
+
+  // +inf -> +0
+  if (KnownSrc.isKnownNeverPosInfinity())
+    Known.knownNot(fcPosZero);
+
+  Type *EltTy = II->getType()->getScalarType();
+
+  // f32 denormal always flushed.
+  if (EltTy->isFloatTy())
+    Known.knownNot(fcPosSubnormal);
+
+  if (const Function *F = II->getFunction()) {
+    DenormalMode Mode = F->getDenormalMode(EltTy->getFltSemantics());
+
+    // -0 -> -inf
+    if (KnownSrc.isKnownNeverLogicalNegZero(Mode))
+      Known.knownNot(fcNegInf);
+
+    // +0 -> +inf
+    if (KnownSrc.isKnownNeverLogicalPosZero(Mode))
+      Known.knownNot(fcPosInf);
+  }
+
+  break;
+}
 default:
   break;
 }
diff --git a/llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll b/llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll
index 91b72c8873073..2e5bcf2bfff2e 100644
--- a/llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll
+++ b/llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll
@@ -6,9 +6,9 @@ declare float @llvm.amdgcn.rsq.f32(float)
 declare double @llvm.amdgcn.rsq.f64(double)
 
 define half @ret_rsq_f16(half %arg) {
-; CHECK-LABEL: define half @ret_rsq_f16(
+; CHECK-LABEL: define nofpclass(nzero nsub nnorm) half @ret_rsq_f16(
 ; CHECK-SAME: half [[ARG:%.*]]) #[[ATTR1:[0-9]+]] {
-; CHECK-NEXT:[[CALL:%.*]] = call half @llvm.amdgcn.rsq.f16(half [[ARG]]) #[[ATTR4:[0-9]+]]
+; CHECK-NEXT:[[CALL:%.*]] = call nofpclass(nzero nsub nnorm) half @llvm.amdgcn.rsq.f16(half [[ARG]]) #[[ATTR4:[0-9]+]]
 ; CHECK-NEXT:ret half [[CALL]]
 ;
   %call = call half @llvm.amdgcn.rsq.f16(half %arg)
@@ -16,9 +16,9 @@ define half @ret_rsq_f16(half %arg) {
 }
 
 define float @ret_rsq_f32(float %arg) {
-; CHECK-LABEL: define float @ret_rsq_f32(
+; CHECK-LABEL: define nofpclass(nzero sub nnorm) float @ret_rsq_f32(
 ; CHECK-SAME: float [[ARG:%.*]]) #[[ATTR1]] {
-; CHECK-NEXT:[[CALL:%.*]] = call float @llvm.amdgcn.rsq.f32(float [[ARG]]) #[[ATTR4]]
+; CHECK-NEXT:[[CALL:%.*]] = call nofpclass(nzero sub nnorm) float @llvm.amdgcn.rsq.f32(float [[ARG]]) #[[ATTR4]]
 ; CHECK-NEXT:ret float [[CALL]]
 ;
   %call = call float @llvm.amdgcn.rsq.f32(float %arg)
@@ -26,9 +26,9 @@ define float @ret_rsq_f32(float %arg) {
 }
 
 define double @ret_rsq_f64(double %arg) {
-; CHECK-LABEL: define double @ret_rsq_f64(
+; CHECK-LABEL: define nofpclass(nzero nsub nnorm) double @ret_rsq_f64(
 ; CHECK-SAME: double [[ARG:%.*]]) #[[ATTR1]] {
-; CHECK-NEXT:[[CALL:%.*]] = call double @llvm.amdgcn.rsq.f64(double [[ARG]]) #[[ATTR4]]
+; CHECK-NEXT:[[CALL:%.*]] = call nofpclass(nzero nsub nnorm) double @llvm.amdgcn.rsq.f64(double [[ARG]]) #[[ATTR4]]
 ; CHECK-NEXT:ret double [[CALL]]
 ;
   %call = call double @llvm.amdgcn.rsq.f64(double %arg)
@@ -37,9 +37,9 @@ define double @ret_rsq_f64(double %arg) {
 
 ; Result could still be -0 if negative argument is flushed.
 define float @ret_rsq_f32_dynamic_denormal_input(float %arg) #0 {
-; CHECK-LABEL: define float @ret_rsq_f32_dynamic_denormal_input(
+; CHECK-LABEL: define nofpclass(nzero sub nnorm) float @ret_rsq_f32_dynamic_denormal_input(
 ; CHECK-SAME: float [[ARG:%.*]]) #[[ATTR2:[0-9]+]] {
-; CHECK-NEXT:[[CALL:%.*]] = call float @llvm.amdgcn.rsq.f32(float [[ARG]]) #[[ATTR4]]
+; CHECK-NEXT:[[CALL:%.*]] = call nofpclass(nzero sub nnorm) float

[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)

2025-12-11 Thread Tom Eccles via llvm-branch-commits


https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/170735

[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)

2025-12-11 Thread Tom Eccles via llvm-branch-commits


https://github.com/tblah commented:

Overall this looks great. The LoopRange helper was a good idea. Just one comment.

https://github.com/llvm/llvm-project/pull/170735

[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)

2025-12-11 Thread Tom Eccles via llvm-branch-commits



@@ -262,41 +270,106 @@ static bool IsLoopTransforming(llvm::omp::Directive dir) {
   }
 }
 
-void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct &x,
-    const parser::Block &body, size_t &nestedCount) {
+void OmpStructureChecker::CheckNestedBlock(
+    const parser::OpenMPLoopConstruct &x, const parser::Block &body) {
   for (auto &stmt : body) {
     if (auto *dir{parser::Unwrap(stmt)}) {
       context_.Say(dir->source,
           "Compiler directives are not allowed inside OpenMP loop constructs"_warn_en_US);
-    } else if (parser::Unwrap(stmt)) {
-      ++nestedCount;
     } else if (auto *omp{parser::Unwrap(stmt)}) {
       if (!IsLoopTransforming(omp->BeginDir().DirName().v)) {
         context_.Say(omp->source,
             "Only loop-transforming OpenMP constructs are allowed inside OpenMP loop constructs"_err_en_US);
       }
-      ++nestedCount;
     } else if (auto *block{parser::Unwrap(stmt)}) {
-      CheckNestedBlock(x, std::get(block->t), nestedCount);
-    } else {
+      CheckNestedBlock(x, std::get(block->t));
+    } else if (!parser::Unwrap(stmt)) {
       parser::CharBlock source{parser::GetSource(stmt).value_or(x.source)};
       context_.Say(source,
           "OpenMP loop construct can only contain DO loops or loop-nest-generating OpenMP constructs"_err_en_US);
     }
   }
 }
 
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x) {
+  const parser::OmpDirectiveSpecification &beginSpec{x.BeginDir()};
+
+  if (beginSpec.DirName().v == llvm::omp::Directive::OMPD_unroll) {
+    return llvm::none_of(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+      return c.Id() == llvm::omp::Clause::OMPC_partial;
+    });
+  }
+  return false;
+}
+
+static std::optional CountGeneratedLoops(
+    const parser::ExecutionPartConstruct &epc) {
+  if (parser::Unwrap(epc)) {
+    return 1;
+  }
+
+  auto &omp{DEREF(parser::Unwrap(epc))};
+  const parser::OmpDirectiveSpecification &beginSpec{omp.BeginDir()};
+  llvm::omp::Directive dir{beginSpec.DirName().v};
+
+  // TODO: Handle split, apply.
+  if (IsFullUnroll(omp)) {
+    return std::nullopt;
+  }
+  if (dir == llvm::omp::Directive::OMPD_fuse) {
+    auto rangeAt{
+        llvm::find_if(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+          return c.Id() == llvm::omp::Clause::OMPC_looprange;
+        })};
+    if (rangeAt == beginSpec.Clauses().v.end()) {
+      return std::nullopt;
+    }
+
+    auto *loopRange{parser::Unwrap(*rangeAt)};
+    std::optional count{GetIntValue(std::get<1>(loopRange->t))};
+    if (!count || *count <= 0) {
+      return std::nullopt;
+    }
+    if (auto nestedCount{CountGeneratedLoops(std::get(omp.t))}) {
+      return 1 + *nestedCount - static_cast(*count);

tblah wrote:

Could this subtraction wrap for erroneous code with a bad looprange clause?
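
To make the concern concrete, here is a minimal sketch of how such an expression behaves if the result type is unsigned. The element types in the quoted hunk were lost in this archive, so `std::size_t` and `std::int64_t` below are assumptions, and `uncheckedRemaining` and `checkedRemaining` are hypothetical helpers rather than flang code.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Assumed types: an unsigned loop count and a signed looprange count.
// Unchecked form: wraps around when count exceeds nested + 1.
inline std::size_t uncheckedRemaining(std::size_t nested, std::int64_t count) {
  return 1 + nested - static_cast<std::size_t>(count);
}

// Guarded form: rejects a non-positive count or a count larger than the
// number of available loops instead of letting the subtraction wrap.
inline std::optional<std::size_t> checkedRemaining(std::size_t nested,
                                                   std::int64_t count) {
  if (count <= 0 || static_cast<std::uint64_t>(count) > nested)
    return std::nullopt;
  return 1 + nested - static_cast<std::size_t>(count);
}
```

With two nested loops and a count of five, the unchecked form yields a value just below the maximum of `std::size_t`, while the guarded form reports no value. Whether returning no value is the right diagnostic path for the checker is a separate design question for the PR.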

https://github.com/llvm/llvm-project/pull/170735

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Serialize the global namespace name in JSON (PR #171701)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171701

>From 28f8a29700af3b98e6c9c6c8c36db9d17009d864 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Sun, 16 Nov 2025 18:28:26 -0800
Subject: [PATCH] fix unittest

---
 clang-tools-extra/clang-doc/JSONGenerator.cpp | 2 ++
 .../clang-doc/assets/namespace-template.mustache  | 2 +-
 clang-tools-extra/test/clang-doc/json/concept.cpp | 2 +-
 clang-tools-extra/test/clang-doc/json/namespace.cpp   | 2 +-
 .../test/clang-doc/mustache-separate-namespace.cpp| 2 +-
 clang-tools-extra/test/clang-doc/namespace.cpp| 8 
 .../unittests/clang-doc/JSONGeneratorTest.cpp | 2 +-
 7 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp 
b/clang-tools-extra/clang-doc/JSONGenerator.cpp
index c47c65ddc2d73..83fa556782793 100644
--- a/clang-tools-extra/clang-doc/JSONGenerator.cpp
+++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp
@@ -610,6 +610,8 @@ static void serializeInfo(const VarInfo &I, json::Object 
&Obj,
 static void serializeInfo(const NamespaceInfo &I, json::Object &Obj,
   const std::optional RepositoryUrl) {
   serializeCommonAttributes(I, Obj, RepositoryUrl);
+  if (I.USR == GlobalNamespaceID)
+Obj["Name"] = "Global Namespace";
 
   if (!I.Children.Namespaces.empty())
 serializeArray(I.Children.Namespaces, Obj, "Namespaces",
diff --git a/clang-tools-extra/clang-doc/assets/namespace-template.mustache 
b/clang-tools-extra/clang-doc/assets/namespace-template.mustache
index 7fb66cadbb8e8..9450f9b4fc684 100644
--- a/clang-tools-extra/clang-doc/assets/namespace-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/namespace-template.mustache
@@ -44,7 +44,7 @@
 
 
 
-{{RecordType}} {{Name}}
+{{#RecordType}}{{RecordType}} {{/RecordType}}{{Name}}
 
 {{#HasEnums}}
 
diff --git a/clang-tools-extra/test/clang-doc/json/concept.cpp 
b/clang-tools-extra/test/clang-doc/json/concept.cpp
index 5d8c47eff0a16..f4c4ad3946d47 100644
--- a/clang-tools-extra/test/clang-doc/json/concept.cpp
+++ b/clang-tools-extra/test/clang-doc/json/concept.cpp
@@ -31,6 +31,6 @@ concept Incrementable = requires(T x) {
 // CHECK-NEXT:"USR": "{{[0-9A-F]*}}"
 // CHECK-NEXT:  }
 // CHECK-NEXT:],
-// CHECK:"Name": "",
+// CHECK:"Name": "Global Namespace",
 // CHECK:"USR": ""
 // CHECK:  }
diff --git a/clang-tools-extra/test/clang-doc/json/namespace.cpp 
b/clang-tools-extra/test/clang-doc/json/namespace.cpp
index dd7a9af9c82a0..c1370d9fe379f 100644
--- a/clang-tools-extra/test/clang-doc/json/namespace.cpp
+++ b/clang-tools-extra/test/clang-doc/json/namespace.cpp
@@ -75,7 +75,7 @@ typedef int MyTypedef;
 // CHECK-NEXT:   "HasEnums": true,
 // CHECK-NEXT:   "HasRecords": true,
 // CHECK-NEXT:   "InfoType": "namespace",
-// CHECK-NEXT:   "Name": "",
+// CHECK-NEXT:   "Name": "Global Namespace",
 // CHECK-NEXT:   "Namespaces": [
 // CHECK-NEXT: {
 // CHECK-NEXT:   "End": true,
diff --git a/clang-tools-extra/test/clang-doc/mustache-separate-namespace.cpp 
b/clang-tools-extra/test/clang-doc/mustache-separate-namespace.cpp
index cb0f9dc64bba6..7fbf51c4efd30 100644
--- a/clang-tools-extra/test/clang-doc/mustache-separate-namespace.cpp
+++ b/clang-tools-extra/test/clang-doc/mustache-separate-namespace.cpp
@@ -19,7 +19,7 @@ namespace MyNamespace {
 // CHECK-GLOBAL: 
 // CHECK-GLOBAL-NEXT:
 // CHECK-GLOBAL-NEXT:
-// CHECK-GLOBAL-NEXT: 
+// CHECK-GLOBAL-NEXT:Global Namespace
 // CHECK-GLOBAL-NEXT:
 // CHECK-GLOBAL-NEXT:
 // CHECK-GLOBAL-NEXT:
diff --git a/clang-tools-extra/test/clang-doc/namespace.cpp 
b/clang-tools-extra/test/clang-doc/namespace.cpp
index 029f9974e775e..8580ea6739a21 100644
--- a/clang-tools-extra/test/clang-doc/namespace.cpp
+++ b/clang-tools-extra/test/clang-doc/namespace.cpp
@@ -63,7 +63,7 @@ class AnonClass {};
 // MD-ANON-INDEX: ### anonFunction
 // MD-ANON-INDEX: *void anonFunction()*
 
-// HTML-ANON-INDEX:  @nonymous_namespace
+// HTML-ANON-INDEX: @nonymous_namespace
 // HTML-ANON-INDEX: Inner Classes
 // HTML-ANON-INDEX: 
 // HTML-ANON-INDEX: 
@@ -119,7 +119,7 @@ class ClassInNestedNamespace {};
 // MD-NESTED-INDEX: *void functionInNestedNamespace()*
 // MD-NESTED-INDEX: Function in NestedNamespace
 
-// HTML-NESTED-INDEX:  NestedNamespace
+// HTML-NESTED-INDEX: NestedNamespace
 // HTML-NESTED-INDEX: Inner Classes
 // HTML-NESTED-INDEX: 
 // HTML-NESTED-INDEX: 
@@ -145,7 +145,7 @@ class ClassInNestedNamespace {};
 // MD-PRIMARY-INDEX: *void functionInPrimaryNamespace()*
 // MD-PRIMARY-INDEX:  Function in PrimaryNamespace
 
-// HTML-PRIMA

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :window: Windows x64 Test Results

* 125587 tests passed
* 2775 tests skipped

All executed tests passed, but another part of the build **failed**. Click on a 
failure below to see the details.


[code=4294967295] bin/clang-move.exe

```
FAILED: [code=4294967295] bin/clang-move.exe
cmd.exe /C "cd . && 
C:\BuildTools\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe 
-E vs_link_exe 
--intdir=tools\clang\tools\extra\clang-move\tool\CMakeFiles\clang-move.dir 
--rc="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\rc.exe" 
--mt="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\mt.exe" 
--manifests  -- C:\clang\clang-msvc\bin\lld-link.exe /nologo 
tools\clang\tools\extra\clang-move\tool\CMakeFiles\clang-move.dir\ClangMove.cpp.obj
 
tools\clang\tools\extra\clang-move\tool\CMakeFiles\clang-move.dir\C_\_work\llvm-project\llvm-project\llvm\resources\windows_version_resource.rc.res
  /out:bin\clang-move.exe /implib:lib\clang-move.lib /pdb:bin\clang-move.pdb 
/version:0.0 /MANIFEST:NO /STACK:1000 /INCREMENTAL:NO /subsystem:console  
lib\LLVMSupport.lib  lib\LLVMFrontendOpenMP.lib  lib\clangAST.lib  
lib\clangASTMatchers.lib  lib\clangBasic.lib  lib\clangFormat.lib  
lib\clangFrontend.lib  lib\clangRewrite.lib  lib\clangSerialization.lib  
lib\clangTooling.lib  lib\clangToolingCore.lib  lib\clangMove.lib  
lib\clangTooling.lib  lib\clangFormat.lib  lib\clangToolingInclusions.lib  
lib\clangDependencyScanning.lib  lib\clangDriver.lib  lib\clangFrontend.lib  
lib\clangParse.lib  lib\clangSerialization.lib  lib\clangSema.lib  
lib\clangAPINotes.lib  lib\clangEdit.lib  lib\clangAnalysisLifetimeSafety.lib  
lib\clangSupport.lib  lib\clangOptions.lib  version.lib  
lib\LLVMWindowsDriver.lib  lib\LLVMOption.lib  lib\clangToolingCore.lib  
lib\clangRewrite.lib  lib\clangAnalysis.lib  lib\clangASTMatchers.lib  
lib\clangAST.lib  lib\clangLex.lib  lib\clangBasic.lib  
lib\LLVMFrontendOpenMP.lib  lib\LLVMScalarOpts.lib  
lib\LLVMAggressiveInstCombine.lib  lib\LLVMInstCombine.lib  
lib\LLVMFrontendOffloading.lib  lib\LLVMTransformUtils.lib  
lib\LLVMObjectYAML.lib  lib\LLVMFrontendAtomic.lib  lib\LLVMAnalysis.lib  
lib\LLVMFrontendHLSL.lib  lib\LLVMProfileData.lib  lib\LLVMSymbolize.lib  
lib\LLVMDebugInfoGSYM.lib  lib\LLVMDebugInfoPDB.lib  
lib\LLVMDebugInfoCodeView.lib  "C:\BuildTools\DIA SDK\lib\amd64\diaguids.lib"  
lib\LLVMDebugInfoMSF.lib  lib\LLVMDebugInfoBTF.lib  lib\LLVMDebugInfoDWARF.lib  
lib\LLVMObject.lib  lib\LLVMMCParser.lib  lib\LLVMMC.lib  
lib\LLVMDebugInfoDWARFLowLevel.lib  lib\LLVMIRReader.lib  lib\LLVMBitReader.lib 
 lib\LLVMAsmParser.lib  lib\LLVMCore.lib  lib\LLVMRemarks.lib  
lib\LLVMBitstreamReader.lib  lib\LLVMTextAPI.lib  lib\LLVMBinaryFormat.lib  
lib\LLVMFrontendDirective.lib  lib\LLVMTargetParser.lib  lib\LLVMSupport.lib  
psapi.lib  shell32.lib  ole32.lib  uuid.lib  advapi32.lib  ws2_32.lib  
ntdll.lib  delayimp.lib  -delayload:shell32.dll  -delayload:ole32.dll  
lib\LLVMDemangle.lib  kernel32.lib user32.lib gdi32.lib winspool.lib 
shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
LINK: command "C:\clang\clang-msvc\bin\lld-link.exe /nologo 
tools\clang\tools\extra\clang-move\tool\CMakeFiles\clang-move.dir\ClangMove.cpp.obj
 
tools\clang\tools\extra\clang-move\tool\CMakeFiles\clang-move.dir\C_\_work\llvm-project\llvm-project\llvm\resources\windows_version_resource.rc.res
 /out:bin\clang-move.exe /implib:lib\clang-move.lib /pdb:bin\clang-move.pdb 
/version:0.0 /MANIFEST:NO /STACK:1000 /INCREMENTAL:NO /subsystem:console 
lib\LLVMSupport.lib lib\LLVMFrontendOpenMP.lib lib\clangAST.lib 
lib\clangASTMatchers.lib lib\clangBasic.lib lib\clangFormat.lib 
lib\clangFrontend.lib lib\clangRewrite.lib lib\clangSerialization.lib 
lib\clangTooling.lib lib\clangToolingCore.lib lib\clangMove.lib 
lib\clangTooling.lib lib\clangFormat.lib lib\clangToolingInclusions.lib 
lib\clangDependencyScanning.lib lib\clangDriver.lib lib\clangFrontend.lib 
lib\clangParse.lib lib\clangSerialization.lib lib\clangSema.lib 
lib\clangAPINotes.lib lib\clangEdit.lib lib\clangAnalysisLifetimeSafety.lib 
lib\clangSupport.lib lib\clangOptions.lib version.lib lib\LLVMWindowsDriver.lib 
lib\LLVMOption.lib lib\clangToolingCore.lib lib\clangRewrite.lib 
lib\clangAnalysis.lib lib\clangASTMatchers.lib lib\clangAST.lib 
lib\clangLex.lib lib\clangBasic.lib lib\LLVMFrontendOpenMP.lib 
lib\LLVMScalarOpts.lib lib\LLVMAggressiveInstCombine.lib 
lib\LLVMInstCombine.lib lib\LLVMFrontendOffloading.lib 
lib\LLVMTransformUtils.lib lib\LLVMObjectYAML.lib lib\LLVMFrontendAtomic.lib 
lib\LLVMAnalysis.lib lib\LLVMFrontendHLSL.lib lib\LLVMProfileData.lib 
lib\LLVMSymbolize.lib lib\LLVMDebugInfoGSYM.lib lib\LLVMDebugInfoPDB.lib 
lib\LLVMDebugInfoCodeView.lib C:\BuildTools\DIA SDK\lib\amd64\diaguids.lib 
lib\LLVMDebugInfoMSF.lib lib\LLVMDebugInfoBTF.lib lib\LLVMDebugInfoDWARF.lib 
lib\LLVMObject.lib lib\LLVMMCParser.lib lib\LLVMMC.lib 
lib\LLVMDebugInfoDWARFLow

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Create a partial for the navbar (PR #171669)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171669

>From 1d32cb2b5eded74705d269fa8140982459ae0e70 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Thu, 4 Dec 2025 12:33:18 -0800
Subject: [PATCH] [clang-doc] Create a partial for the navbar

Move navbar section to its own template to ensure consistency across
templates
---
 clang-tools-extra/clang-doc/HTMLGenerator.cpp |  5 -
 .../clang-doc/assets/class-template.mustache  | 20 +--
 .../assets/namespace-template.mustache| 20 +--
 .../clang-doc/assets/navbar-template.mustache | 19 ++
 clang-tools-extra/clang-doc/support/Utils.cpp |  3 +++
 .../clang-doc/tool/CMakeLists.txt |  1 +
 6 files changed, 29 insertions(+), 39 deletions(-)
 create mode 100644 clang-tools-extra/clang-doc/assets/navbar-template.mustache

diff --git a/clang-tools-extra/clang-doc/HTMLGenerator.cpp 
b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
index 19018f2cf845d..77b287476423e 100644
--- a/clang-tools-extra/clang-doc/HTMLGenerator.cpp
+++ b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
@@ -68,11 +68,14 @@ Error HTMLGenerator::setupTemplateFiles(const 
ClangDocContext &CDCtx) {
   ConvertToNative(CDCtx.MustacheTemplates.lookup("enum-template"));
   std::string HeadFilePath =
   ConvertToNative(CDCtx.MustacheTemplates.lookup("head-template"));
+  std::string NavbarFilePath =
+  ConvertToNative(CDCtx.MustacheTemplates.lookup("navbar-template"));
   std::vector> Partials = {
   {"Comments", CommentFilePath},
   {"FunctionPartial", FunctionFilePath},
   {"EnumPartial", EnumFilePath},
-  {"HeadPartial", HeadFilePath}};
+  {"HeadPartial", HeadFilePath},
+  {"NavbarPartial", NavbarFilePath}};
 
   if (Error Err = setupTemplate(NamespaceTemplate, NamespaceFilePath, 
Partials))
 return Err;
diff --git a/clang-tools-extra/clang-doc/assets/class-template.mustache 
b/clang-tools-extra/clang-doc/assets/class-template.mustache
index 9c5019510b43c..fcd923cd9db93 100644
--- a/clang-tools-extra/clang-doc/assets/class-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/class-template.mustache
@@ -9,25 +9,7 @@
 
 {{>HeadPartial}}
 
-
-
-{{#ProjectName}}
-
-{{ProjectName}}
-
-{{/ProjectName}}
-
-
-
-Namespace
-
-
-Class
-
-
-
-
-
+{{>NavbarPartial}}
 
 
 
diff --git a/clang-tools-extra/clang-doc/assets/namespace-template.mustache 
b/clang-tools-extra/clang-doc/assets/namespace-template.mustache
index f386eb2e6a581..5c0d2fb14d3c9 100644
--- a/clang-tools-extra/clang-doc/assets/namespace-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/namespace-template.mustache
@@ -9,25 +9,7 @@
 
 {{>HeadPartial}}
 
-
-
-{{#ProjectName}}
-
-{{ProjectName}}
-
-{{/ProjectName}}
-
-
-
-Namespace
-
-
-Class
-
-
-
-
-
+{{>NavbarPartial}}
 
 
 
diff --git a/clang-tools-extra/clang-doc/assets/navbar-template.mustache 
b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
new file mode 100644
index 0..178d147a556d3
--- /dev/null
+++ b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
@@ -0,0 +1,19 @@
+
+
+{{#ProjectName}}
+
+{{ProjectName}}
+
+{{/ProjectName}}
+
+
+
+Namespace
+
+
+Class
+
+
+
+
+
diff --git a/clang-tools-extra/clang-doc/support/Utils.cpp 
b/clang-tools-extra/clang-doc/support/Utils.cpp
index 50e849dc26c79..d0fd6f45b8a02 100644
--- a/clang-tools-extra/clang-doc/support/Utils.cpp
+++ b/clang-tools-extra/clang-doc/support/Utils.cpp
@@ -58,6 +58,8 @@ void getHtmlFiles(StringRef AssetsPath, 
clang::doc::ClangDocContext &CDCtx) {
   appendPathPosix(AssetsPath, "comment-template.mustache");
   SmallString<128> HeadTemplate =
   appendPathPosix(AssetsPath, "head-template.mustache");
+  SmallString<128> NavbarTemplate =
+  appendPathPosix(AssetsPath, "navbar-template.mustache");
 
   CDCtx.MustacheTemplates.insert(
   {"namespace-template", NamespaceTemplate.c_str()});
@@ -67,4 +69,5 @@ void getHtmlFiles(StringRef AssetsPath, 
clang::doc::ClangDocContext &CDCtx) {
   {"function-template", FunctionTemplate.c_str()});
   CDCtx.MustacheTemplates.insert({"comment-template", 
CommentTemplate.c_str()});
   CDCtx.MustacheTemplates.insert
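
The patch above registers a `NavbarPartial` so that both page templates can pull in the same markup through `{{>NavbarPartial}}`. As a rough illustration of what a partial reference does, the sketch below expands `{{>Name}}` tags by plain string substitution. It is not how clang-doc or LLVM's Mustache support is implemented; `expandPartials` and the sample strings in the usage note are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replace each {{>Name}} tag with the text registered
// for that partial. Unknown partials expand to the empty string.
std::string expandPartials(const std::string &Template,
                           const std::map<std::string, std::string> &Partials) {
  std::string Out;
  std::size_t Pos = 0;
  while (true) {
    std::size_t Open = Template.find("{{>", Pos);
    if (Open == std::string::npos) {
      // No more partial tags: copy the remainder and stop.
      Out.append(Template, Pos, std::string::npos);
      break;
    }
    std::size_t Close = Template.find("}}", Open);
    if (Close == std::string::npos) {
      // Unterminated tag: copy the remainder verbatim and stop.
      Out.append(Template, Pos, std::string::npos);
      break;
    }
    // Copy the text before the tag, then the partial body (if registered).
    Out.append(Template, Pos, Open - Pos);
    std::string Name = Template.substr(Open + 3, Close - (Open + 3));
    auto It = Partials.find(Name);
    if (It != Partials.end())
      Out += It->second;
    Pos = Close + 2;
  }
  return Out;
}
```

Calling `expandPartials("a{{>NavbarPartial}}b", {{"NavbarPartial", "<nav/>"}})` yields the template text with the tag replaced by the registered body, which is the property the patch relies on to keep the navbar identical across the class and namespace pages.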

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add JSON bools for parents, vparents and test (PR #171699)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171699

>From 45d8e8a1662089056d2293c631b617ab22986f60 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Sun, 16 Nov 2025 12:43:44 -0800
Subject: [PATCH] fix unittest

---
 clang-tools-extra/clang-doc/JSONGenerator.cpp |   8 +-
 .../test/clang-doc/json/inheritance.cpp   | 111 ++
 .../unittests/clang-doc/JSONGeneratorTest.cpp |   2 +
 3 files changed, 119 insertions(+), 2 deletions(-)
 create mode 100644 clang-tools-extra/test/clang-doc/json/inheritance.cpp

diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp 
b/clang-tools-extra/clang-doc/JSONGenerator.cpp
index 0253ebf5335da..c65c3dc759c3e 100644
--- a/clang-tools-extra/clang-doc/JSONGenerator.cpp
+++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp
@@ -572,12 +572,16 @@ static void serializeInfo(const RecordInfo &I, 
json::Object &Obj,
   serializeInfo(Base, BaseObj, RepositoryUrl);
 });
 
-  if (!I.Parents.empty())
+  if (!I.Parents.empty()) {
 serializeArray(I.Parents, Obj, "Parents", SerializeReferenceLambda);
+Obj["HasParents"] = true;
+  }
 
-  if (!I.VirtualParents.empty())
+  if (!I.VirtualParents.empty()) {
 serializeArray(I.VirtualParents, Obj, "VirtualParents",
SerializeReferenceLambda);
+Obj["HasVirtualParents"] = true;
+  }
 
   if (I.Template)
 serializeInfo(I.Template.value(), Obj);
diff --git a/clang-tools-extra/test/clang-doc/json/inheritance.cpp 
b/clang-tools-extra/test/clang-doc/json/inheritance.cpp
new file mode 100644
index 0..53476da870c61
--- /dev/null
+++ b/clang-tools-extra/test/clang-doc/json/inheritance.cpp
@@ -0,0 +1,111 @@
+// RUN: rm -rf %t && mkdir -p %t
+// RUN: clang-doc --output=%t --format=json --executor=standalone %s
+// RUN: FileCheck %s < %t/json/GlobalNamespace/_ZTV7MyClass.json
+
+class Virtual {};
+class Foo : virtual Virtual {};
+class Bar : Foo {};
+class Fizz : virtual Virtual {};
+class Buzz : Fizz {};
+
+class MyClass : Bar, Buzz {};
+
+// CHECK:   "Bases": [
+// CHECK-NEXT:{
+// CHECK-NEXT: "Access": "private",
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": true,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": false,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Bar",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:},
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Access": "private",
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": false,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": false,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Foo",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:},
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Access": "private",
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": false,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": true,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Virtual",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:},
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Access": "private",
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": true,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": false,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Buzz",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:},
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Access": "private",
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": false,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": false,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Fizz",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:},
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Access": "private",
+// CHECK-NEXT:  "End": true,
+// CHECK-NEXT:  "InfoType": "record",
+// CHECK-NEXT:  "IsParent": false,
+// CHECK-NEXT:  "IsTypedef": false,
+// CHECK-NEXT:  "IsVirtual": true,
+// CHECK-NEXT:  "MangledName": "",
+// CHECK-NEXT:  "Name": "Virtual",
+// CHECK-NEXT:  "Path": "GlobalNamespace",
+// CHECK-NEXT:  "TagType": "struct",
+// CHECK-NEXT:  "USR": "{{[0-9A-F]*}}"
+// CHECK-NEXT:}
+// CHECK-NEXT:  ],
+// CHECK:   "Parents": [
+// CHECK-NEXT:{
+// CHECK-NEXT:  "Name": "Bar",
+// CHECK-NEXT:  "Path"

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Serialize private members in JSON (PR #171700)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171700

>From 0be4b3662d1b660562cc37d0d155bbc04036576f Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Sun, 16 Nov 2025 13:38:47 -0800
Subject: [PATCH] [clang-doc] Serialize private members in JSON

---
 clang-tools-extra/clang-doc/JSONGenerator.cpp   |  6 ++
 clang-tools-extra/test/clang-doc/json/class.cpp | 10 ++
 2 files changed, 16 insertions(+)

diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp 
b/clang-tools-extra/clang-doc/JSONGenerator.cpp
index c65c3dc759c3e..c47c65ddc2d73 100644
--- a/clang-tools-extra/clang-doc/JSONGenerator.cpp
+++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp
@@ -545,6 +545,8 @@ static void serializeInfo(const RecordInfo &I, json::Object 
&Obj,
 json::Array &PubMembersArrayRef = *PublicMembersArray.getAsArray();
 json::Value ProtectedMembersArray = Array();
 json::Array &ProtMembersArrayRef = *ProtectedMembersArray.getAsArray();
+json::Value PrivateMembersArray = Array();
+json::Array &PrivateMembersArrayRef = *PrivateMembersArray.getAsArray();
 
 for (const MemberTypeInfo &Member : I.Members) {
   json::Value MemberVal = Object();
@@ -557,12 +559,16 @@ static void serializeInfo(const RecordInfo &I, 
json::Object &Obj,
 PubMembersArrayRef.push_back(MemberVal);
   else if (Member.Access == AccessSpecifier::AS_protected)
 ProtMembersArrayRef.push_back(MemberVal);
+  else if (Member.Access == AccessSpecifier::AS_private)
+PrivateMembersArrayRef.push_back(MemberVal);
 }
 
 if (!PubMembersArrayRef.empty())
   insertArray(Obj, PublicMembersArray, "PublicMembers");
 if (!ProtMembersArrayRef.empty())
   Obj["ProtectedMembers"] = ProtectedMembersArray;
+if (!PrivateMembersArrayRef.empty())
+  insertArray(Obj, PrivateMembersArray, "PrivateMembers");
   }
 
   if (!I.Bases.empty())
diff --git a/clang-tools-extra/test/clang-doc/json/class.cpp 
b/clang-tools-extra/test/clang-doc/json/class.cpp
index 9d3102a11db9d..d57e8a990c3fe 100644
--- a/clang-tools-extra/test/clang-doc/json/class.cpp
+++ b/clang-tools-extra/test/clang-doc/json/class.cpp
@@ -30,6 +30,8 @@ struct MyClass {
   int protectedMethod();
 
   int ProtectedField;
+private:
+  int PrivateField;
 };
 
 // CHECK:   {
@@ -122,6 +124,7 @@ struct MyClass {
 // CHECK-NEXT:  }
 // CHECK-NEXT:],
 // CHECK-NEXT:"HasEnums": true,
+// CHECK-NEXT:"HasPrivateMembers": true,
 // CHECK-NEXT:"HasPublicFunctions": true,
 // CHECK-NEXT:"HasPublicMembers": true,
 // CHECK-NEXT:"HasRecords": true,
@@ -137,6 +140,13 @@ struct MyClass {
 // CHECK-NEXT:  "GlobalNamespace"
 // CHECK-NEXT:],
 // CHECK-NEXT:   "Path": "GlobalNamespace",
+// CHECK-NEXT:   "PrivateMembers": [
+// CHECK-NEXT: {
+// CHECK-NEXT:   "IsStatic": false,
+// CHECK-NEXT:   "Name": "PrivateField",
+// CHECK-NEXT:   "Type": "int"
+// CHECK-NEXT: }
+// CHECK-NEXT:   ],
 // CHECK-NEXT:   "ProtectedFunctions": [
 // CHECK-NEXT: {
 // CHECK-NEXT:   "InfoType": "function",
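
The JSONGenerator change above sorts each member into an array chosen by its access specifier and only emits an array when it is non-empty. A minimal standalone sketch of that grouping follows, assuming a plain `std::map` in place of `json::Object`; `Access`, `Member`, `FakeObject`, and `groupMembersByAccess` are hypothetical names used only for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

enum class Access { Public, Protected, Private };

struct Member {
  std::string Name;
  Access Acc;
};

// Hypothetical stand-in for the JSON object: key -> list of member names.
using FakeObject = std::map<std::string, std::vector<std::string>>;

// Group members by access. A key is created only when at least one member
// lands in it, mirroring the "if (!...empty())" guards in the patch.
FakeObject groupMembersByAccess(const std::vector<Member> &Members) {
  FakeObject Obj;
  for (const Member &M : Members) {
    switch (M.Acc) {
    case Access::Public:
      Obj["PublicMembers"].push_back(M.Name);
      break;
    case Access::Protected:
      Obj["ProtectedMembers"].push_back(M.Name);
      break;
    case Access::Private:
      Obj["PrivateMembers"].push_back(M.Name);
      break;
    }
  }
  return Obj;
}
```

With one public and one private member, the result has `PublicMembers` and `PrivateMembers` keys and no `ProtectedMembers` key at all, which matches the shape the updated `class.cpp` test checks for.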

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RFC][LLVM][Clang] Add LLVM plugin hook for back-ends (PR #170846)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits



@@ -1265,6 +1259,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
 void EmitAssemblyHelper::RunCodegenPipeline(
 BackendAction Action, std::unique_ptr &OS,
 std::unique_ptr &DwoOS) {
+  // Invoke pre-codegen callback from plugin, which might want to take over the
+  // entire code generation itself.
+  for (auto &Plugin : Plugins) {
+CodeGenFileType CGFT = getCodeGenFileType(Action);

aengelke wrote:

Thanks for catching this. To keep things simple and manageable, I'm going to 
split the Clang part into a separate PR with tests.

https://github.com/llvm/llvm-project/pull/170846
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits


https://github.com/jofrn edited https://github.com/llvm/llvm-project/pull/148900
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits



@@ -189,3 +189,69 @@ define void @pointer_cmpxchg_expand6(ptr addrspace(1) 
%ptr, ptr addrspace(2) %v)
   ret void
 }
 
+define <2 x ptr> @atomic_vec2_ptr_align(ptr %x) nounwind {
+; CHECK-LABEL: define <2 x ptr> @atomic_vec2_ptr_align(
+; CHECK-SAME: ptr [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:[[TMP1:%.*]] = call i128 @__atomic_load_16(ptr [[X]], i32 2)
+; CHECK-NEXT:[[TMP6:%.*]] = bitcast i128 [[TMP1]] to <2 x i64>
+; CHECK-NEXT:[[TMP7:%.*]] = inttoptr <2 x i64> [[TMP6]] to <2 x ptr>
+; CHECK-NEXT:ret <2 x ptr> [[TMP7]]
+;
+  %ret = load atomic <2 x ptr>, ptr %x acquire, align 16

jofrn wrote:

I have done so already. Take a look.

https://github.com/llvm/llvm-project/pull/148900
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [Clang] Load pass plugins before parsing LLVM options (PR #171868)

2025-12-11 Thread Stefan Gränitz via llvm-branch-commits


weliveindetail wrote:

Nice! LGTM at first glance

> I'm not sure whether using the LLVM Bye.so in the tests is possible this
way (e.g., if Clang is built standalone).

Testing against examples is fine, but almost never exercised on build bots. We 
don't expose `LLVM_BUILD_EXAMPLES` as export, so clang standalone won't see it. 
Also, I think `%llvmshlibdir` is broken in standalone builds. You can use 
`config.llvm_libs_dir`, but IIRC there is no substitution yet.

https://github.com/llvm/llvm-project/pull/171868
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] InstCombine: Fold ldexp with constant exponent to fmul (PR #171731)

2025-12-11 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/171731

>From 44aa41c925cb7a4cc2acbaa8c858d44d6c6f3bf8 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 23 Aug 2023 20:57:49 -0400
Subject: [PATCH] InstCombine: Fold ldexp with constant exponent to fmul

If we can represent this with an fmul, prefer it as a canonical
form. More optimizations understand fmul, and it allows contraction
to fma.
---
 .../InstCombine/InstCombineCalls.cpp  | 13 
 .../InstCombine/fold-select-fmul-if-zero.ll   | 10 +--
 llvm/test/Transforms/InstCombine/ldexp.ll | 62 +++
 3 files changed, 51 insertions(+), 34 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp 
b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index 85602a5a7575a..b498bafb3caaa 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -3089,6 +3089,19 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst 
&CI) {
 // exponent. Also could broaden sign check to cover == 0 case.
 Value *Src = II->getArgOperand(0);
 Value *Exp = II->getArgOperand(1);
+
+uint64_t ConstExp;
+if (match(Exp, m_ConstantInt(ConstExp))) {
+  // ldexp(x, K) -> fmul x, 2^K
+  const fltSemantics &FPTy =
+  Src->getType()->getScalarType()->getFltSemantics();
+  Constant *FPConst =
+  ConstantFP::get(Src->getType(), scalbn(APFloat::getOne(FPTy),
+ static_cast(ConstExp),
+ 
APFloat::rmNearestTiesToEven));
+  return BinaryOperator::CreateFMulFMF(Src, FPConst, II);
+}
+
 Value *InnerSrc;
 Value *InnerExp;
 if (match(Src, m_OneUse(m_Intrinsic(
diff --git a/llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll 
b/llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll
index 1ba7005f99e3d..4495b0f26042b 100644
--- a/llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll
+++ b/llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll
@@ -49,10 +49,7 @@ define float @fmul_by_32_if_0_oeq_zero_f32(float %x) {
 
 define float @ldexp_by_5_if_0_oeq_zero_f32(float %x) {
 ; CHECK-LABEL: @ldexp_by_5_if_0_oeq_zero_f32(
-; CHECK-NEXT:[[X_IS_ZERO:%.*]] = fcmp oeq float [[X:%.*]], 0.00e+00
-; CHECK-NEXT:[[SCALED_X:%.*]] = call float @llvm.ldexp.f32.i32(float 
[[X]], i32 5)
-; CHECK-NEXT:[[SCALED_IF_DENORMAL:%.*]] = select i1 [[X_IS_ZERO]], float 
[[SCALED_X]], float [[X]]
-; CHECK-NEXT:ret float [[SCALED_IF_DENORMAL]]
+; CHECK-NEXT:ret float [[X:%.*]]
 ;
   %x.is.zero = fcmp oeq float %x, 0.0
   %scaled.x = call float @llvm.ldexp.f32.i32(float %x, i32 5)
@@ -62,10 +59,7 @@ define float @ldexp_by_5_if_0_oeq_zero_f32(float %x) {
 
 define <2 x float> @ldexp_by_5_if_0_oeq_zero_v2f32(<2 x float> %x) {
 ; CHECK-LABEL: @ldexp_by_5_if_0_oeq_zero_v2f32(
-; CHECK-NEXT:[[X_IS_ZERO:%.*]] = fcmp oeq <2 x float> [[X:%.*]], 
zeroinitializer
-; CHECK-NEXT:[[SCALED_X:%.*]] = call <2 x float> 
@llvm.ldexp.v2f32.v2i32(<2 x float> [[X]], <2 x i32> splat (i32 5))
-; CHECK-NEXT:[[SCALED_IF_DENORMAL:%.*]] = select <2 x i1> [[X_IS_ZERO]], 
<2 x float> [[SCALED_X]], <2 x float> [[X]]
-; CHECK-NEXT:ret <2 x float> [[SCALED_IF_DENORMAL]]
+; CHECK-NEXT:ret <2 x float> [[SCALED_IF_DENORMAL:%.*]]
 ;
   %x.is.zero = fcmp oeq <2 x float> %x, zeroinitializer
   %scaled.x = call <2 x float> @llvm.ldexp.v2f32.v2i32(<2 x float> %x, <2 x 
i32> )
diff --git a/llvm/test/Transforms/InstCombine/ldexp.ll 
b/llvm/test/Transforms/InstCombine/ldexp.ll
index 8908d476b4a2c..43e80b1a1e588 100644
--- a/llvm/test/Transforms/InstCombine/ldexp.ll
+++ b/llvm/test/Transforms/InstCombine/ldexp.ll
@@ -444,7 +444,7 @@ define float @ldexp_ldexp_different_exp_type(float %x, i32 
%a, i64 %b) {
 define float @ldexp_ldexp_constants(float %x) {
 ; CHECK-LABEL: define float @ldexp_ldexp_constants
 ; CHECK-SAME: (float [[X:%.*]]) {
-; CHECK-NEXT:[[LDEXP1:%.*]] = call reassoc float @llvm.ldexp.f32.i32(float 
[[X]], i32 32)
+; CHECK-NEXT:[[LDEXP1:%.*]] = fmul reassoc float [[X]], 0x41F0000000000000
 ; CHECK-NEXT:ret float [[LDEXP1]]
 ;
   %ldexp0 = call reassoc float @llvm.ldexp.f32.i32(float %x, i32 8)
@@ -455,7 +455,7 @@ define float @ldexp_ldexp_constants(float %x) {
 define float @ldexp_ldexp_constants_nsz(float %x) {
 ; CHECK-LABEL: define float @ldexp_ldexp_constants_nsz
 ; CHECK-SAME: (float [[X:%.*]]) {
-; CHECK-NEXT:[[LDEXP1:%.*]] = call reassoc nsz float 
@llvm.ldexp.f32.i32(float [[X]], i32 32)
+; CHECK-NEXT:[[LDEXP1:%.*]] = fmul reassoc nsz float [[X]], 
0x41F0000000000000
 ; CHECK-NEXT:ret float [[LDEXP1]]
 ;
   %ldexp0 = call reassoc nsz float @llvm.ldexp.f32.i32(float %x, i32 8)
@@ -466,7 +466,7 @@ define float @ldexp_ldexp_constants_nsz(float %x) {
 define float @ldexp_ldexp_constants_nsz0(float %x) {
 ; CHECK-LABEL: define float @ldexp_ldexp_consta
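
The fold in this patch rests on the identity ldexp(x, K) = x * 2^K. The sketch below shows the same idea with the C++ standard library rather than LLVM's `APFloat`: build the constant 2^K once, then multiply. `scaleByPowerOfTwo` is a hypothetical helper, and the equality with `std::ldexp` is only claimed here for exponents where 2^K is itself a finite value of the type.

```cpp
#include <cmath>

// Hypothetical helper mirroring the "ldexp(x, K) -> fmul x, 2^K" comment
// in the patch: compute the power-of-two constant, then multiply by it.
float scaleByPowerOfTwo(float X, int K) {
  const float Scale = std::ldexp(1.0f, K); // 2^K as a float
  return X * Scale;
}
```

For example, `scaleByPowerOfTwo(3.0f, 5)` and `std::ldexp(3.0f, 5)` both give 96, and the `i32 32` case in the test corresponds to multiplying by 2^32 = 4294967296.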

[llvm-branch-commits] [llvm] release/21.x: [Mips] Support "$sp" named register (#136821) (PR #171308)

2025-12-11 Thread YunQiang Su via llvm-branch-commits


wzssyqa wrote:

This patch has been reverted.

https://github.com/llvm/llvm-project/pull/171308
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/21.x: [Mips] Support "$sp" named register (#136821) (PR #171308)

2025-12-11 Thread YunQiang Su via llvm-branch-commits


https://github.com/wzssyqa closed 
https://github.com/llvm/llvm-project/pull/171308
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread via llvm-branch-commits

dyung wrote:

> No, this specific code path has always produced incorrect output, even in 
> past versions of LLVM. I took care to keep the scope of the change as small 
> as possible to reduce the risk for this backport.

Given that it is not fixing a regression and not a fix with a broad impact, it 
is not really a candidate for inclusion on the release branch this late in the 
release cycle. The 22.x release branch will be created in about a month and the 
fix will be there. That being said, I will discuss with the other release 
managers to see what they think.

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread via llvm-branch-commits


dyung wrote:

When was this regression introduced?

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang-tools-extra] [clang-tidy] add abseil-unchecked-statusor-access (PR #171188)

2025-12-11 Thread Jan Voung via llvm-branch-commits



@@ -0,0 +1,377 @@
+.. title:: clang-tidy - abseil-unchecked-statusor-access
+
+abseil-unchecked-statusor-access
+
+
+This check identifies unsafe accesses to values contained in
+``absl::StatusOr`` objects. Below we will refer to this type as
+``StatusOr``.
+
+An access to the value of an ``StatusOr`` occurs when one of its
+``value``, ``operator*``, or ``operator->`` member functions is invoked.
+To align with common misconceptions, the check considers these member
+functions as equivalent, even though there are subtle differences
+related to exceptions vs. undefined behavior.
+
+An access to the value of a ``StatusOr`` is considered safe if and
+only if code in the local scope (e.g. function body) ensures that the
+status of the ``StatusOr`` is ok in all possible execution paths that
+can reach the access. That should happen either through an explicit
+check, using the ``StatusOr::ok`` member function, or by constructing
+the ``StatusOr`` in a way that shows that its status is unambiguously
+ok (e.g. by passing a value to its constructor).
+
+Below we list some examples of safe and unsafe ``StatusOr`` access
+patterns.
+
+Note: If the check isn’t behaving as you would have expected on a code
+snippet, please `report it `__.
+
+False negatives
+---
+
+This check generally does **not** generate false negatives. If it cannot

jvoung wrote:

I see, the second sentence is describing how the checker works:
 `!isSafeUnwrap(...)` then issue warning that it is unsafe.

Whether that "unsafe" is true or not made me think more of the "false positive" 
category, so it was a bit confusing.

Perhaps you can clarify that and "note that if it is deemed unsafe, it could 
still be safe (false positive)."... or something to that effect.
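
To make the safe/unsafe distinction in the quoted documentation concrete, here is a minimal sketch. It does not use Abseil: `MiniStatusOr` is a hypothetical, heavily simplified stand-in for `absl::StatusOr<int>`, and `valueOrDefault` and `uncheckedValue` are illustrative names. Whether the real check flags a given function depends on its actual analysis, which this sketch does not reproduce.

```cpp
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-in for absl::StatusOr<int>.
class MiniStatusOr {
public:
  explicit MiniStatusOr(int V) : Ok(true), Val(V) {}
  explicit MiniStatusOr(std::string Error)
      : Ok(false), Val(0), Err(std::move(Error)) {}

  bool ok() const { return Ok; }
  const std::string &error() const { return Err; }
  int operator*() const { return Val; } // precondition: ok()

private:
  bool Ok;
  int Val;
  std::string Err;
};

// Safe by the documented rule: every path that reaches the access has
// gone through an explicit ok() check in the local scope.
int valueOrDefault(const MiniStatusOr &S, int Default) {
  if (!S.ok())
    return Default;
  return *S;
}

// Unsafe by the documented rule: nothing in this function body establishes
// ok() before the access, even if every caller happens to check it first.
int uncheckedValue(const MiniStatusOr &S) { return *S; }
```

The second function is the kind of case the comment above is about: an access deemed unsafe because the local scope cannot prove `ok()`, even though a particular program might only ever call it with an ok value (a false positive).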


https://github.com/llvm/llvm-project/pull/171188
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang-tools-extra] [clang-tidy] add abseil-unchecked-statusor-access (PR #171188)

2025-12-11 Thread Jan Voung via llvm-branch-commits


https://github.com/jvoung approved this pull request.


https://github.com/llvm/llvm-project/pull/171188
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [Clang] Load pass plugins before parsing LLVM options (PR #171868)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:



@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Alexis Engelke (aengelke)


Changes

This permits pass plugins to use llvm::cl::opt. Additionally, add a test
of -fpass-plugin, which was previously not tested at all.

I'm not sure whether using the LLVM Bye.so in the tests is possible this
way (e.g., if Clang is built standalone).


---

Patch is 53.61 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171868.diff


8 Files Affected:

- (modified) clang/include/clang/Basic/CodeGenOptions.h (+6-2) 
- (modified) clang/include/clang/Options/Options.td (+1-1) 
- (modified) clang/lib/CodeGen/BackendUtil.cpp (+3-10) 
- (modified) clang/lib/Frontend/CompilerInstance.cpp (+10) 
- (modified) clang/test/CMakeLists.txt (+6) 
- (added) clang/test/CodeGen/pass-plugins.c (+10) 
- (modified) clang/test/lit.cfg.py (+2) 
- (modified) clang/test/lit.site.cfg.py.in (+1) 


https://github.com/llvm/llvm-project/pull/171868

[llvm-branch-commits] [Clang] Load pass plugins before parsing LLVM options (PR #171868)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits


https://github.com/aengelke created 
https://github.com/llvm/llvm-project/pull/171868

This permits pass plugins to use llvm::cl::opt. Additionally, add a test
of -fpass-plugin, which was previously not tested at all.

I'm not sure whether using the LLVM Bye.so in the tests is possible this
way (e.g., if Clang is built standalone).
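
A minimal sketch of why the load order matters, assuming that a plugin registers its options as a side effect of being loaded. `OptionRegistry`, `loadPlugin`, and the flag name are hypothetical stand-ins for illustration only; they are not the llvm::cl or Clang API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a global option registry; this is not the llvm::cl API.
struct OptionRegistry {
  std::set<std::string> Known;

  // Returns every argument that does not name a registered option.
  std::vector<std::string> parse(const std::vector<std::string> &Args) const {
    std::vector<std::string> Unknown;
    for (const std::string &A : Args)
      if (Known.count(A) == 0)
        Unknown.push_back(A);
    return Unknown;
  }
};

// Hypothetical plugin load: registering the plugin's option is a side effect of loading.
void loadPlugin(OptionRegistry &R) { R.Known.insert("-my-plugin-flag"); }

// Old order: options are parsed before the plugin has registered anything.
std::vector<std::string> parseThenLoad(const std::vector<std::string> &Args) {
  OptionRegistry R;
  std::vector<std::string> Unknown = R.parse(Args);
  loadPlugin(R);
  return Unknown;
}

// New order: the plugin is loaded first, so its option is known to the parser.
std::vector<std::string> loadThenParse(const std::vector<std::string> &Args) {
  OptionRegistry R;
  loadPlugin(R);
  return R.parse(Args);
}
```

In this model, parsing first reports the plugin's flag as unknown, while loading first accepts it.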




[llvm-branch-commits] [clang] [llvm] [LLVM] Add plugin hook for back-ends (PR #170846)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits


https://github.com/aengelke edited 
https://github.com/llvm/llvm-project/pull/170846

[llvm-branch-commits] [clang] [llvm] [LLVM] Add plugin hook for back-ends (PR #170846)

2025-12-11 Thread Alexis Engelke via llvm-branch-commits


https://github.com/aengelke updated 
https://github.com/llvm/llvm-project/pull/170846

From 89e9b4a5863e957971a3febc95862c1d5fe43f28 Mon Sep 17 00:00:00 2001
From: Alexis Engelke 
Date: Fri, 5 Dec 2025 12:33:55 +
Subject: [PATCH 1/2] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5-bogner
---
 clang/lib/CodeGen/BackendUtil.cpp | 33 +++
 llvm/include/llvm/Passes/PassPlugin.h | 31 +
 llvm/lib/Passes/PassPlugin.cpp|  5 
 3 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index 97bc063ad34e5..188ea36d44523 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -144,6 +144,7 @@ class EmitAssemblyHelper {
   const LangOptions &LangOpts;
   llvm::Module *TheModule;
   IntrusiveRefCntPtr VFS;
+  llvm::SmallVector Plugins;
 
   std::unique_ptr OS;
 
@@ -973,16 +974,9 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
 }
 #endif
   }
-  // Attempt to load pass plugins and register their callbacks with PB.
-  for (auto &PluginFN : CodeGenOpts.PassPlugins) {
-auto PassPlugin = PassPlugin::Load(PluginFN);
-if (PassPlugin) {
-  PassPlugin->registerPassBuilderCallbacks(PB);
-} else {
-  Diags.Report(diag::err_fe_unable_to_load_plugin)
-  << PluginFN << toString(PassPlugin.takeError());
-}
-  }
+  // Register plugin callbacks with PB.
+  for (auto &Plugin : Plugins)
+Plugin.registerPassBuilderCallbacks(PB);
   for (const auto &PassCallback : CodeGenOpts.PassBuilderCallbacks)
 PassCallback(PB);
 #define HANDLE_EXTENSION(Ext)  
\
@@ -1211,6 +1205,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
 void EmitAssemblyHelper::RunCodegenPipeline(
 BackendAction Action, std::unique_ptr &OS,
 std::unique_ptr &DwoOS) {
+  // Invoke pre-codegen callback from plugin, which might want to take over the
+  // entire code generation itself.
+  for (auto &Plugin : Plugins) {
+CodeGenFileType CGFT = getCodeGenFileType(Action);
+if (Plugin.invokePreCodeGenCallback(*TheModule, *TM, CGFT, *OS))
+  return;
+  }
+
   // We still use the legacy PM to run the codegen pipeline since the new PM
   // does not work with the codegen pipeline.
   // FIXME: make the new PM work with the codegen pipeline.
@@ -1274,6 +1276,17 @@ void EmitAssemblyHelper::emitAssembly(BackendAction 
Action,
   // Before executing passes, print the final values of the LLVM options.
   cl::PrintOptionValues();
 
+  // Attempt to load pass plugins.
+  for (auto &PluginFN : CodeGenOpts.PassPlugins) {
+auto PassPlugin = PassPlugin::Load(PluginFN);
+if (PassPlugin) {
+  Plugins.push_back(std::move(*PassPlugin));
+} else {
+  Diags.Report(diag::err_fe_unable_to_load_plugin)
+  << PluginFN << toString(PassPlugin.takeError());
+}
+  }
+
   std::unique_ptr ThinLinkOS, DwoOS;
   RunOptimizationPipeline(Action, OS, ThinLinkOS, BC);
   RunCodegenPipeline(Action, OS, DwoOS);
diff --git a/llvm/include/llvm/Passes/PassPlugin.h 
b/llvm/include/llvm/Passes/PassPlugin.h
index 947504bc207a7..9ca0b4c29ed96 100644
--- a/llvm/include/llvm/Passes/PassPlugin.h
+++ b/llvm/include/llvm/Passes/PassPlugin.h
@@ -14,6 +14,7 @@
 #define LLVM_PASSES_PASSPLUGIN_H
 
 #include "llvm/ADT/StringRef.h"
+#include "llvm/Support/CodeGen.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/DynamicLibrary.h"
 #include "llvm/Support/Error.h"
@@ -21,7 +22,9 @@
 #include 
 
 namespace llvm {
+class Module;
 class PassBuilder;
+class TargetMachine;
 
 /// \macro LLVM_PLUGIN_API_VERSION
 /// Identifies the API version understood by this plugin.
@@ -30,14 +33,15 @@ class PassBuilder;
 /// against that of the plugin. A mismatch is an error. The supported version
 /// will be incremented for ABI-breaking changes to the \c 
PassPluginLibraryInfo
 /// struct, i.e. when callbacks are added, removed, or reordered.
-#define LLVM_PLUGIN_API_VERSION 1
+#define LLVM_PLUGIN_API_VERSION 2
 
 extern "C" {
 /// Information about the plugin required to load its passes
 ///
 /// This struct defines the core interface for pass plugins and is supposed to
-/// be filled out by plugin implementors. LLVM-side users of a plugin are
-/// expected to use the \c PassPlugin class below to interface with it.
+/// be filled out by plugin implementors. Unused function pointers can be set 
to
+/// nullptr. LLVM-side users of a plugin are expected to use the \c PassPlugin
+/// class below to interface with it.
 struct PassPluginLibraryInfo {
   /// The API version understood by this plugin, usually \c
   /// LLVM_PLUGIN_API_VERSION
@@ -49,7 +53,14 @@ struct PassPluginLibraryInfo {
 
   /// The callback for registering plugin passes with a \c PassBuilder
   ///
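
The patch text is truncated here. As a rough sketch of the control flow it describes (each plugin may claim code generation, and a null callback is skipped), assuming a simplified callback signature: `PluginSketch` and `runCodegen` are hypothetical names and do not reproduce the real `PassPlugin` interface.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical plugin record; the real callback takes a Module, a
// TargetMachine, a file type and an output stream.
struct PluginSketch {
  // Returns true when the plugin has handled code generation itself.
  // May be left empty, mirroring "unused function pointers can be nullptr".
  std::function<bool(const std::string &Module, std::string &Out)> PreCodeGen;
};

// Offers the module to each plugin in order; the first one that returns true
// takes over, otherwise the default pipeline runs.
std::string runCodegen(const std::vector<PluginSketch> &Plugins,
                       const std::string &Module) {
  std::string Out;
  for (const PluginSketch &P : Plugins)
    if (P.PreCodeGen && P.PreCodeGen(Module, Out))
      return Out; // a plugin took over; skip the default pipeline
  return "default:" + Module;
}
```

The early return corresponds to the `return` inside the loop added to `RunCodegenPipeline` in the diff above.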

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits


https://github.com/jofrn edited https://github.com/llvm/llvm-project/pull/148900

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Yingwei Zheng via llvm-branch-commits



@@ -5553,6 +5553,48 @@ void computeKnownFPClass(const Value *V, const APInt 
&DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -inf for -0 inputs.
+  Known.knownNot(fcNegZero | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, 
InterestedClasses,
+  KnownSrc, Q, Depth + 1);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+Known.knownNot(fcNan);
+  else if (KnownSrc.isKnownNever(fcSNan))
+Known.knownNot(fcSNan);
+
+  // -inf -> -0
+  if (KnownSrc.isKnownNeverNegInfinity())
+Known.knownNot(fcNegZero);

dtcxzyw wrote:

```suggestion
```
`rsq(-inf) = nan`
https://github.com/user-attachments/assets/4c25b96a-ebf1-4b33-98cb-6ec66b0e7a60
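
For reference, a small sketch of the special cases under plain IEEE-754 arithmetic, computing `1 / sqrt(x)` on the host. This is an assumption for illustration: `referenceRsq` is a hypothetical helper, and whether the amdgcn.rsq instruction matches it for every input and mode is not established by this thread.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// IEEE-754 reference for reciprocal square root. This models the arithmetic
// only; it is not the hardware instruction.
float referenceRsq(float X) { return 1.0f / std::sqrt(X); }
```

Under this model, -inf and other negative nonzero inputs give NaN, while -0 gives -inf, which is consistent with the review comment above and with the patch comment that -inf is the only negative result.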


https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [Clang] Invoke pass plugin preCodeGenCallback (PR #171872)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :penguin: Linux x64 Test Results

* 81869 tests passed
* 1137 tests skipped

All tests passed but another part of the build **failed**. Click on a failure 
below to see the details.


tools/clang/tools/extra/modularize/CMakeFiles/modularize.dir/ModularizeUtilities.cpp.o

```
FAILED: 
tools/clang/tools/extra/modularize/CMakeFiles/modularize.dir/ModularizeUtilities.cpp.o
sccache /opt/llvm/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG 
-D_GLIBCXX_ASSERTIONS -D_GLIBCXX_USE_CXX11_ABI=1 -D_GNU_SOURCE 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
-I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/tools/extra/modularize
 
-I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/modularize
 -I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang/include 
-I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/include
 -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/include 
-I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include -gmlt 
-fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror 
-Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra 
-Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers 
-pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
-Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
-Wno-pass-failed -Wmisleading-indentation -Wctad-maybe-unsupported 
-fdiagnostics-color -ffunction-sections -fdata-sections -fno-common 
-Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -std=c++17  
-fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT 
tools/clang/tools/extra/modularize/CMakeFiles/modularize.dir/ModularizeUtilities.cpp.o
 -MF 
tools/clang/tools/extra/modularize/CMakeFiles/modularize.dir/ModularizeUtilities.cpp.o.d
 -o 
tools/clang/tools/extra/modularize/CMakeFiles/modularize.dir/ModularizeUtilities.cpp.o
 -c 
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/modularize/ModularizeUtilities.cpp
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/modularize/ModularizeUtilities.cpp:357:24:
 error: reference to 'Module' is ambiguous
357 | if (Mod.getHeaders(Module::HK_Normal).empty()) {
|^
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang/include/clang/Serialization/ASTWriter.h:63:7:
 note: candidate found by name lookup is 'clang::Module'
63 | class Module;
|   ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include/llvm/Passes/PassPlugin.h:25:7:
 note: candidate found by name lookup is 'llvm::Module'
25 | class Module;
|   ^
1 error generated.
```
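
The ambiguity in the log above can be reproduced in miniature. The sketch below assumes two namespaces that each declare a `Module` and are both pulled in with using-directives; `clang_like` and `llvm_like` are hypothetical stand-ins, not the real `clang` and `llvm` namespaces.

```cpp
#include <cassert>
#include <string>

namespace clang_like {
struct Module {
  static std::string name() { return "clang"; }
};
} // namespace clang_like

namespace llvm_like {
struct Module {
  static std::string name() { return "llvm"; }
};
} // namespace llvm_like

using namespace clang_like;
using namespace llvm_like;

// Unqualified lookup now finds both candidates:
//   std::string bad() { return Module::name(); }
//   error: reference to 'Module' is ambiguous

// Qualifying the name selects one candidate explicitly.
std::string pickClang() { return clang_like::Module::name(); }
std::string pickLlvm() { return llvm_like::Module::name(); }
```

Qualifying the use is one way out; not forward-declaring the second `Module` in a widely included header is another.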


bin/clang-tidy

```
FAILED: bin/clang-tidy
: && /opt/llvm/bin/clang++ -gmlt -fPIC -fno-semantic-interposition 
-fvisibility-inlines-hidden -Werror -Werror=date-time 
-Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
-Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
-Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
-Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
-Wno-pass-failed -Wmisleading-indentation -Wctad-maybe-unsupported 
-fdiagnostics-color -ffunction-sections -fdata-sections -fno-common 
-Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -no-pie -fuse-ld=lld 
-Wl,--color-diagnostics   -Wl,--export-dynamic 
tools/clang/tools/extra/clang-tidy/tool/CMakeFiles/clang-tidy.dir/ClangTidyToolMain.cpp.o
 -o bin/clang-tidy  -Wl,-rpath,"\$ORIGIN/../lib:"  
lib/libLLVMAArch64AsmParser.a  lib/libLLVMAMDGPUAsmParser.a  
lib/libLLVMARMAsmParser.a  lib/libLLVMAVRAsmParser.a  lib/libLLVMBPFAsmParser.a 
 lib/libLLVMHexagonAsmParser.a  lib/libLLVMLanaiAsmParser.a  
lib/libLLVMLoongArchAsmParser.a  lib/libLLVMMipsAsmParser.a  
lib/libLLVMMSP430AsmParser.a  lib/libLLVMPowerPCAsmParser.a  
lib/libLLVMRISCVAsmParser.a  lib/libLLVMSparcAsmParser.a  
lib/libLLVMSystemZAsmParser.a  lib/libLLVMVEAsmParser.a  
lib/libLLVMWebAssemblyAsmParser.a  lib/libLLVMX86AsmParser.a  
lib/libLLVMAArch64Desc.a  lib/libLLVMAMDGPUDesc.a  lib/libLLVMARMDesc.a  
lib/libLLVMAVRDesc.a  lib/libLLVMBPFDesc.a  lib/libLLVMHexagonDesc.a  
lib/libLLVMLanaiDesc.a  lib/libLLVMLoongArchDesc.a  lib/libLLVMMipsDesc.a  
lib/libLLVMMSP430Desc.a  lib/libLLVMNVPTXDesc.a  lib/libLLVMPowerPCDesc.a  
lib/libLLVMRISCVDesc.a  lib/libLLVMSparcDesc.a  lib/libLLVMSPIRVDesc.a  
lib/libLLVMSystemZDesc.a  lib/libLLVMVEDesc.a  lib/libLLVMWebAssemblyDesc.a  
lib/libLLVMX86Desc.a  lib/libLLVMXCoreDesc.a  lib/libLLVMAArch64Info.a  
lib/libLLVMAMDGPUInfo.a  lib/libLLVMARMInfo.a  lib/libLLVMAVRInfo.a  
lib/libLLVMBPFInfo.a  lib/libLLVMHexagonInfo.a  lib/libLLVMLanaiInfo.a  
lib/libLLVMLoongArchInfo.a  lib/libLLVMMipsInfo.a  lib/libLLVMMS

[llvm-branch-commits] [llvm] [GlobalISel][AArch64] Added support for sli/sri intrinsics (PR #171448)

2025-12-11 Thread Joshua Rodriguez via llvm-branch-commits


https://github.com/JoshdRod closed 
https://github.com/llvm/llvm-project/pull/171448

[llvm-branch-commits] [llvm] VectorCombine: Improve the insert/extract fold in the narrowing case (PR #168820)

2025-12-11 Thread Nicolai Hähnle via llvm-branch-commits


https://github.com/nhaehnle updated 
https://github.com/llvm/llvm-project/pull/168820

From 352f05be1065786f45f00e54707b0ba17e8649c5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nicolai=20H=C3=A4hnle?= 
Date: Wed, 19 Nov 2025 18:00:32 -0800
Subject: [PATCH] VectorCombine: Improve the insert/extract fold in the
 narrowing case

Keeping the extracted element in a natural position in the narrowed
vector has two beneficial effects:

1. It makes the narrowing shuffles cheaper (at least on AMDGPU), which
   allows the insert/extract fold to trigger.
2. It makes the narrowing shuffles in a chain of extract/insert
   compatible, which allows foldLengthChangingShuffles to successfully
   recognize a chain that can be folded.

There are minor X86 test changes that look reasonable to me. The IR
change for AVX2 in 
llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll
doesn't change the assembly generated by `llc -mtriple=x86_64-- -mattr=AVX2`
at all.

commit-id:c151bb04
---
 .../Transforms/Vectorize/VectorCombine.cpp| 22 +--
 .../VectorCombine/AMDGPU/extract-insert-i8.ll | 18 ++-
 .../X86/extract-insert-poison.ll  | 12 ++
 .../VectorCombine/X86/extract-insert.ll   |  8 +++
 .../Transforms/VectorCombine/X86/pr126085.ll  |  4 ++--
 5 files changed, 22 insertions(+), 42 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp 
b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index b83597fec021a..4b081205eba10 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -4558,22 +4558,15 @@ bool 
VectorCombine::foldInsExtVectorToShuffle(Instruction &I) {
   SmallVector Mask(NumDstElts, PoisonMaskElem);
 
   bool NeedExpOrNarrow = NumSrcElts != NumDstElts;
-  bool IsExtIdxInBounds = ExtIdx < NumDstElts;
   bool NeedDstSrcSwap = isa(DstVec) && !isa(SrcVec);
   if (NeedDstSrcSwap) {
 SK = TargetTransformInfo::SK_PermuteSingleSrc;
-if (!IsExtIdxInBounds && NeedExpOrNarrow)
-  Mask[InsIdx] = 0;
-else
-  Mask[InsIdx] = ExtIdx;
+Mask[InsIdx] = ExtIdx % NumDstElts;
 std::swap(DstVec, SrcVec);
   } else {
 SK = TargetTransformInfo::SK_PermuteTwoSrc;
 std::iota(Mask.begin(), Mask.end(), 0);
-if (!IsExtIdxInBounds && NeedExpOrNarrow)
-  Mask[InsIdx] = NumDstElts;
-else
-  Mask[InsIdx] = ExtIdx + NumDstElts;
+Mask[InsIdx] = (ExtIdx % NumDstElts) + NumDstElts;
   }
 
   // Cost
@@ -4594,14 +4587,11 @@ bool 
VectorCombine::foldInsExtVectorToShuffle(Instruction &I) {
   NewCost += TTI.getShuffleCost(SK, DstVecTy, DstVecTy, Mask, CostKind, 0,
 nullptr, {DstVec, SrcVec});
   } else {
-// When creating length-changing-vector, always create with a Mask whose
-// first element has an ExtIdx, so that the first element of the vector
-// being created is always the target to be extracted.
+// When creating a length-changing-vector, always try to keep the relevant
+// element in an equivalent position, so that bulk shuffles are more likely
+// to be useful.
 ExtToVecMask.assign(NumDstElts, PoisonMaskElem);
-if (IsExtIdxInBounds)
-  ExtToVecMask[ExtIdx] = ExtIdx;
-else
-  ExtToVecMask[0] = ExtIdx;
+ExtToVecMask[ExtIdx % NumDstElts] = ExtIdx;
 // Add cost for expanding or narrowing
 NewCost = TTI.getShuffleCost(TargetTransformInfo::SK_PermuteSingleSrc,
  DstVecTy, SrcVecTy, ExtToVecMask, CostKind);
diff --git a/llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll 
b/llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
index 8c2455dd9d375..6c92892949175 100644
--- a/llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+++ b/llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
@@ -88,22 +88,8 @@ entry:
 define <8 x i8> @extract_insert_chain_shortening(<32 x i8> %in) {
 ; OPT-LABEL: define <8 x i8> @extract_insert_chain_shortening(
 ; OPT-SAME: <32 x i8> [[IN:%.*]]) #[[ATTR0]] {
-; OPT-NEXT:[[I_1:%.*]] = extractelement <32 x i8> [[IN]], i64 17
-; OPT-NEXT:[[I_2:%.*]] = extractelement <32 x i8> [[IN]], i64 18
-; OPT-NEXT:[[I_3:%.*]] = extractelement <32 x i8> [[IN]], i64 19
-; OPT-NEXT:[[I_5:%.*]] = extractelement <32 x i8> [[IN]], i64 21
-; OPT-NEXT:[[I_6:%.*]] = extractelement <32 x i8> [[IN]], i64 22
-; OPT-NEXT:[[I_7:%.*]] = extractelement <32 x i8> [[IN]], i64 23
-; OPT-NEXT:[[O_0:%.*]] = shufflevector <32 x i8> [[IN]], <32 x i8> poison, 
<8 x i32> 
-; OPT-NEXT:[[O_1:%.*]] = insertelement <8 x i8> [[O_0]], i8 [[I_1]], i32 1
-; OPT-NEXT:[[O_2:%.*]] = insertelement <8 x i8> [[O_1]], i8 [[I_2]], i32 2
-; OPT-NEXT:[[O_3:%.*]] = insertelement <8 x i8> [[O_2]], i8 [[I_3]], i32 3
-; OPT-NEXT:[[TMP1:%.*]] = shufflevector <32 x i8> [[IN]], <32 x i8> 
poison, <8 x i32> 
-; OPT-NEXT:[[O_4:%.*]] = shufflevector <8 x i8> [[O_3]], <8 x i8> 
[[TMP1]], <
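
The index mapping introduced by this patch can be sketched outside LLVM as follows. This is a simplified model, assuming plain `std::vector<int>` masks with `-1` standing in for `PoisonMaskElem`; the helper names are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

constexpr int PoisonElem = -1; // stand-in for LLVM's PoisonMaskElem

// Mask for the length-changing shuffle: the extracted element lands in lane
// ExtIdx % NumDstElts and every other lane is poison.
std::vector<int> extToVecMask(unsigned ExtIdx, unsigned NumDstElts) {
  std::vector<int> Mask(NumDstElts, PoisonElem);
  Mask[ExtIdx % NumDstElts] = static_cast<int>(ExtIdx);
  return Mask;
}

// Two-source insert mask: identity on the destination vector, except lane
// InsIdx, which selects the wrapped lane of the second (narrowed) operand.
std::vector<int> twoSrcInsertMask(unsigned InsIdx, unsigned ExtIdx,
                                  unsigned NumDstElts) {
  std::vector<int> Mask(NumDstElts);
  std::iota(Mask.begin(), Mask.end(), 0);
  Mask[InsIdx] = static_cast<int>(ExtIdx % NumDstElts + NumDstElts);
  return Mask;
}
```

With the numbers from `extract_insert_chain_shortening` (element 17 of a 32-lane source inserted into lane 1 of an 8-lane result), the element is kept in lane 17 % 8 = 1 of the narrowed vector instead of being moved to lane 0.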

[llvm-branch-commits] [llvm] backport: [RISCV] Sources of vmerge shouldn't overlap V0 (#170070) (PR #170604)

2025-12-11 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > Hi @asb, @topperc and @preames, I'm a bit hesitant to take this fix into 
> > the release branch because it seems a bit large for what is essentially the 
> > last release of the 21.x branch. After this there will be no more releases, 
> > so no chances to fix any issues that may arise. The fact that one was 
> > already found when backporting the fix makes me even more worried. As the 
> > owners of the RISCV backend, do any of you have any thoughts on how risky 
> > it would be to take this fix in 21.x or should we just wait for the 22.x 
> > branch for the fix?
> 
> @lukel97 @wangpc-pp Is the change to RISCVInstrInfoVPseudos.td enough to fix 
> the bug? Was the RISCVVectorPeephole.cpp part for correctness or to prevent 
> regressions?

But I think the regression can be large. :-(

https://github.com/llvm/llvm-project/pull/170604

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread Prajwal Nadig via llvm-branch-commits


snprajwal wrote:

Hi @dyung, this is a small patch, but it fixes a serious correctness issue with 
the declaration fragments emitted by ExtractAPI. When typedefs are present in 
method parameters, e.g.:

```c
typedef int (^CustomType)(const unsigned int *, unsigned long);
void bar(CustomType block);
```
The output without this patch:

```
void bar(CustomTypeblock);
```
The output with this patch:
```
void bar(CustomType block);
```

As you can see, the original output is syntactically invalid, hence the request 
to backport to the 21.x release. 

https://github.com/llvm/llvm-project/pull/171522

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits


https://github.com/jofrn updated 
https://github.com/llvm/llvm-project/pull/148900

From a9e62b36fb879b7b0278d299df64e11ba6605041 Mon Sep 17 00:00:00 2001
From: jofrn 
Date: Tue, 15 Jul 2025 13:03:15 -0400
Subject: [PATCH] [AtomicExpand] Add bitcasts when expanding load atomic vector

AtomicExpand fails for aligned `load atomic ` because it
does not find a compatible library call. This change adds appropriate
bitcasts so that the call can be lowered. It also adds support for
128 bit lowering in tablegen to support SSE/AVX.
---
 llvm/lib/CodeGen/AtomicExpandPass.cpp |  19 +-
 llvm/test/CodeGen/ARM/atomic-load-store.ll|  51 
 llvm/test/CodeGen/X86/atomic-load-store.ll|  91 +-
 .../X86/expand-atomic-non-integer.ll  | 287 ++
 4 files changed, 382 insertions(+), 66 deletions(-)

diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp 
b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 53f1cfe24a68d..8dc14bb416345 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -483,7 +483,9 @@ LoadInst 
*AtomicExpandImpl::convertAtomicLoadToIntegerType(LoadInst *LI) {
   NewLI->setAtomic(LI->getOrdering(), LI->getSyncScopeID());
   LLVM_DEBUG(dbgs() << "Replaced " << *LI << " with " << *NewLI << "\n");
 
-  Value *NewVal = Builder.CreateBitCast(NewLI, LI->getType());
+  Value *NewVal = LI->getType()->isPtrOrPtrVectorTy()
+  ? Builder.CreateIntToPtr(NewLI, LI->getType())
+  : Builder.CreateBitCast(NewLI, LI->getType());
   LI->replaceAllUsesWith(NewVal);
   LI->eraseFromParent();
   return NewLI;
@@ -2093,9 +2095,18 @@ bool AtomicExpandImpl::expandAtomicOpToLibcall(
 I->replaceAllUsesWith(V);
   } else if (HasResult) {
 Value *V;
-if (UseSizedLibcall)
-  V = Builder.CreateBitOrPointerCast(Result, I->getType());
-else {
+if (UseSizedLibcall) {
+  // Add bitcasts from Result's scalar type to I's <n x ptr> vector type
+  auto *PtrTy = dyn_cast<PointerType>(I->getType()->getScalarType());
+  auto *VTy = dyn_cast<VectorType>(I->getType());
+  if (VTy && PtrTy && !Result->getType()->isVectorTy()) {
+unsigned AS = PtrTy->getAddressSpace();
+Value *BC = Builder.CreateBitCast(
+Result, VTy->getWithNewType(DL.getIntPtrType(Ctx, AS)));
+V = Builder.CreateIntToPtr(BC, I->getType());
+  } else
+V = Builder.CreateBitOrPointerCast(Result, I->getType());
+} else {
   V = Builder.CreateAlignedLoad(I->getType(), AllocaResult,
 AllocaAlignment);
   Builder.CreateLifetimeEnd(AllocaResult);
diff --git a/llvm/test/CodeGen/ARM/atomic-load-store.ll 
b/llvm/test/CodeGen/ARM/atomic-load-store.ll
index 560dfde356c29..eaa2ffd9b2731 100644
--- a/llvm/test/CodeGen/ARM/atomic-load-store.ll
+++ b/llvm/test/CodeGen/ARM/atomic-load-store.ll
@@ -983,3 +983,54 @@ define void @store_atomic_f64__seq_cst(ptr %ptr, double 
%val1) {
   store atomic double %val1, ptr %ptr seq_cst, align 8
   ret void
 }
+
+define <1 x ptr> @atomic_vec1_ptr(ptr %x) #0 {
+; ARM-LABEL: atomic_vec1_ptr:
+; ARM:   @ %bb.0:
+; ARM-NEXT:ldr r0, [r0]
+; ARM-NEXT:dmb ish
+; ARM-NEXT:bx lr
+;
+; ARMOPTNONE-LABEL: atomic_vec1_ptr:
+; ARMOPTNONE:   @ %bb.0:
+; ARMOPTNONE-NEXT:ldr r0, [r0]
+; ARMOPTNONE-NEXT:dmb ish
+; ARMOPTNONE-NEXT:bx lr
+;
+; THUMBTWO-LABEL: atomic_vec1_ptr:
+; THUMBTWO:   @ %bb.0:
+; THUMBTWO-NEXT:ldr r0, [r0]
+; THUMBTWO-NEXT:dmb ish
+; THUMBTWO-NEXT:bx lr
+;
+; THUMBONE-LABEL: atomic_vec1_ptr:
+; THUMBONE:   @ %bb.0:
+; THUMBONE-NEXT:push {r7, lr}
+; THUMBONE-NEXT:movs r1, #0
+; THUMBONE-NEXT:mov r2, r1
+; THUMBONE-NEXT:bl __sync_val_compare_and_swap_4
+; THUMBONE-NEXT:pop {r7, pc}
+;
+; ARMV4-LABEL: atomic_vec1_ptr:
+; ARMV4:   @ %bb.0:
+; ARMV4-NEXT:push {r11, lr}
+; ARMV4-NEXT:mov r1, #2
+; ARMV4-NEXT:bl __atomic_load_4
+; ARMV4-NEXT:pop {r11, lr}
+; ARMV4-NEXT:mov pc, lr
+;
+; ARMV6-LABEL: atomic_vec1_ptr:
+; ARMV6:   @ %bb.0:
+; ARMV6-NEXT:ldr r0, [r0]
+; ARMV6-NEXT:mov r1, #0
+; ARMV6-NEXT:mcr p15, #0, r1, c7, c10, #5
+; ARMV6-NEXT:bx lr
+;
+; THUMBM-LABEL: atomic_vec1_ptr:
+; THUMBM:   @ %bb.0:
+; THUMBM-NEXT:ldr r0, [r0]
+; THUMBM-NEXT:dmb sy
+; THUMBM-NEXT:bx lr
+  %ret = load atomic <1 x ptr>, ptr %x acquire, align 4
+  ret <1 x ptr> %ret
+}
diff --git a/llvm/test/CodeGen/X86/atomic-load-store.ll 
b/llvm/test/CodeGen/X86/atomic-load-store.ll
index 00310f6d1f219..867a4acb791bc 100644
--- a/llvm/test/CodeGen/X86/atomic-load-store.ll
+++ b/llvm/test/CodeGen/X86/atomic-load-store.ll
@@ -244,6 +244,96 @@ define <2 x ptr addrspace(270)> @atomic_vec2_ptr270(ptr 
%x) {
   %ret = load atomic <2 x ptr addrspace(270)>, ptr %x acquire, align 8
   ret <2 x ptr addrspace(270)> %ret
 }
+define <2 x ptr> @atomic_vec2_ptr_align(ptr %x) nounwind {
+; CHECK-SSE2-O3-LABEL: atomic_vec2_ptr_align:
+; CHECK-SSE2-O3:   # %bb.0:

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits



@@ -189,3 +189,69 @@ define void @pointer_cmpxchg_expand6(ptr addrspace(1) 
%ptr, ptr addrspace(2) %v)
   ret void
 }
 
+define <2 x ptr> @atomic_vec2_ptr_align(ptr %x) nounwind {
+; CHECK-LABEL: define <2 x ptr> @atomic_vec2_ptr_align(
+; CHECK-SAME: ptr [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:[[TMP1:%.*]] = call i128 @__atomic_load_16(ptr [[X]], i32 2)
+; CHECK-NEXT:[[TMP6:%.*]] = bitcast i128 [[TMP1]] to <2 x i64>
+; CHECK-NEXT:[[TMP7:%.*]] = inttoptr <2 x i64> [[TMP6]] to <2 x ptr>
+; CHECK-NEXT:ret <2 x ptr> [[TMP7]]
+;
+  %ret = load atomic <2 x ptr>, ptr %x acquire, align 16

jofrn wrote:

I think we should. I changed it to that; however, the output of some 
*_cmpxchg_* tests also changes even though they are unrelated to expanding a 
load atomic vector, so perhaps it should be a separate commit. What do you think?

https://github.com/llvm/llvm-project/pull/148900
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
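
The CHECK lines quoted in the review above show the sized libcall result going through two steps, `bitcast i128 ... to <2 x i64>` and then `inttoptr <2 x i64> ... to <2 x ptr>`. LLVM IR does not allow a `bitcast` between integer and pointer types, which is presumably why the patch does not cast in one step. The following is only a rough plain-C++ analogy of that two-step conversion, not the pass itself: `RawResult`, `toIntLanes`, and `toPtrLanes` are hypothetical names, and the lane width is assumed to be the host's `std::uintptr_t`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model: the scalar libcall result is treated as raw bytes wide
// enough for two pointer-sized lanes (i128 on a 64-bit target).
constexpr std::size_t kLanes = 2;
using RawResult = std::array<unsigned char, kLanes * sizeof(std::uintptr_t)>;

// Step 1 (analogous to the bitcast): reinterpret the scalar bytes as a vector
// of pointer-sized integers without changing any bits.
std::array<std::uintptr_t, kLanes> toIntLanes(const RawResult &raw) {
  std::array<std::uintptr_t, kLanes> lanes{};
  std::memcpy(lanes.data(), raw.data(), raw.size());
  return lanes;
}

// Step 2 (analogous to the inttoptr): convert each integer lane to a pointer.
std::array<void *, kLanes>
toPtrLanes(const std::array<std::uintptr_t, kLanes> &lanes) {
  std::array<void *, kLanes> ptrs{};
  for (std::size_t i = 0; i < kLanes; ++i)
    ptrs[i] = reinterpret_cast<void *>(lanes[i]);
  return ptrs;
}
```

Keeping the two steps as separate functions mirrors the two instructions in the expected test output; a real implementation works on IR values through `IRBuilder`, as the patch does.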

[llvm-branch-commits] [llvm] [AtomicExpand] Add bitcasts when expanding load atomic vector (PR #148900)

2025-12-11 Thread via llvm-branch-commits


https://github.com/jofrn edited https://github.com/llvm/llvm-project/pull/148900
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Add missing cases for V_INDIRECT_REG_{READ/WRITE}_GPR_IDX and V/S_INDIRECT_REG_WRITE_MOVREL (PR #171835)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Juan Manuel Martinez Caamaño (jmmartinez)


Changes

A buildbot failure in https://github.com/llvm/llvm-project/pull/170323
when expensive checks were used highlighted that some of these patterns
were missing.

This patch adds `V_INDIRECT_REG_{READ/WRITE}_GPR_IDX` and 
`V/S_INDIRECT_REG_WRITE_MOVREL` for `V6` and `V7` vector sizes.

---

Patch is 20.06 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171835.diff


4 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+24) 
- (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+16) 
- (modified) 
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-extract-vector-elt.mir (+126) 
- (modified) 
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert-vector-elt.mir (+139) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 6d2110957002a..3e334aa08337e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -1394,6 +1394,10 @@ SIInstrInfo::getIndirectGPRIDXPseudo(unsigned VecSize,
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V4);
 if (VecSize <= 160) // 20 bytes
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V5);
+if (VecSize <= 192) // 24 bytes
+  return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V6);
+if (VecSize <= 224) // 28 bytes
+  return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V7);
 if (VecSize <= 256) // 32 bytes
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V8);
 if (VecSize <= 288) // 36 bytes
@@ -1422,6 +1426,10 @@ SIInstrInfo::getIndirectGPRIDXPseudo(unsigned VecSize,
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V4);
   if (VecSize <= 160) // 20 bytes
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V5);
+  if (VecSize <= 192) // 24 bytes
+return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V6);
+  if (VecSize <= 224) // 28 bytes
+return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V7);
   if (VecSize <= 256) // 32 bytes
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V8);
   if (VecSize <= 288) // 36 bytes
@@ -1451,6 +1459,10 @@ static unsigned 
getIndirectVGPRWriteMovRelPseudoOpc(unsigned VecSize) {
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V4;
   if (VecSize <= 160) // 20 bytes
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V5;
+  if (VecSize <= 192) // 24 bytes
+return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V6;
+  if (VecSize <= 224) // 28 bytes
+return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V7;
   if (VecSize <= 256) // 32 bytes
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V8;
   if (VecSize <= 288) // 36 bytes
@@ -1480,6 +1492,10 @@ static unsigned 
getIndirectSGPRWriteMovRelPseudo32(unsigned VecSize) {
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V4;
   if (VecSize <= 160) // 20 bytes
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V5;
+  if (VecSize <= 192) // 24 bytes
+return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V6;
+  if (VecSize <= 224) // 28 bytes
+return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V7;
   if (VecSize <= 256) // 32 bytes
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V8;
   if (VecSize <= 288) // 36 bytes
@@ -2244,6 +2260,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V3:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V4:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V5:
+  case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V6:
+  case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V7:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V8:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V9:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V10:
@@ -2256,6 +2274,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V3:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V4:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V5:
+  case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V6:
+  case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V7:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V8:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V9:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V10:
@@ -2303,6 +2323,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V3:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V4:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V5:
+  case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V6:
+  case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V7:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V8:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V9:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V10:
@@ -2347,6 +2369,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V3:
   case A

[llvm-branch-commits] [llvm] [AMDGPU] Add missing cases for V_INDIRECT_REG_{READ/WRITE}_GPR_IDX and V/S_INDIRECT_REG_WRITE_MOVREL (PR #171835)

2025-12-11 Thread Juan Manuel Martinez Caamaño via llvm-branch-commits


https://github.com/jmmartinez created 
https://github.com/llvm/llvm-project/pull/171835

A buildbot failure in https://github.com/llvm/llvm-project/pull/170323
when expensive checks were used highlighted that some of these patterns
were missing.

This patch adds `V_INDIRECT_REG_{READ/WRITE}_GPR_IDX` and 
`V/S_INDIRECT_REG_WRITE_MOVREL` for `V6` and `V7` vector sizes.
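
As a rough illustration of the selection ladder that the patch below extends, the sketch here maps a vector size in bits to the smallest variant that can hold it. It is not LLVM code: `pickIndirectVariant` and its `WithV6V7` flag are hypothetical, and only the variants visible in the quoted hunks (V4 through V9) are modeled, so smaller sizes simply land on V4 here.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper (not an LLVM API): returns the register count N of the
// smallest modeled "_VN" variant whose N * 32 bits can hold VecSizeInBits.
// WithV6V7 = false models the ladder before the patch, where the V6 and V7
// cases were absent.
std::optional<unsigned> pickIndirectVariant(unsigned VecSizeInBits,
                                            bool WithV6V7 = true) {
  // Only the variants visible in the quoted hunks are listed.
  static constexpr unsigned Variants[] = {4, 5, 6, 7, 8, 9};
  for (unsigned NumRegs : Variants) {
    if (!WithV6V7 && (NumRegs == 6 || NumRegs == 7))
      continue; // pre-patch ladder: no 192-bit or 224-bit case
    if (VecSizeInBits <= NumRegs * 32)
      return NumRegs;
  }
  return std::nullopt; // larger sizes are outside this sketch
}
```

With the flag cleared, a 192-bit vector falls through to the 256-bit case, which matches the order of the `if` chain in the unpatched hunks; with it set, the same vector selects V6.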

From c8d5ae539570421f4d1b65f9ee81527b2e6cd1d6 Mon Sep 17 00:00:00 2001
From: Juan Manuel Martinez Caamaño
Date: Thu, 11 Dec 2025 11:52:02 +0100
Subject: [PATCH] [AMDGPU] Add missing cases for
 V_INDIRECT_REG_{READ/WRITE}_GPR_IDX and V/S_INDIRECT_REG_WRITE_MOVREL

A buildbot failure in https://github.com/llvm/llvm-project/pull/170323
when expensive checks were used highlighted that some of these patterns
were missing.
---
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp|  24 +++
 llvm/lib/Target/AMDGPU/SIInstructions.td  |  16 ++
 .../inst-select-extract-vector-elt.mir| 126 
 .../inst-select-insert-vector-elt.mir | 139 ++
 4 files changed, 305 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 6d2110957002a..3e334aa08337e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -1394,6 +1394,10 @@ SIInstrInfo::getIndirectGPRIDXPseudo(unsigned VecSize,
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V4);
 if (VecSize <= 160) // 20 bytes
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V5);
+if (VecSize <= 192) // 24 bytes
+  return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V6);
+if (VecSize <= 224) // 28 bytes
+  return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V7);
 if (VecSize <= 256) // 32 bytes
   return get(AMDGPU::V_INDIRECT_REG_READ_GPR_IDX_B32_V8);
 if (VecSize <= 288) // 36 bytes
@@ -1422,6 +1426,10 @@ SIInstrInfo::getIndirectGPRIDXPseudo(unsigned VecSize,
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V4);
   if (VecSize <= 160) // 20 bytes
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V5);
+  if (VecSize <= 192) // 24 bytes
+return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V6);
+  if (VecSize <= 224) // 28 bytes
+return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V7);
   if (VecSize <= 256) // 32 bytes
 return get(AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V8);
   if (VecSize <= 288) // 36 bytes
@@ -1451,6 +1459,10 @@ static unsigned 
getIndirectVGPRWriteMovRelPseudoOpc(unsigned VecSize) {
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V4;
   if (VecSize <= 160) // 20 bytes
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V5;
+  if (VecSize <= 192) // 24 bytes
+return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V6;
+  if (VecSize <= 224) // 28 bytes
+return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V7;
   if (VecSize <= 256) // 32 bytes
 return AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V8;
   if (VecSize <= 288) // 36 bytes
@@ -1480,6 +1492,10 @@ static unsigned 
getIndirectSGPRWriteMovRelPseudo32(unsigned VecSize) {
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V4;
   if (VecSize <= 160) // 20 bytes
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V5;
+  if (VecSize <= 192) // 24 bytes
+return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V6;
+  if (VecSize <= 224) // 28 bytes
+return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V7;
   if (VecSize <= 256) // 32 bytes
 return AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V8;
   if (VecSize <= 288) // 36 bytes
@@ -2244,6 +2260,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V3:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V4:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V5:
+  case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V6:
+  case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V7:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V8:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V9:
   case AMDGPU::V_INDIRECT_REG_WRITE_MOVREL_B32_V10:
@@ -2256,6 +2274,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V3:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V4:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V5:
+  case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V6:
+  case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V7:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V8:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V9:
   case AMDGPU::S_INDIRECT_REG_WRITE_MOVREL_B32_V10:
@@ -2303,6 +2323,8 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) 
const {
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V3:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V4:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V5:
+  case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V6:
+  case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V7:
   case AMDGPU::V_INDIRECT_REG_WRITE_GPR_IDX_B32_V8:
   case

[llvm-branch-commits] [llvm] [AMDGPU] Add missing cases for V_INDIRECT_REG_{READ/WRITE}_GPR_IDX and V/S_INDIRECT_REG_WRITE_MOVREL (PR #171835)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-globalisel

Author: Juan Manuel Martinez Caamaño (jmmartinez)


Changes

A buildbot failure in https://github.com/llvm/llvm-project/pull/170323
when expensive checks were used highlighted that some of these patterns
were missing.

This patch adds `V_INDIRECT_REG_{READ/WRITE}_GPR_IDX` and 
`V/S_INDIRECT_REG_WRITE_MOVREL` for `V6` and `V7` vector sizes.

---

Patch is 53.61 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171835.diff


4 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+24) 
- (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+16) 
- (modified) 
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-extract-vector-elt.mir (+126) 
- (modified) 
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert-vector-elt.mir (+139) 


``diff



  
https://github.com/llvm/llvm-project/pull/171835
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread Prajwal Nadig via llvm-branch-commits


snprajwal wrote:

It's not a regression, it's a bug that surfaced recently.

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread Prajwal Nadig via llvm-branch-commits


snprajwal wrote:

No, this specific code path has always produced incorrect output, even in past 
versions of LLVM. I took care to keep the scope of the change as small as 
possible to reduce the risk for this backport.

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread via llvm-branch-commits


dyung wrote:

> It's not a regression, it's a bug that surfaced recently.

How recently? Was the code above producing the correct result in LLVM 20, 19, 
18, etc.?

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] release/21.x: [ExtractAPI] Format typedef params correctly (#171516) (PR #171522)

2025-12-11 Thread Prajwal Nadig via llvm-branch-commits


snprajwal wrote:

Understandable, thank you!

https://github.com/llvm/llvm-project/pull/171522
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/171825

From d02544b205133749563f6222fd0e71c863226d3c Mon Sep 17 00:00:00 2001
From: skc7 
Date: Thu, 11 Dec 2025 13:35:05 +0530
Subject: [PATCH 1/2] [OpenMP][MLIR] Add thread_limit with dims modifier
 support

---
 .../Optimizer/OpenMP/LowerWorkdistribute.cpp  |  16 +-
 .../mlir/Dialect/OpenMP/OpenMPClauses.td  |  29 +++-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  |  72 -
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |   8 +
 mlir/test/Dialect/OpenMP/invalid.mlir | 149 +-
 mlir/test/Dialect/OpenMP/ops.mlir |   8 +-
 6 files changed, 264 insertions(+), 18 deletions(-)

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
index 7b61539984232..a3b9e5c76bdd2 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
@@ -766,6 +766,7 @@ FailureOr splitTargetData(omp::TargetOp 
targetOp,
   targetOp.getInReductionSymsAttr(), targetOp.getIsDevicePtrVars(),
   innerMapInfos, targetOp.getNowaitAttr(), targetOp.getPrivateVars(),
   targetOp.getPrivateSymsAttr(), targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
   targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   rewriter.inlineRegionBefore(targetOp.getRegion(), newTargetOp.getRegion(),
   newTargetOp.getRegion().begin());
@@ -1485,8 +1486,9 @@ genPreTargetOp(omp::TargetOp targetOp, SmallVector 
&preMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), preMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *preTargetBlock = rewriter.createBlock(
   &preTargetOp.getRegion(), preTargetOp.getRegion().begin(), {}, {});
   IRMapping preMapping;
@@ -1575,8 +1577,9 @@ genIsolatedTargetOp(omp::TargetOp targetOp, 
SmallVector &postMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *isolatedTargetBlock =
   rewriter.createBlock(&isolatedTargetOp.getRegion(),
isolatedTargetOp.getRegion().begin(), {}, {});
@@ -1655,8 +1658,9 @@ static omp::TargetOp genPostTargetOp(omp::TargetOp 
targetOp,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   // Create the block for postTargetOp
   auto *postTargetBlock = rewriter.createBlock(
   &postTargetOp.getRegion(), postTargetOp.getRegion().begin(), {}, {});
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..366855bf02968 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1452,16 +1452,43 @@ class OpenMP_ThreadLimitClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$thread_limit_num_dims,
+Variadic:$thread_limit_dims_values,
 Optional:$thread_limit
   );
 
   let optAssemblyFormat = [{
-`thread_limit` `(` $thread_limit `:` type($thread_limit) `)`
+`thread_limit` `(` custom(
+  $thread_limit_num_dims, $thread_limit_dims_values, 
type($thread_limit_dims_values),
+  $thread_limit, type($thread_limit)
+) `)`
   }];
 
   let description = [{
 The optional `thread_limit` specifies the limit on the number of threads.
   }];
+
+  let extraClassDeclaration = [{
+/// Returns true if the dims modifier is explicitly present
+bool hasThreadLimitDimsModifier() {
+  return getThreadLimitNumDims().has_v

[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `num_threads` clause with the dims modifier.
LLVM IR translation for `num_threads` with the dims modifier is marked as NYI.
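
The accessor semantics that the diff below adds in `extraClassDeclaration` can be modeled outside MLIR roughly as follows. This is a sketch under assumptions, not the generated op class: `NumThreadsClauseModel` and its members are hypothetical stand-ins, plain `int` values replace `mlir::Value`, and the second bounds check in `dimsValue` is an extra guard that the quoted code does not have.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the clause operands; not the generated MLIR class.
struct NumThreadsClauseModel {
  std::optional<std::int64_t> numDims; // models `num_threads_num_dims`
  std::vector<int> dimsValues;         // models `num_threads_dims_values`

  // True only if the dims modifier is present with a non-zero count.
  bool hasDimsModifier() const { return numDims.has_value() && *numDims != 0; }

  // Without the modifier the clause is treated as one-dimensional.
  unsigned dimsCount() const {
    return hasDimsModifier() ? static_cast<unsigned>(*numDims) : 1;
  }

  // Returns the upper bound for one dimension.
  int dimsValue(unsigned index) const {
    assert(index < dimsCount() && "Num threads dims index out of bounds");
    // Extra guard (an assumption of this sketch): the list may be empty when
    // the modifier is absent, even though dimsCount() reports 1.
    assert(index < dimsValues.size() && "no value stored for this dimension");
    return dimsValues[index];
  }
};
```

For example, a model built with a count of 3 and values `{8, 4, 2}` reports three dimensions, while one without a count reports a single dimension and holds no per-dimension values.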

---
Full diff: https://github.com/llvm/llvm-project/pull/171767.diff


6 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+42-3) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-7) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-1) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+32-1) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+10-5) 


``diff
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..09c1d4a8a5866 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1069,16 +1069,55 @@ class OpenMP_NumThreadsClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$num_threads_num_dims,
+Variadic:$num_threads_dims_values,
 Optional:$num_threads
   );
 
   let optAssemblyFormat = [{
-`num_threads` `(` $num_threads `:` type($num_threads) `)`
+`num_threads` `(` custom(
+  $num_threads_num_dims, $num_threads_dims_values, 
type($num_threads_dims_values),
+  $num_threads, type($num_threads)
+) `)`
   }];
 
   let description = [{
-The optional `num_threads` parameter specifies the number of threads which
-should be used to execute the parallel region.
+num_threads clause specifies the desired number of threads in the team
+space formed by the construct on which it appears.
+
+With dims modifier:
+- Uses `num_threads_num_dims` (dimension count) and 
`num_threads_dims_values` (upper bounds list)
+- Specifies upper bounds for each dimension (all must have same type)
+- Format: `num_threads(dims(N): upper_bound_0, ..., upper_bound_N-1 : 
type)`
+- Example: `num_threads(dims(3): %ub0, %ub1, %ub2 : i32)`
+
+Without dims modifier:
+- Uses `num_threads`
+- If lower bound not specified, it defaults to upper bound value
+- Format: `num_threads(bounds : type)`
+- Example: `num_threads(%ub : i32)`
+  }];
+
+  let extraClassDeclaration = [{
+/// Returns true if the dims modifier is explicitly present
+bool hasNumThreadsDimsModifier() {
+  return getNumThreadsNumDims().has_value() && 
getNumThreadsNumDims().value();
+}
+
+/// Returns the number of dimensions specified by dims modifier
+unsigned getNumThreadsDimsCount() {
+  if (!hasNumThreadsDimsModifier())
+return 1;
+  return static_cast<unsigned>(*getNumThreadsNumDims());
+}
+
+/// Returns the value for a specific dimension index
+/// Index must be less than getNumThreadsDimsCount()
+::mlir::Value getNumThreadsDimsValue(unsigned index) {
+  assert(index < getNumThreadsDimsCount() &&
+ "Num threads dims index out of bounds");
+  return getNumThreadsDimsValues()[index];
+}
   }];
 }
 
diff --git a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp 
b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
index 6423d49859c97..ab7bded7835be 100644
--- a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+++ b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
@@ -448,6 +448,8 @@ struct ParallelOpLowering : public 
OpRewritePattern {
 /* allocate_vars = */ llvm::SmallVector{},
 /* allocator_vars = */ llvm::SmallVector{},
 /* if_expr = */ Value{},
+/* num_threads_num_dims = */ nullptr,
+/* num_threads_dims_values = */ llvm::SmallVector{},
 /* num_threads = */ numThreadsVar,
 /* private_vars = */ ValueRange(),
 /* private_syms = */ nullptr,
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp 
b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d4dbf5f5244df..a9ed0274cd21c 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2533,6 +2533,8 @@ void ParallelOp::build(OpBuilder &builder, OperationState 
&state,
ArrayRef attributes) {
   ParallelOp::build(builder, state, /*allocate_vars=*/ValueRange(),
 /*allocator_vars=*/ValueRange(), /*if_expr=*/nullptr,
+/*num_threads_dims=*/nullptr,
+/*num_threads_values=*/ValueRange(),
 /*num_threads=*/nullptr, /*private_vars=*/ValueRange(),
 /*private_syms=*/nullptr, 
/*private_needs_barrier=*/nullptr,
 /*proc_bind_kind=*/nullptr,
@@ -2544,13 +2546,14 @@ void ParallelOp::build(OpBuilder &builder, 
OperationState &state,
 void ParallelOp::build(OpBuilder &builder, OperationState &state,
const ParallelOperands &clauses) {
   MLIRContext *ctx = b

[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `num_threads` clause with the dims modifier.
LLVM IR translation for `num_threads` with the dims modifier is marked as NYI.

---
Full diff: https://github.com/llvm/llvm-project/pull/171767.diff


6 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+42-3) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-7) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-1) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+32-1) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+10-5) 


``diff
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..09c1d4a8a5866 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1069,16 +1069,55 @@ class OpenMP_NumThreadsClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$num_threads_num_dims,
+Variadic:$num_threads_dims_values,
 Optional:$num_threads
   );
 
   let optAssemblyFormat = [{
-`num_threads` `(` $num_threads `:` type($num_threads) `)`
+`num_threads` `(` custom(
+  $num_threads_num_dims, $num_threads_dims_values, 
type($num_threads_dims_values),
+  $num_threads, type($num_threads)
+) `)`
   }];
 
   let description = [{
-The optional `num_threads` parameter specifies the number of threads which
-should be used to execute the parallel region.
+num_threads clause specifies the desired number of threads in the team
+space formed by the construct on which it appears.
+
+With dims modifier:
+- Uses `num_threads_num_dims` (dimension count) and 
`num_threads_dims_values` (upper bounds list)
+- Specifies upper bounds for each dimension (all must have same type)
+- Format: `num_threads(dims(N): upper_bound_0, ..., upper_bound_N-1 : 
type)`
+- Example: `num_threads(dims(3): %ub0, %ub1, %ub2 : i32)`
+
+Without dims modifier:
+- Uses `num_threads`
+- If lower bound not specified, it defaults to upper bound value
+- Format: `num_threads(bounds : type)`
+- Example: `num_threads(%ub : i32)`
+  }];
+
+  let extraClassDeclaration = [{
+/// Returns true if the dims modifier is explicitly present
+bool hasNumThreadsDimsModifier() {
+  return getNumThreadsNumDims().has_value() && 
getNumThreadsNumDims().value();
+}
+
+/// Returns the number of dimensions specified by dims modifier
+unsigned getNumThreadsDimsCount() {
+  if (!hasNumThreadsDimsModifier())
+return 1;
+  return static_cast<unsigned>(*getNumThreadsNumDims());
+}
+
+/// Returns the value for a specific dimension index
+/// Index must be less than getNumThreadsDimsCount()
+::mlir::Value getNumThreadsDimsValue(unsigned index) {
+  assert(index < getNumThreadsDimsCount() &&
+ "Num threads dims index out of bounds");
+  return getNumThreadsDimsValues()[index];
+}
   }];
 }
 
diff --git a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp 
b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
index 6423d49859c97..ab7bded7835be 100644
--- a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+++ b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
@@ -448,6 +448,8 @@ struct ParallelOpLowering : public 
OpRewritePattern {
 /* allocate_vars = */ llvm::SmallVector{},
 /* allocator_vars = */ llvm::SmallVector{},
 /* if_expr = */ Value{},
+/* num_threads_num_dims = */ nullptr,
+/* num_threads_dims_values = */ llvm::SmallVector{},
 /* num_threads = */ numThreadsVar,
 /* private_vars = */ ValueRange(),
 /* private_syms = */ nullptr,
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp 
b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d4dbf5f5244df..a9ed0274cd21c 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2533,6 +2533,8 @@ void ParallelOp::build(OpBuilder &builder, OperationState 
&state,
ArrayRef attributes) {
   ParallelOp::build(builder, state, /*allocate_vars=*/ValueRange(),
 /*allocator_vars=*/ValueRange(), /*if_expr=*/nullptr,
+/*num_threads_dims=*/nullptr,
+/*num_threads_values=*/ValueRange(),
 /*num_threads=*/nullptr, /*private_vars=*/ValueRange(),
 /*private_syms=*/nullptr, 
/*private_needs_barrier=*/nullptr,
 /*proc_bind_kind=*/nullptr,
@@ -2544,13 +2546,14 @@ void ParallelOp::build(OpBuilder &builder, 
OperationState &state,
 void ParallelOp::build(OpBuilder &builder, OperationState &state,
const ParallelOperands &clauses) {
   MLIRContext *ct

[llvm-branch-commits] [llvm] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)

2025-12-11 Thread Nikita Popov via llvm-branch-commits



@@ -5875,6 +5895,32 @@ SROA::runOnAlloca(AllocaInst &AI) {
 return {Changed, CFGChanged};
   }
 
+  for (auto &P : AS.partitions()) {
+std::optional ProtectedFieldDisc;
+// For now, we can't split if a field is accessed both via protected
+// field and not.
+for (Slice &S : P) {
+  if (auto *II = dyn_cast<IntrinsicInst>(S.getUse()->getUser()))
+if (II->getIntrinsicID() == Intrinsic::lifetime_start ||
+II->getIntrinsicID() == Intrinsic::lifetime_end)
+  continue;
+  if (!ProtectedFieldDisc)
+ProtectedFieldDisc = S.ProtectedFieldDisc;
+  if (*ProtectedFieldDisc != S.ProtectedFieldDisc)

nikic wrote:

What I had in mind here is the case where ProtectedFieldDisc is non-null but 
has different values.
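
To make the two mismatch cases concrete, here is a minimal standalone sketch of the consistency check, not the SROA code itself. `SliceSketch` and `hasConsistentDisc` are hypothetical names, and a plain `const int *` stands in for whatever type the discriminator has in the patch.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a slice. Disc is nullptr when the slice is not
// accessed through a protected field, otherwise it identifies the
// discriminator used for the access.
struct SliceSketch {
  const int *Disc;
};

// Returns true when every slice in the partition carries the same
// discriminator. Both kinds of mismatch are rejected: null vs. non-null,
// and two non-null discriminators with different values.
bool hasConsistentDisc(const std::vector<SliceSketch> &Partition) {
  std::optional<const int *> Seen;
  for (const SliceSketch &S : Partition) {
    if (!Seen)
      Seen = S.Disc; // the first slice fixes the expected discriminator
    if (*Seen != S.Disc)
      return false;
  }
  return true;
}
```

An empty partition and a partition whose slices all share one discriminator (or all have none) pass the check; anything else would have to be left unsplit.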

https://github.com/llvm/llvm-project/pull/151650

[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir-openmp

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `num_threads` clause with the dims
modifier. LLVM IR translation for `num_threads` with the dims modifier is
marked as not yet implemented (NYI).
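
As a rough illustration of the helper accessors this PR adds to the clause, the following standalone sketch models them with plain C++ types. `NumThreadsClauseSketch` and its members are hypothetical stand-ins, not the generated MLIR class; the extra bounds check against the stored values is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of the clause arguments (not the generated MLIR API).
struct NumThreadsClauseSketch {
  std::optional<int64_t> numDims; // models num_threads_num_dims
  std::vector<int> dimsValues;    // models num_threads_dims_values

  // True only when the dims modifier is present with a non-zero count.
  bool hasDimsModifier() const { return numDims.has_value() && *numDims != 0; }

  // Without the dims modifier the clause is treated as one-dimensional.
  unsigned dimsCount() const {
    if (!hasDimsModifier())
      return 1;
    return static_cast<unsigned>(*numDims);
  }

  // Returns the upper bound for one dimension. The check against
  // dimsValues.size() is an extra guard added in this sketch.
  int dimsValue(unsigned index) const {
    assert(index < dimsCount() && index < dimsValues.size() &&
           "dims index out of bounds");
    return dimsValues[index];
  }
};
```

With `dims(3)` and three upper bounds, `dimsCount()` is 3 and each bound is reachable by index; with no modifier, `dimsCount()` falls back to 1.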

---
Full diff: https://github.com/llvm/llvm-project/pull/171767.diff


6 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+42-3) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-7) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-1) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+32-1) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+10-5) 



[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `num_threads` clause with the dims
modifier. LLVM IR translation for `num_threads` with the dims modifier is
marked as not yet implemented (NYI).

---
Full diff: https://github.com/llvm/llvm-project/pull/171767.diff


6 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+42-3) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-7) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-1) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+32-1) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+10-5) 



[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


https://github.com/skc7 ready_for_review 
https://github.com/llvm/llvm-project/pull/171767

[llvm-branch-commits] [llvm] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)

2025-12-11 Thread Nikita Popov via llvm-branch-commits


https://github.com/nikic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/151650

[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


https://github.com/skc7 ready_for_review 
https://github.com/llvm/llvm-project/pull/171825

[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `thread_limit` clause with the dims
modifier. LLVM IR translation for `thread_limit` with the dims modifier is
marked as not yet implemented (NYI).

---

Patch is 21.39 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171825.diff


6 Files Affected:

- (modified) flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp (+10-6) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+40-2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-2) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+8) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+141-8) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+7-1) 


``diff
diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
index 7b61539984232..a3b9e5c76bdd2 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
@@ -766,6 +766,7 @@ FailureOr splitTargetData(omp::TargetOp 
targetOp,
   targetOp.getInReductionSymsAttr(), targetOp.getIsDevicePtrVars(),
   innerMapInfos, targetOp.getNowaitAttr(), targetOp.getPrivateVars(),
   targetOp.getPrivateSymsAttr(), targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
   targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   rewriter.inlineRegionBefore(targetOp.getRegion(), newTargetOp.getRegion(),
   newTargetOp.getRegion().begin());
@@ -1485,8 +1486,9 @@ genPreTargetOp(omp::TargetOp targetOp, SmallVector 
&preMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), preMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *preTargetBlock = rewriter.createBlock(
   &preTargetOp.getRegion(), preTargetOp.getRegion().begin(), {}, {});
   IRMapping preMapping;
@@ -1575,8 +1577,9 @@ genIsolatedTargetOp(omp::TargetOp targetOp, 
SmallVector &postMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *isolatedTargetBlock =
   rewriter.createBlock(&isolatedTargetOp.getRegion(),
isolatedTargetOp.getRegion().begin(), {}, {});
@@ -1655,8 +1658,9 @@ static omp::TargetOp genPostTargetOp(omp::TargetOp 
targetOp,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   // Create the block for postTargetOp
   auto *postTargetBlock = rewriter.createBlock(
   &postTargetOp.getRegion(), postTargetOp.getRegion().begin(), {}, {});
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..4a0d1fd0af02c 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1452,15 +1452,53 @@ class OpenMP_ThreadLimitClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$thread_limit_num_dims,
+Variadic:$thread_limit_dims_values,
 Optional:$thread_limit
   );
 
   let optAssemblyFormat = [{
-`thread_limit` `(` $thread_limit `:` type($thread_limit) `)`
+`thread_limit` `(` custom(
+  $thread_limit_num_dims, $thread_limit_dims_values, 
type($thread_limit_dims_values),
+  $thread_limit, type($thread_limit)
+) `)`
   }];
 
   let description = [{
-The optional `thread_limit` specifies the limit on the number of threads.
+The `thread_limit` clause specifies the limit on the number of threads.
+
+With di

[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `thread_limit` clause with the dims
modifier. LLVM IR translation for `thread_limit` with the dims modifier is
marked as not yet implemented (NYI).

---

Patch is 21.39 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171825.diff


6 Files Affected:

- (modified) flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp (+10-6) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+40-2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-2) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+8) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+141-8) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+7-1) 



[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:



@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Chaitanya (skc7)


Changes

This PR adds support for the OpenMP 6.1 `thread_limit` clause with the dims
modifier. LLVM IR translation for `thread_limit` with the dims modifier is
marked as not yet implemented (NYI).

---

Patch is 21.39 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171825.diff


6 Files Affected:

- (modified) flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp (+10-6) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+40-2) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+70-2) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+8) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+141-8) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+7-1) 



[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-11 Thread via llvm-branch-commits


https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/171767

>From 1c69d29651bb1b73c04cca422454eb7d7c4c Mon Sep 17 00:00:00 2001
From: skc7 
Date: Thu, 11 Dec 2025 11:56:58 +0530
Subject: [PATCH 1/3] [OpenMP][MLIR] Add num_threads clause with dims modifier
 support

---
 .../mlir/Dialect/OpenMP/OpenMPClauses.td  | 50 +++-
 .../Conversion/SCFToOpenMP/SCFToOpenMP.cpp|  2 +
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  | 79 +--
 mlir/test/Dialect/OpenMP/invalid.mlir | 33 +++-
 mlir/test/Dialect/OpenMP/ops.mlir | 15 ++--
 5 files changed, 163 insertions(+), 16 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..7525b6e4e99f6 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1069,16 +1069,60 @@ class OpenMP_NumThreadsClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$num_threads_dims,
+Variadic:$num_threads_values,
 Optional:$num_threads
   );
 
   let optAssemblyFormat = [{
-`num_threads` `(` $num_threads `:` type($num_threads) `)`
+`num_threads` `(` custom(
+  $num_threads_dims, $num_threads_values, type($num_threads_values),
+  $num_threads, type($num_threads)
+) `)`
   }];
 
   let description = [{
-The optional `num_threads` parameter specifies the number of threads which
-should be used to execute the parallel region.
+num_threads clause specifies the desired number of threads in the team
+space formed by the construct on which it appears.
+
+With dims modifier:
+- Uses `num_threads_dims` (dimension count) and `num_threads_values` 
(upper bounds list)
+- Specifies upper bounds for each dimension (all must have same type)
+- Format: `num_threads(dims(N): upper_bound_0, ..., upper_bound_N-1 : 
type)`
+- Example: `num_threads(dims(3): %ub0, %ub1, %ub2 : i32)`
+
+Without dims modifier:
+- Uses `num_threads`
+- If lower bound not specified, it defaults to upper bound value
+- Format: `num_threads(bounds : type)`
+- Example: `num_threads(%ub : i32)`
+  }];
+
+  let extraClassDeclaration = [{
+/// Returns true if the dims modifier is explicitly present
+bool hasDimsModifier() {
+  return getNumThreadsDims().has_value();
+}
+
+/// Returns the number of dimensions specified by dims modifier
+unsigned getNumDimensions() {
+  if (!hasDimsModifier())
+return 1;
+  return static_cast<unsigned>(*getNumThreadsDims());
+}
+
+/// Returns all dimension values as an operand range
+::mlir::OperandRange getDimensionValues() {
+  return getNumThreadsValues();
+}
+
+/// Returns the value for a specific dimension index
+/// Index must be less than getNumDimensions()
+::mlir::Value getDimensionValue(unsigned index) {
+  assert(index < getDimensionValues().size() &&
+ "Dimension index out of bounds");
+  return getDimensionValues()[index];
+}
   }];
 }
 
diff --git a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp 
b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
index 6423d49859c97..0d5333ec2e455 100644
--- a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+++ b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
@@ -448,6 +448,8 @@ struct ParallelOpLowering : public 
OpRewritePattern {
 /* allocate_vars = */ llvm::SmallVector{},
 /* allocator_vars = */ llvm::SmallVector{},
 /* if_expr = */ Value{},
+/* num_threads_dims = */ nullptr,
+/* num_threads_values = */ llvm::SmallVector{},
 /* num_threads = */ numThreadsVar,
 /* private_vars = */ ValueRange(),
 /* private_syms = */ nullptr,
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp 
b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d4dbf5f5244df..303ab94fbedff 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2533,6 +2533,8 @@ void ParallelOp::build(OpBuilder &builder, OperationState 
&state,
ArrayRef attributes) {
   ParallelOp::build(builder, state, /*allocate_vars=*/ValueRange(),
 /*allocator_vars=*/ValueRange(), /*if_expr=*/nullptr,
+/*num_threads_dims=*/nullptr,
+/*num_threads_values=*/ValueRange(),
 /*num_threads=*/nullptr, /*private_vars=*/ValueRange(),
 /*private_syms=*/nullptr, 
/*private_needs_barrier=*/nullptr,
 /*proc_bind_kind=*/nullptr,
@@ -2544,13 +2546,14 @@ void ParallelOp::build(OpBuilder &builder, 
OperationState &state,
 void ParallelOp::build(OpBuilder &builder, OperationState &state,
const ParallelOperands &clauses) {
   MLIRContext *ctx = builder.getContext();
-  ParallelOp::build(builder, state, clau

[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-11 Thread via llvm-branch-commits


https://github.com/skc7 created https://github.com/llvm/llvm-project/pull/171825

This PR adds support for the OpenMP 6.1 feature `thread_limit` with the dims modifier.
LLVM IR translation for `thread_limit` with the dims modifier is marked as not yet implemented (NYI).

>From d02544b205133749563f6222fd0e71c863226d3c Mon Sep 17 00:00:00 2001
From: skc7 
Date: Thu, 11 Dec 2025 13:35:05 +0530
Subject: [PATCH] [OpenMP][MLIR] Add thread_limit with dims modifier support

---
 .../Optimizer/OpenMP/LowerWorkdistribute.cpp  |  16 +-
 .../mlir/Dialect/OpenMP/OpenMPClauses.td  |  29 +++-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  |  72 -
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |   8 +
 mlir/test/Dialect/OpenMP/invalid.mlir | 149 +-
 mlir/test/Dialect/OpenMP/ops.mlir |   8 +-
 6 files changed, 264 insertions(+), 18 deletions(-)

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
index 7b61539984232..a3b9e5c76bdd2 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
@@ -766,6 +766,7 @@ FailureOr splitTargetData(omp::TargetOp 
targetOp,
   targetOp.getInReductionSymsAttr(), targetOp.getIsDevicePtrVars(),
   innerMapInfos, targetOp.getNowaitAttr(), targetOp.getPrivateVars(),
   targetOp.getPrivateSymsAttr(), targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
   targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   rewriter.inlineRegionBefore(targetOp.getRegion(), newTargetOp.getRegion(),
   newTargetOp.getRegion().begin());
@@ -1485,8 +1486,9 @@ genPreTargetOp(omp::TargetOp targetOp, SmallVector 
&preMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), preMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *preTargetBlock = rewriter.createBlock(
   &preTargetOp.getRegion(), preTargetOp.getRegion().begin(), {}, {});
   IRMapping preMapping;
@@ -1575,8 +1577,9 @@ genIsolatedTargetOp(omp::TargetOp targetOp, 
SmallVector &postMapOperands,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   auto *isolatedTargetBlock =
   rewriter.createBlock(&isolatedTargetOp.getRegion(),
isolatedTargetOp.getRegion().begin(), {}, {});
@@ -1655,8 +1658,9 @@ static omp::TargetOp genPostTargetOp(omp::TargetOp 
targetOp,
   targetOp.getInReductionByrefAttr(), targetOp.getInReductionSymsAttr(),
   targetOp.getIsDevicePtrVars(), postMapOperands, targetOp.getNowaitAttr(),
   targetOp.getPrivateVars(), targetOp.getPrivateSymsAttr(),
-  targetOp.getPrivateNeedsBarrierAttr(), targetOp.getThreadLimit(),
-  targetOp.getPrivateMapsAttr());
+  targetOp.getPrivateNeedsBarrierAttr(),
+  targetOp.getThreadLimitNumDimsAttr(), 
targetOp.getThreadLimitDimsValues(),
+  targetOp.getThreadLimit(), targetOp.getPrivateMapsAttr());
   // Create the block for postTargetOp
   auto *postTargetBlock = rewriter.createBlock(
   &postTargetOp.getRegion(), postTargetOp.getRegion().begin(), {}, {});
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index e36dc7c246f01..366855bf02968 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1452,16 +1452,43 @@ class OpenMP_ThreadLimitClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$thread_limit_num_dims,
+Variadic:$thread_limit_dims_values,
 Optional:$thread_limit
   );
 
   let optAssemblyFormat = [{
-`thread_limit` `(` $thread_limit `:` type($thread_limit) `)`
+`thread_limit` `(` custom(
+  $thread_limit_num_dims, $thread_limit_dims_values, 
type($thread_limit_dims_values),
+  $thread_limit, type($thread_limit)
+) `)`
   }];
 
   let description = [{
 The optional `thread_limit` specifies the limit on the number of threads.
   }];
+
+  let extraClassDeclaration = [{
+///

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Yingwei Zheng via llvm-branch-commits



@@ -5553,6 +5553,37 @@ void computeKnownFPClass(const Value *V, const APInt 
&DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -0 for -0 inputs.
+  Known.knownNot(fcNegInf | fcNegSubnormal | fcNegNormal);

dtcxzyw wrote:

`rsq(-0.0) = -inf`.
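
A minimal sketch of why that holds, assuming the result is modelled as `1.0 / sqrt(x)` in IEEE-754 arithmetic. `rsqReference` is a hypothetical helper used only for illustration; it is not the hardware instruction or any LLVM API.

```cpp
#include <cmath>

// Hypothetical reference model (an assumption, not the AMDGPU instruction):
// reciprocal square root built from IEEE-754 sqrt and division.
double rsqReference(double x) {
  // IEEE-754 defines sqrt(-0.0) as -0.0, and 1.0 / -0.0 as -infinity,
  // so a -0.0 input produces a negative infinity rather than -0.0.
  return 1.0 / std::sqrt(x);
}
```

Under this model, `std::isinf(rsqReference(-0.0))` and `std::signbit(rsqReference(-0.0))` are both true, so `fcNegInf` cannot be excluded unless the input is known not to be -0.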

https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Yingwei Zheng via llvm-branch-commits



@@ -5553,6 +5553,37 @@ void computeKnownFPClass(const Value *V, const APInt 
&DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -0 for -0 inputs.
+  Known.knownNot(fcNegInf | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, 
InterestedClasses,
+  KnownSrc, Q, Depth + 1);
+
+  if (KnownSrc.isKnownNever(fcSNan))
+Known.knownNot(fcSNan);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+Known.knownNot(fcNan);
+
+  Type *EltTy = II->getType()->getScalarType();
+
+  // f32 denormal always flushed.
+  if (EltTy->isFloatTy())
+Known.knownNot(fcPosSubnormal);
+  else {

dtcxzyw wrote:

The then part and the else part are orthogonal.

https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Yingwei Zheng via llvm-branch-commits



@@ -5553,6 +5553,37 @@ void computeKnownFPClass(const Value *V, const APInt 
&DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -0 for -0 inputs.
+  Known.knownNot(fcNegInf | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, 
InterestedClasses,
+  KnownSrc, Q, Depth + 1);
+
+  if (KnownSrc.isKnownNever(fcSNan))
+Known.knownNot(fcSNan);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+Known.knownNot(fcNan);
+
+  Type *EltTy = II->getType()->getScalarType();
+
+  // f32 denormal always flushed.
+  if (EltTy->isFloatTy())
+Known.knownNot(fcPosSubnormal);

dtcxzyw wrote:

This behavior is not documented for older architectures (Vega/GCN 3).

https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [llvm] ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (PR #171837)

2025-12-11 Thread Yingwei Zheng via llvm-branch-commits



@@ -5553,6 +5553,37 @@ void computeKnownFPClass(const Value *V, const APInt 
&DemandedElts,
 
   // TODO: Copy inf handling from instructions
   break;
+case Intrinsic::amdgcn_rsq: {
+  KnownFPClass KnownSrc;
+  // The only negative value that can be returned is -0 for -0 inputs.
+  Known.knownNot(fcNegInf | fcNegSubnormal | fcNegNormal);
+
+  computeKnownFPClass(II->getArgOperand(0), DemandedElts, 
InterestedClasses,
+  KnownSrc, Q, Depth + 1);
+
+  if (KnownSrc.isKnownNever(fcSNan))
+Known.knownNot(fcSNan);
+
+  // Negative -> nan
+  if (KnownSrc.isKnownNeverNaN() && KnownSrc.cannotBeOrderedLessThanZero())
+Known.knownNot(fcNan);
+
+  Type *EltTy = II->getType()->getScalarType();
+
+  // f32 denormal always flushed.
+  if (EltTy->isFloatTy())
+Known.knownNot(fcPosSubnormal);
+  else {
+const Function *F = II->getFunction();
+if (Q.IIQ.hasNoSignedZeros(II) ||
+(F && KnownSrc.isKnownNeverLogicalNegZero(

dtcxzyw wrote:

See my previous comment.

https://github.com/llvm/llvm-project/pull/171837

[llvm-branch-commits] [NFCI][ELF][AArch64][PAC] Teach addRelativeReloc to emit R_AARCH64_AUTH_RELATIVE (PR #171180)

2025-12-11 Thread Florian Mayer via llvm-branch-commits



@@ -704,8 +704,10 @@ static void addRelativeReloc(Ctx &ctx, InputSectionBase 
&isec,
  uint64_t offsetInSec, Symbol &sym, int64_t addend,
  RelExpr expr, RelType type) {
   Partition &part = isec.getPartition(ctx);
+  bool isAArch64Auth =
+  ctx.arg.emachine == EM_AARCH64 && type == R_AARCH64_AUTH_ABS64;
 
-  if (sym.isTagged()) {
+  if (sym.isTagged() && !isAArch64Auth) {

fmayer wrote:

Yes, MTE and PAuth ABI are not mutually exclusive, as they can use different 
bits.
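
A rough sketch of the "different bits" point, assuming 48-bit virtual addresses with top-byte-ignore enabled. The masks, helper names, and the 7-bit signature field are illustrative assumptions; the real PAC field width depends on the system configuration, and a real signature is computed by hardware, not passed in as a value.

```cpp
#include <cstdint>

// Assumed layout for illustration only (48-bit VA, top-byte-ignore enabled).
constexpr std::uint64_t kMteTagMask = 0xFULL << 56;  // bits 59:56, MTE logical tag
constexpr std::uint64_t kPacMask = 0x7FULL << 48;    // bits 54:48, PAC field (assumed width)

// The two fields do not overlap, so one pointer can carry both.
static_assert((kMteTagMask & kPacMask) == 0, "MTE tag and PAC fields overlap");

// Hypothetical helper: place a 4-bit MTE tag into the pointer.
constexpr std::uint64_t withMteTag(std::uint64_t ptr, std::uint64_t tag) {
  return (ptr & ~kMteTagMask) | ((tag & 0xF) << 56);
}

// Hypothetical helper: place a placeholder signature into the PAC field.
constexpr std::uint64_t withPacField(std::uint64_t ptr, std::uint64_t sig) {
  return (ptr & ~kPacMask) | ((sig & 0x7F) << 48);
}
```

Applying both helpers to the same address leaves the low 48 address bits and each field independently recoverable.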

https://github.com/llvm/llvm-project/pull/171180

[llvm-branch-commits] [llvm] [WIP][CodeGen][DebugInfo][RISCV] Support scalable offsets in CFI (PR #170607)

2025-12-11 Thread Mikhail Gudim via llvm-branch-commits

mgudim wrote:

> > @ppenzin I am in the process of posting the commits myself, this may not be 
> > entirely ready. Please close these PRs.
> 
> I'm entirely sympathetic to your wanting these commits credited to you, but 
> I've been finding these _much_ easier to understand than the existing 
> patches. 

I should clarify the situation with branches / patches:

(1) There are two proof-of-concept (POC) branches:
the original one: https://github.com/llvm/llvm-project/pull/90819
and one rebased on a more recent, but still old, `main`: 
https://github.com/mgudim/llvm-project/tree/rebased_save_csr_in_ra
Each consists of one huge commit and is impossible to understand. They are not 
for review; they are just proofs of concept.

(2) I have another branch: 
https://github.com/mgudim/llvm-project/tree/save_csr_in_ra3 where I split up 
all the work from (1) into small commits. This branch is still in a broken 
state and is missing 3-4 commits.

(3) As commits in (2) become ready to be merged into `main`, I am posting them 
as individual PRs:
https://github.com/llvm/llvm-project/pull/168869
https://github.com/llvm/llvm-project/pull/168531
https://github.com/llvm/llvm-project/pull/166773
https://github.com/llvm/llvm-project/pull/166763
https://github.com/llvm/llvm-project/pull/164480

This commit was just taken from (2). I didn't post it myself because it's not 
ready yet; I still have the five PRs above to merge.

> I really don't want to lose that until your patches are in a state where 
> they really do explain what is being proposed.

Sure, we can keep it just to save this discussion, but in terms of content this 
is just a commit from (2).

I am going to post this explanation on the original PR too.

https://github.com/llvm/llvm-project/pull/170607

[llvm-branch-commits] [llvm] [AMDGPU] Add DS loop preheader flush (3/4) (PR #171948)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: None (hidekisaito)


Changes

Add insertDSPreheaderFlushes() to insert S_WAIT_DSCNT 0 in loop preheaders when 
DS wait relaxation was applied.

Assisted-by: Cursor / claude-4.5-opus-high

Depends on https://github.com/llvm/llvm-project/pull/171944

---
Full diff: https://github.com/llvm/llvm-project/pull/171948.diff


2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp (+67) 
- (modified) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir (+4-2) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp 
b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 777491fb58b80..28bc57ed2db4e 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -605,6 +605,7 @@ class SIInsertWaitcnts {
   std::optional getOptimalDSWaitCount(MachineBasicBlock *LoopHeader,
 const MachineInstr &MI) const;
   bool applyDSLoopWaitOpt(MachineInstr &MI, AMDGPU::Waitcnt &Wait);
+  bool insertDSPreheaderFlushes(MachineFunction &MF);
 };
 
 // This objects maintains the current score brackets of each wait counter, and
@@ -2904,6 +2905,68 @@ bool SIInsertWaitcnts::applyDSLoopWaitOpt(MachineInstr 
&MI,
   return true;
 }
 
+// Insert DS_CNT flush in preheaders of loops where DS wait relaxation was
+// applied. This is necessary because the relaxed wait counts inside the loop
+// are computed based on the DS loads issued at the end of the previous
+// iteration (via backedge), but the first iteration enters via the preheader.
+// We must ensure all DS loads from the preheader are complete before entering
+// the loop.
+bool SIInsertWaitcnts::insertDSPreheaderFlushes(MachineFunction &MF) {
+  bool Modified = false;
+
+  for (auto &[LoopHeader, Info] : LoopDSWaitOptCache) {
+if (!Info.Valid || !Info.RelaxationApplied)
+  continue;
+
+MachineLoop *ML = MLI->getLoopFor(LoopHeader);
+if (!ML)
+  continue;
+
+MachineBasicBlock *Preheader = ML->getLoopPreheader();
+if (!Preheader)
+  continue;
+
+// Insert s_wait_dscnt 0 at the end of the preheader (before the 
terminator)
+MachineBasicBlock::iterator InsertPos = Preheader->getFirstTerminator();
+if (InsertPos == Preheader->end() && !Preheader->empty())
+  InsertPos = std::prev(Preheader->end());
+
+// Check if there's already a DS wait at this position
+bool NeedInsert = true;
+if (InsertPos != Preheader->end() && InsertPos != Preheader->begin()) {
+  auto CheckPos = std::prev(InsertPos);
+  if (CheckPos->getOpcode() == AMDGPU::S_WAIT_DSCNT_soft ||
+  CheckPos->getOpcode() == AMDGPU::S_WAIT_DSCNT) {
+if (CheckPos->getOperand(0).getImm() == 0)
+  NeedInsert = false;
+else {
+  // Change existing wait to 0
+  CheckPos->getOperand(0).setImm(0);
+  NeedInsert = false;
+  Modified = true;
+  LLVM_DEBUG(dbgs() << "DS Loop Opt: Changed existing DS_CNT wait to 0"
+<< " in preheader ";
+ Preheader->printName(dbgs()); dbgs() << "\n");
+}
+  }
+}
+
+if (NeedInsert) {
+  DebugLoc DL;
+  if (InsertPos != Preheader->end())
+DL = InsertPos->getDebugLoc();
+  BuildMI(*Preheader, InsertPos, DL, TII->get(AMDGPU::S_WAIT_DSCNT_soft))
+  .addImm(0);
+  Modified = true;
+  LLVM_DEBUG(dbgs() << "DS Loop Opt: Inserted DS_CNT flush in preheader ";
+ Preheader->printName(dbgs()); dbgs() << " for loop at ";
+ LoopHeader->printName(dbgs()); dbgs() << "\n");
+}
+  }
+
+  return Modified;
+}
+
 // Return true if it is better to flush the vmcnt counter in the preheader of
 // the given loop. We currently decide to flush in two situations:
 // 1. The loop contains vmem store(s), no vmem load and at least one use of a
@@ -3250,6 +3313,10 @@ bool SIInsertWaitcnts::run(MachineFunction &MF) {
   }
 }
   }
+
+  // Insert DS_CNT flushes in preheaders of loops that had wait counts relaxed.
+  Modified |= insertDSPreheaderFlushes(MF);
+
   ReleaseVGPRInsts.clear();
   PreheadersToFlush.clear();
   LoopDSWaitOptCache.clear();
diff --git a/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir 
b/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
index 48fdabf255e6f..e6237338fda5b 100644
--- a/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
+++ b/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
@@ -17,6 +17,7 @@
 # DBG: Loop DS Wait Opt: Loop at bb.1 - 16 DS loads, 8 WMMA/MFMA, {{[0-9]+}} 
total insts, eligible
 # DBG: Loop DS Wait Opt: Analyzed loop at bb.1 - 16 DS loads, HasBarrier=1, 
Valid=1
 # DBG: DS Loop Opt: Relaxing DsCnt from 0 to 12 for:
+# DBG: DS Loop Opt: Inserted DS_CNT flush in preheader bb.0 for loop at bb.1
 
 --- |
   define amdgpu_kernel void @ds_loop_eligible() { ret void }
@@ -31,9 +32,10 @@ mach

[llvm-branch-commits] [llvm] [AMDGPU] Add DS loop preheader flush (3/4) (PR #171948)

2025-12-11 Thread via llvm-branch-commits


https://github.com/hidekisaito created 
https://github.com/llvm/llvm-project/pull/171948

Add insertDSPreheaderFlushes() to insert S_WAIT_DSCNT 0 in loop preheaders when 
DS wait relaxation was applied.

Assisted-by: Cursor / claude-4.5-opus-high

Depends on https://github.com/llvm/llvm-project/pull/171944
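
The idea can be sketched on a toy model, assuming a basic block is just a list of instructions. `Inst`, `Block`, and `flushDSBeforeTerminator` are hypothetical names for illustration and are not LLVM types or the patch's code. One simplification is labeled here: when the block has no terminator, this sketch appends the wait at the end.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (not an LLVM type).
struct Inst {
  std::string opcode;        // e.g. "ds_load", "s_wait_dscnt", "s_branch"
  int imm = 0;               // wait count for "s_wait_dscnt"
  bool isTerminator = false;
};

using Block = std::vector<Inst>;

// Make sure the preheader waits for all outstanding DS operations before
// control reaches the loop. Returns true if the block was modified.
bool flushDSBeforeTerminator(Block &preheader) {
  // Locate the first terminator; the wait must be placed before it.
  std::size_t pos = 0;
  while (pos < preheader.size() && !preheader[pos].isTerminator)
    ++pos;

  // Reuse a DS wait that already sits directly before the insertion point.
  if (pos > 0 && preheader[pos - 1].opcode == "s_wait_dscnt") {
    if (preheader[pos - 1].imm == 0)
      return false;          // already a full flush, nothing to do
    preheader[pos - 1].imm = 0;  // tighten the existing wait to 0
    return true;
  }

  // Otherwise insert a new wait with count 0.
  Inst wait;
  wait.opcode = "s_wait_dscnt";
  wait.imm = 0;
  preheader.insert(preheader.begin() + static_cast<Block::difference_type>(pos),
                   wait);
  return true;
}
```

Calling the helper twice on the same block changes it at most once, which matches the intent of checking for an existing wait before inserting a new one.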

>From 70beea81a01952a7de4cbb0d33c060d9946c05a5 Mon Sep 17 00:00:00 2001
From: Hideki Saito 
Date: Thu, 11 Dec 2025 20:02:23 -0500
Subject: [PATCH] [AMDGPU] Add DS loop preheader flush (3/4)

Add insertDSPreheaderFlushes() to insert S_WAIT_DSCNT 0 in loop preheaders
when DS wait relaxation was applied.

Assisted-by: Cursor / claude-4.5-opus-high
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   | 67 +++
 .../AMDGPU/waitcnt-loop-ds-opt-eligible.mir   |  6 +-
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp 
b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 777491fb58b80..28bc57ed2db4e 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -605,6 +605,7 @@ class SIInsertWaitcnts {
   std::optional getOptimalDSWaitCount(MachineBasicBlock *LoopHeader,
 const MachineInstr &MI) const;
   bool applyDSLoopWaitOpt(MachineInstr &MI, AMDGPU::Waitcnt &Wait);
+  bool insertDSPreheaderFlushes(MachineFunction &MF);
 };
 
 // This objects maintains the current score brackets of each wait counter, and
@@ -2904,6 +2905,68 @@ bool SIInsertWaitcnts::applyDSLoopWaitOpt(MachineInstr 
&MI,
   return true;
 }
 
+// Insert DS_CNT flush in preheaders of loops where DS wait relaxation was
+// applied. This is necessary because the relaxed wait counts inside the loop
+// are computed based on the DS loads issued at the end of the previous
+// iteration (via backedge), but the first iteration enters via the preheader.
+// We must ensure all DS loads from the preheader are complete before entering
+// the loop.
+bool SIInsertWaitcnts::insertDSPreheaderFlushes(MachineFunction &MF) {
+  bool Modified = false;
+
+  for (auto &[LoopHeader, Info] : LoopDSWaitOptCache) {
+if (!Info.Valid || !Info.RelaxationApplied)
+  continue;
+
+MachineLoop *ML = MLI->getLoopFor(LoopHeader);
+if (!ML)
+  continue;
+
+MachineBasicBlock *Preheader = ML->getLoopPreheader();
+if (!Preheader)
+  continue;
+
+// Insert s_wait_dscnt 0 at the end of the preheader (before the 
terminator)
+MachineBasicBlock::iterator InsertPos = Preheader->getFirstTerminator();
+if (InsertPos == Preheader->end() && !Preheader->empty())
+  InsertPos = std::prev(Preheader->end());
+
+// Check if there's already a DS wait at this position
+bool NeedInsert = true;
+if (InsertPos != Preheader->end() && InsertPos != Preheader->begin()) {
+  auto CheckPos = std::prev(InsertPos);
+  if (CheckPos->getOpcode() == AMDGPU::S_WAIT_DSCNT_soft ||
+  CheckPos->getOpcode() == AMDGPU::S_WAIT_DSCNT) {
+if (CheckPos->getOperand(0).getImm() == 0)
+  NeedInsert = false;
+else {
+  // Change existing wait to 0
+  CheckPos->getOperand(0).setImm(0);
+  NeedInsert = false;
+  Modified = true;
+  LLVM_DEBUG(dbgs() << "DS Loop Opt: Changed existing DS_CNT wait to 0"
+<< " in preheader ";
+ Preheader->printName(dbgs()); dbgs() << "\n");
+}
+  }
+}
+
+if (NeedInsert) {
+  DebugLoc DL;
+  if (InsertPos != Preheader->end())
+DL = InsertPos->getDebugLoc();
+  BuildMI(*Preheader, InsertPos, DL, TII->get(AMDGPU::S_WAIT_DSCNT_soft))
+  .addImm(0);
+  Modified = true;
+  LLVM_DEBUG(dbgs() << "DS Loop Opt: Inserted DS_CNT flush in preheader ";
+ Preheader->printName(dbgs()); dbgs() << " for loop at ";
+ LoopHeader->printName(dbgs()); dbgs() << "\n");
+}
+  }
+
+  return Modified;
+}
+
 // Return true if it is better to flush the vmcnt counter in the preheader of
 // the given loop. We currently decide to flush in two situations:
 // 1. The loop contains vmem store(s), no vmem load and at least one use of a
@@ -3250,6 +3313,10 @@ bool SIInsertWaitcnts::run(MachineFunction &MF) {
   }
 }
   }
+
+  // Insert DS_CNT flushes in preheaders of loops that had wait counts relaxed.
+  Modified |= insertDSPreheaderFlushes(MF);
+
   ReleaseVGPRInsts.clear();
   PreheadersToFlush.clear();
   LoopDSWaitOptCache.clear();
diff --git a/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir 
b/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
index 48fdabf255e6f..e6237338fda5b 100644
--- a/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
+++ b/llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
@@ -17,6 +17,7 @@
 # DBG: Loop DS Wait Opt: Loop at bb.1 - 16 DS loads, 8 WMMA/MFMA, {{[0-9]+}} 
total insts, eligible
 # DBG: Loop DS Wait Opt: Analyzed l

[llvm-branch-commits] [NFCI][ELF][AArch64][PAC] Teach addRelativeReloc to emit R_AARCH64_AUTH_RELATIVE (PR #171180)

2025-12-11 Thread Jessica Clarke via llvm-branch-commits



@@ -704,8 +704,10 @@ static void addRelativeReloc(Ctx &ctx, InputSectionBase 
&isec,
  uint64_t offsetInSec, Symbol &sym, int64_t addend,
  RelExpr expr, RelType type) {
   Partition &part = isec.getPartition(ctx);
+  bool isAArch64Auth =
+  ctx.arg.emachine == EM_AARCH64 && type == R_AARCH64_AUTH_ABS64;
 
-  if (sym.isTagged()) {
+  if (sym.isTagged() && !isAArch64Auth) {

jrtc27 wrote:

Is the existing implementation that uses .relr.auth.dyn and/or no offset to the 
start of the symbol for AUTH_RELATIVE relocations against tagged symbols 
correct? I am assuming not, and that it should be doing the "obvious" 
composition of the two, which would also simplify this patch.

https://github.com/llvm/llvm-project/pull/171180

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add a "Home" link to navbar (PR #171900)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :penguin: Linux x64 Test Results

* 3048 tests passed
* 7 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### Extra Tools Unit Tests

Extra Tools Unit 
Tests.clang-doc/_/ClangDocTests/JSONGeneratorTest/emitRecordJSON

```
Script:
--
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/tools/extra/unittests/clang-doc/./ClangDocTests
 --gtest_filter=JSONGeneratorTest.emitRecordJSON
--
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/unittests/clang-doc/JSONGeneratorTest.cpp:192
Expected equality of these values:
  Expected
Which is: "{\n  \"Bases\": [\n{\n  \"Access\": \"public\",\n  
\"End\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasPublicMembers\": true,\n  \"InfoType\": \"record\",\n  
\"IsParent\": true,\n  \"IsTypedef\": false,\n  \"IsVirtual\": true,\n  
\"MangledName\": \"\",\n  \"Name\": \"F\",\n  \"Path\": 
\"path/to/F\",\n  \"PublicFunctions\": [\n{\n  
\"InfoType\": \"function\",\n  \"IsStatic\": false,\n  
\"Name\": \"InheritedFunctionOne\",\n  \"ReturnType\": {\n
\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  
\"PublicMembers\": [\n{\n  \"IsStatic\": false,\n  
\"Name\": \"N\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"TagType\": \"struct\",\n  \"USR\": 
\"\"\n}\n  ],\n  \"Enums\": [\n
{\n  \"End\": true,\n  \"InfoType\": \"enum\",\n  \"Members\": [\n  
  {\n  \"End\": true,\n  \"Name\": \"RED\",\n  
\"Value\": \"0\"\n}\n  ],\n  \"Name\": \"Color\",\n  
\"Scoped\": false,\n  \"USR\": 
\"\"\n}\n  ],\n  \"HasEnums\": 
true,\n  \"HasParents\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasRecords\": true,\n  \"HasVirtualParents\": true,\n  \"InfoType\": 
\"record\",\n  \"IsTypedef\": false,\n  \"Location\": {\n\"Filename\": 
\"main.cpp\",\n\"LineNumber\": 1\n  },\n  \"MangledName\": \"\",\n  
\"Name\": \"Foo\",\n  \"Namespace\": [\n\"GlobalNamespace\"\n  ],\n  
\"Parents\": [\n{\n  \"End\": true,\n  \"Name\": \"F\",\n  
\"Path\": \"\",\n  \"QualName\": \"\",\n  \"USR\": 
\"\"\n}\n  ],\n  \"Path\": 
\"GlobalNamespace\",\n  \"ProtectedMembers\": [\n{\n  \"IsStatic\": 
false,\n  \"Name\": \"X\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"PublicFunctions\": [\n{\n  \"InfoType\": \"function\",\n  
\"IsStatic\": false,\n  \"Name\": \"OneFunction\",\n  \"ReturnType\": 
{\n\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  \"Records\": [\n   
 {\n  \"End\": true,\n  \"Name\": \"ChildStruct\",\n  \"Path\": 
\"path/to/A/r\",\n  \"QualName\": \"path::to::A::r::ChildStruct\",\n  
\"USR\": \"\"\n}\n  ],\n  
\"TagType\": \"class\",\n  \"Template\": {\n\"Parameters\": [\n  
\"class T\"\n]\n  },\n  \"USR\": 
\"\",\n  \"VirtualParents\": [\n{\n 
 \"End\": true,\n  \"Name\": \"G\",\n  \"Path\": \"path/to/G\",\n   
   \"QualName\": \"path::to::G::G\",\n  \"USR\": 
\"\"\n}\n  ]\n}"
  Actual.str()
Which is: "{\n  \"Bases\": [\n{\n  \"Access\": \"public\",\n  
\"End\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasPublicMembers\": true,\n  \"InfoType\": \"record\",\n  
\"IsParent\": true,\n  \"IsTypedef\": false,\n  \"IsVirtual\": true,\n  
\"MangledName\": \"\",\n  \"Name\": \"F\",\n  \"Path\": 
\"path/to/F\",\n  \"PublicFunctions\": [\n{\n  
\"InfoType\": \"function\",\n  \"IsStatic\": false,\n  
\"Name\": \"InheritedFunctionOne\",\n  \"ReturnType\": {\n
\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  
\"PublicMembers\": [\n{\n  \"IsStatic\": false,\n  
\"Name\": \"N\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"TagType\": \"struct\",\n  \"USR\

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add a "Home" link to navbar (PR #171900)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :window: Windows x64 Test Results

* 2986 tests passed
* 30 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### Extra Tools Unit Tests

Extra Tools Unit 
Tests.clang-doc/_/ClangDocTests_exe/JSONGeneratorTest/emitRecordJSON

```
Script:
--
C:\_work\llvm-project\llvm-project\build\tools\clang\tools\extra\unittests\clang-doc\.\ClangDocTests.exe
 --gtest_filter=JSONGeneratorTest.emitRecordJSON
--
C:\_work\llvm-project\llvm-project\clang-tools-extra\unittests\clang-doc\JSONGeneratorTest.cpp:192
Expected equality of these values:
  Expected
Which is: "{\n  \"Bases\": [\n{\n  \"Access\": \"public\",\n  
\"End\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasPublicMembers\": true,\n  \"InfoType\": \"record\",\n  
\"IsParent\": true,\n  \"IsTypedef\": false,\n  \"IsVirtual\": true,\n  
\"MangledName\": \"\",\n  \"Name\": \"F\",\n  \"Path\": 
\"path/to/F\",\n  \"PublicFunctions\": [\n{\n  
\"InfoType\": \"function\",\n  \"IsStatic\": false,\n  
\"Name\": \"InheritedFunctionOne\",\n  \"ReturnType\": {\n
\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  
\"PublicMembers\": [\n{\n  \"IsStatic\": false,\n  
\"Name\": \"N\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"TagType\": \"struct\",\n  \"USR\": 
\"\"\n}\n  ],\n  \"Enums\": [\n
{\n  \"End\": true,\n  \"InfoType\": \"enum\",\n  \"Members\": [\n  
  {\n  \"End\": true,\n  \"Name\": \"RED\",\n  
\"Value\": \"0\"\n}\n  ],\n  \"Name\": \"Color\",\n  
\"Scoped\": false,\n  \"USR\": 
\"\"\n}\n  ],\n  \"HasEnums\": 
true,\n  \"HasParents\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasRecords\": true,\n  \"HasVirtualParents\": true,\n  \"InfoType\": 
\"record\",\n  \"IsTypedef\": false,\n  \"Location\": {\n\"Filename\": 
\"main.cpp\",\n\"LineNumber\": 1\n  },\n  \"MangledName\": \"\",\n  
\"Name\": \"Foo\",\n  \"Namespace\": [\n\"GlobalNamespace\"\n  ],\n  
\"Parents\": [\n{\n  \"End\": true,\n  \"Name\": \"F\",\n  
\"Path\": \"\",\n  \"QualName\": \"\",\n  \"USR\": 
\"\"\n}\n  ],\n  \"Path\": 
\"GlobalNamespace\",\n  \"ProtectedMembers\": [\n{\n  \"IsStatic\": 
false,\n  \"Name\": \"X\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"PublicFunctions\": [\n{\n  \"InfoType\": \"function\",\n  
\"IsStatic\": false,\n  \"Name\": \"OneFunction\",\n  \"ReturnType\": 
{\n\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  \"Records\": [\n   
 {\n  \"End\": true,\n  \"Name\": \"ChildStruct\",\n  \"Path\": 
\"path/to/A/r\",\n  \"QualName\": \"path::to::A::r::ChildStruct\",\n  
\"USR\": \"\"\n}\n  ],\n  
\"TagType\": \"class\",\n  \"Template\": {\n\"Parameters\": [\n  
\"class T\"\n]\n  },\n  \"USR\": 
\"\",\n  \"VirtualParents\": [\n{\n 
 \"End\": true,\n  \"Name\": \"G\",\n  \"Path\": \"path/to/G\",\n   
   \"QualName\": \"path::to::G::G\",\n  \"USR\": 
\"\"\n}\n  ]\n}"
  Actual.str()
Which is: "{\n  \"Bases\": [\n{\n  \"Access\": \"public\",\n  
\"End\": true,\n  \"HasPublicFunctions\": true,\n  
\"HasPublicMembers\": true,\n  \"InfoType\": \"record\",\n  
\"IsParent\": true,\n  \"IsTypedef\": false,\n  \"IsVirtual\": true,\n  
\"MangledName\": \"\",\n  \"Name\": \"F\",\n  \"Path\": 
\"path/to/F\",\n  \"PublicFunctions\": [\n{\n  
\"InfoType\": \"function\",\n  \"IsStatic\": false,\n  
\"Name\": \"InheritedFunctionOne\",\n  \"ReturnType\": {\n
\"IsBuiltIn\": false,\n\"IsTemplate\": false,\n
\"Name\": \"\",\n\"QualName\": \"\",\n\"USR\": 
\"\"\n  },\n  \"USR\": 
\"\"\n}\n  ],\n  
\"PublicMembers\": [\n{\n  \"IsStatic\": false,\n  
\"Name\": \"N\",\n  \"Type\": \"int\"\n}\n  ],\n  
\"TagType\": \"struct\",\n  \"USR\": 
\"

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add a "Home" link to navbar (PR #171900)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171900

>From 24e5f13b8af15c4a3615e7444cbb4f5b439964b4 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Tue, 9 Dec 2025 09:41:28 -0800
Subject: [PATCH] [clang-doc] Add a "Home" link to navbar

This patch removes the old buttons and adds a link to the homepage.
---
 clang-tools-extra/clang-doc/HTMLGenerator.cpp |  7 +++
 .../clang-doc/assets/navbar-template.mustache |  5 +
 .../clang-doc/basic-project.mustache.test | 20 ---
 3 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/clang-tools-extra/clang-doc/HTMLGenerator.cpp b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
index 3fc89311749ad..1af555a5b772b 100644
--- a/clang-tools-extra/clang-doc/HTMLGenerator.cpp
+++ b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
@@ -120,6 +120,13 @@ Error HTMLGenerator::setupTemplateResources(const ClangDocContext &CDCtx,
     SCA->emplace_back(JsPath);
   }
   V.getAsObject()->insert({"Scripts", ScriptArr});
+  if (RelativeRootPath.empty()) {
+    RelativeRootPath = "";
+  } else {
+    sys::path::append(RelativeRootPath, "/index.html");
+    sys::path::native(RelativeRootPath, sys::path::Style::posix);
+  }
+  V.getAsObject()->insert({"Homepage", RelativeRootPath});
   return Error::success();
 }
 
diff --git a/clang-tools-extra/clang-doc/assets/navbar-template.mustache b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
index 178d147a556d3..2767d5af86668 100644
--- a/clang-tools-extra/clang-doc/assets/navbar-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
@@ -8,10 +8,7 @@
 
 
 
-Namespace
-
-
-Class
+Home
 
 
 
diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
index d406c9f297960..26e42280f3474 100644
--- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test
+++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
@@ -25,10 +25,7 @@ HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
-HTML-SHAPE: Namespace
-HTML-SHAPE: 
-HTML-SHAPE: 
-HTML-SHAPE: Class
+HTML-SHAPE: Home
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
@@ -135,10 +132,7 @@ HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
-HTML-CALC: Namespace
-HTML-CALC: 
-HTML-CALC: 
-HTML-CALC: Class
+HTML-CALC: Home
 HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
@@ -339,10 +333,7 @@ HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
-HTML-RECTANGLE: Namespace
-HTML-RECTANGLE: 
-HTML-RECTANGLE: 
-HTML-RECTANGLE: Class
+HTML-RECTANGLE: Home
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
@@ -457,10 +448,7 @@ HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 
-HTML-CIRCLE: Namespace
-HTML-CIRCLE: 
-HTML-CIRCLE: 
-HTML-CIRCLE: Class
+HTML-CIRCLE: Home
 HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
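
As a rough illustration of the HTMLGenerator.cpp hunk in the patch above, the standalone sketch below mirrors the branch that computes the "Homepage" value. It deliberately uses std::string instead of LLVM's sys::path helpers, and the function name homepageLink is hypothetical, not part of clang-doc.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not clang-doc API: builds the "Home" link target from
// the relative path that leads back to the documentation root.
std::string homepageLink(std::string RelativeRootPath) {
  // Mirrors the patch: an empty relative root is left empty.
  if (RelativeRootPath.empty())
    return RelativeRootPath;
  // Append the index page, avoiding a doubled separator.
  char Last = RelativeRootPath[RelativeRootPath.size() - 1];
  if (Last != '/' && Last != '\\')
    RelativeRootPath += '/';
  RelativeRootPath += "index.html";
  // Normalize to POSIX separators so the result is usable in a URL.
  for (std::string::size_type I = 0; I < RelativeRootPath.size(); ++I)
    if (RelativeRootPath[I] == '\\')
      RelativeRootPath[I] = '/';
  return RelativeRootPath;
}
```

For a page two directories below the root, homepageLink("../..") would yield "../../index.html"; the separator normalization plays the role that sys::path::native with Style::posix plays in the patch.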

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add a "Home" link to navbar (PR #171900)

2025-12-11 Thread Erick Velez via llvm-branch-commits


https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/171900

>From d54ed0de5d44799974e153041878c7de6c228dd1 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Tue, 9 Dec 2025 09:41:28 -0800
Subject: [PATCH] [clang-doc] Add a "Home" link to navbar

This patch removes the old buttons and adds a link to the homepage.
---
 clang-tools-extra/clang-doc/HTMLGenerator.cpp |  7 +++
 .../clang-doc/assets/navbar-template.mustache |  5 +
 .../clang-doc/basic-project.mustache.test | 20 ---
 3 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/clang-tools-extra/clang-doc/HTMLGenerator.cpp b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
index 3fc89311749ad..1af555a5b772b 100644
--- a/clang-tools-extra/clang-doc/HTMLGenerator.cpp
+++ b/clang-tools-extra/clang-doc/HTMLGenerator.cpp
@@ -120,6 +120,13 @@ Error HTMLGenerator::setupTemplateResources(const ClangDocContext &CDCtx,
     SCA->emplace_back(JsPath);
   }
   V.getAsObject()->insert({"Scripts", ScriptArr});
+  if (RelativeRootPath.empty()) {
+    RelativeRootPath = "";
+  } else {
+    sys::path::append(RelativeRootPath, "/index.html");
+    sys::path::native(RelativeRootPath, sys::path::Style::posix);
+  }
+  V.getAsObject()->insert({"Homepage", RelativeRootPath});
   return Error::success();
 }
 
diff --git a/clang-tools-extra/clang-doc/assets/navbar-template.mustache b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
index 178d147a556d3..2767d5af86668 100644
--- a/clang-tools-extra/clang-doc/assets/navbar-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/navbar-template.mustache
@@ -8,10 +8,7 @@
 
 
 
-Namespace
-
-
-Class
+Home
 
 
 
diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
index d406c9f297960..26e42280f3474 100644
--- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test
+++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
@@ -25,10 +25,7 @@ HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
-HTML-SHAPE: Namespace
-HTML-SHAPE: 
-HTML-SHAPE: 
-HTML-SHAPE: Class
+HTML-SHAPE: Home
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
@@ -135,10 +132,7 @@ HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
-HTML-CALC: Namespace
-HTML-CALC: 
-HTML-CALC: 
-HTML-CALC: Class
+HTML-CALC: Home
 HTML-CALC: 
 HTML-CALC: 
 HTML-CALC: 
@@ -339,10 +333,7 @@ HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
-HTML-RECTANGLE: Namespace
-HTML-RECTANGLE: 
-HTML-RECTANGLE: 
-HTML-RECTANGLE: Class
+HTML-RECTANGLE: Home
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
 HTML-RECTANGLE: 
@@ -457,10 +448,7 @@ HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 
-HTML-CIRCLE: Namespace
-HTML-CIRCLE: 
-HTML-CIRCLE: 
-HTML-CIRCLE: Class
+HTML-CIRCLE: Home
 HTML-CIRCLE: 
 HTML-CIRCLE: 
 HTML-CIRCLE: 


[llvm-branch-commits] [llvm] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)

2025-12-11 Thread Peter Collingbourne via llvm-branch-commits



@@ -5875,6 +5895,32 @@ SROA::runOnAlloca(AllocaInst &AI) {
 return {Changed, CFGChanged};
   }
 
+  for (auto &P : AS.partitions()) {
+    std::optional<Value *> ProtectedFieldDisc;
+    // For now, we can't split if a field is accessed both via protected
+    // field and not.
+    for (Slice &S : P) {
+      if (auto *II = dyn_cast<IntrinsicInst>(S.getUse()->getUser()))
+        if (II->getIntrinsicID() == Intrinsic::lifetime_start ||
+            II->getIntrinsicID() == Intrinsic::lifetime_end)
+          continue;
+      if (!ProtectedFieldDisc)
+        ProtectedFieldDisc = S.ProtectedFieldDisc;
+      if (*ProtectedFieldDisc != S.ProtectedFieldDisc)

pcc wrote:

Okay, I added a case `mixed2` where they are both non-null.

https://github.com/llvm/llvm-project/pull/151650
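
The check quoted in this review thread can be sketched in isolation roughly as follows. This is a simplified model under stated assumptions: SliceSketch and hasConsistentDisc are hypothetical names, discriminators are modelled as opaque pointers, and returning false on a mismatch is a guess, since the quoted hunk stops at the comparison.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an SROA slice; only the fields the check needs.
struct SliceSketch {
  const void *ProtectedFieldDisc; // null when the access is not protected
  bool IsLifetimeMarker;          // lifetime.start/end uses are skipped
};

// Returns true when every non-lifetime slice in the partition carries the
// same discriminator (all null also counts), i.e. the partition is not mixed.
bool hasConsistentDisc(const std::vector<SliceSketch> &Partition) {
  bool Seen = false;
  const void *Disc = 0;
  for (std::vector<SliceSketch>::size_type I = 0; I < Partition.size(); ++I) {
    const SliceSketch &S = Partition[I];
    if (S.IsLifetimeMarker)
      continue;
    if (!Seen) {
      // The first real access fixes the expected discriminator.
      Seen = true;
      Disc = S.ProtectedFieldDisc;
    }
    if (Disc != S.ProtectedFieldDisc)
      return false; // protected/unprotected mix, or two different discriminators
  }
  return true;
}
```

Two slices with different non-null discriminators, the situation the `mixed2` case is described as covering, make this sketch return false, as does a protected access paired with an unprotected one.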

[llvm-branch-commits] [llvm] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)

2025-12-11 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/151650

>From a4419c94b0812e3b9d4fea97f9f4fe9b9b10793c Mon Sep 17 00:00:00 2001
From: Peter Collingbourne 
Date: Fri, 5 Dec 2025 15:01:45 -0800
Subject: [PATCH] Address review comments

Created using spr 1.3.6-beta.1
---
 llvm/include/llvm/Analysis/PtrUseVisitor.h | 15 --
 llvm/lib/Analysis/PtrUseVisitor.cpp|  5 +-
 llvm/lib/Transforms/Scalar/SROA.cpp| 55 +++---
 3 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/llvm/include/llvm/Analysis/PtrUseVisitor.h b/llvm/include/llvm/Analysis/PtrUseVisitor.h
index a39f6881f24f3..0858d8aee2186 100644
--- a/llvm/include/llvm/Analysis/PtrUseVisitor.h
+++ b/llvm/include/llvm/Analysis/PtrUseVisitor.h
@@ -134,7 +134,6 @@ class PtrUseVisitorBase {
 
 UseAndIsOffsetKnownPair UseAndIsOffsetKnown;
 APInt Offset;
-Value *ProtectedFieldDisc;
   };
 
   /// The worklist of to-visit uses.
@@ -159,10 +158,6 @@ class PtrUseVisitorBase {
   /// The constant offset of the use if that is known.
   APInt Offset;
 
-  // When this access is via an llvm.protected.field.ptr intrinsic, contains
-  // the second argument to the intrinsic, the discriminator.
-  Value *ProtectedFieldDisc;
-
   /// @}
 
   /// Note that the constructor is protected because this class must be a base
@@ -235,7 +230,6 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
     IntegerType *IntIdxTy = cast<IntegerType>(DL.getIndexType(I.getType()));
     IsOffsetKnown = true;
     Offset = APInt(IntIdxTy->getBitWidth(), 0);
-    ProtectedFieldDisc = nullptr;
     PI.reset();
 
     // Enqueue the uses of this pointer.
@@ -248,7 +242,6 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
       IsOffsetKnown = ToVisit.UseAndIsOffsetKnown.getInt();
       if (IsOffsetKnown)
         Offset = std::move(ToVisit.Offset);
-      ProtectedFieldDisc = ToVisit.ProtectedFieldDisc;
 
       Instruction *I = cast<Instruction>(U->getUser());
       static_cast<DerivedT *>(this)->visit(I);
@@ -307,14 +300,6 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
 case Intrinsic::lifetime_start:
 case Intrinsic::lifetime_end:
   return; // No-op intrinsics.
-
-case Intrinsic::protected_field_ptr: {
-  if (!IsOffsetKnown)
-return Base::visitIntrinsicInst(II);
-  ProtectedFieldDisc = II.getArgOperand(1);
-  enqueueUsers(II);
-  break;
-}
 }
   }
 
diff --git a/llvm/lib/Analysis/PtrUseVisitor.cpp b/llvm/lib/Analysis/PtrUseVisitor.cpp
index 0a79f84196602..9c79546f491ef 100644
--- a/llvm/lib/Analysis/PtrUseVisitor.cpp
+++ b/llvm/lib/Analysis/PtrUseVisitor.cpp
@@ -21,9 +21,8 @@ void detail::PtrUseVisitorBase::enqueueUsers(Value &I) {
   for (Use &U : I.uses()) {
 if (VisitedUses.insert(&U).second) {
   UseToVisit NewU = {
-  UseToVisit::UseAndIsOffsetKnownPair(&U, IsOffsetKnown),
-  Offset,
-  ProtectedFieldDisc,
+UseToVisit::UseAndIsOffsetKnownPair(&U, IsOffsetKnown),
+Offset
   };
   Worklist.push_back(std::move(NewU));
 }
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index 4c5d4a72eebe4..1102699aa04e9 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -648,7 +648,8 @@ class AllocaSlices {
   /// Access the dead users for this alloca.
   ArrayRef<Instruction *> getDeadUsers() const { return DeadUsers; }
 
-  /// Access the PFP users for this alloca.
+  /// Access the users for this alloca that are llvm.protected.field.ptr
+  /// intrinsics.
   ArrayRef getPFPUsers() const { return PFPUsers; }
 
   /// Access Uses that should be dropped if the alloca is promotable.
@@ -1043,6 +1044,10 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor<SliceBuilder> {
   /// Set to de-duplicate dead instructions found in the use walk.
   SmallPtrSet<Instruction *, 4> VisitedDeadInsts;
 
+  // When this access is via an llvm.protected.field.ptr intrinsic, contains
+  // the second argument to the intrinsic, the discriminator.
+  Value *ProtectedFieldDisc = nullptr;
+
 public:
   SliceBuilder(const DataLayout &DL, AllocaInst &AI, AllocaSlices &AS)
       : PtrUseVisitor<SliceBuilder>(DL),
@@ -1289,8 +1294,26 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor<SliceBuilder> {
   return;
 }
 
-    if (II.getIntrinsicID() == Intrinsic::protected_field_ptr)
+    if (II.getIntrinsicID() == Intrinsic::protected_field_ptr) {
+      // We only handle loads and stores as users of llvm.protected.field.ptr.
+      // Other uses may add items to the worklist, which will cause
+      // ProtectedFieldDisc to be tracked incorrectly.
       AS.PFPUsers.push_back(&II);
+      ProtectedFieldDisc = II.getArgOperand(1);
+      for (Use &U : II.uses()) {
+        this->U = &U;
+        if (auto *LI = dyn_cast<LoadInst>(U.getUser()))
+          visitLoadInst(*LI);
+        else if (auto *SI = dyn_cast<StoreInst>(U.getUser()))
+          visitStoreInst(*SI);
+        else
+          PI.setAborted(&II);
+        if (PI.isAborted())
+          break;
+      }
+  ProtectedFi

[llvm-branch-commits] [llvm] backport: [RISCV] Sources of vmerge shouldn't overlap V0 (#170070) (PR #170604)

2025-12-11 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/170604

[llvm-branch-commits] [llvm] release/21.x: [SelectOptimize] Fix incorrect -1 immediate for large integers (#170860) (PR #171596)

2025-12-11 Thread via llvm-branch-commits


dyung wrote:

Was this a regression from LLVM 20.x, or has this issue always existed?

https://github.com/llvm/llvm-project/pull/171596

[llvm-branch-commits] [llvm] backport: [RISCV] Sources of vmerge shouldn't overlap V0 (#170070) (PR #170604)

2025-12-11 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

Let's wait for llvm 22.x. :-)

https://github.com/llvm/llvm-project/pull/170604

[llvm-branch-commits] [llvm] [AMDGPU] DS loop wait relaxation -- more test cases and improvements … (PR #171952)

2025-12-11 Thread via llvm-branch-commits


https://github.com/hidekisaito created 
https://github.com/llvm/llvm-project/pull/171952

…to handle them (4/4)

Add handling for same-iteration use/overwrite of DS load results:
- Track DS load destinations and detect when results are used or overwritten
  within the same iteration
- Compute FloorWaitCount for WMMAs that only use flushed loads

Add bailout for tensor_load_to_lds and LDS DMA writes after barrier.
Add negative test based on profitability criteria.

Assisted-by: Cursor / claude-4.5-opus-high

Depends on https://github.com/llvm/llvm-project/pull/171948
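
To make the "flushed loads" idea in this description concrete, here is a small standalone sketch. It is only a model under stated assumptions: InstrSketch and countPrefetchLoads are hypothetical names, registers are plain integers, and the returned count is this sketch's own simplification rather than the FloorWaitCount computation in SIInsertWaitcnts.cpp.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction model (not MachineInstr).
struct InstrSketch {
  int LoadDst;              // >= 0 if this is a DS load writing that register
  std::vector<int> Touched; // registers this instruction reads or overwrites
};

// Returns how many DS loads are still outstanding at the end of the block:
// those issued after the last load flushed by a same-iteration use/overwrite.
unsigned countPrefetchLoads(const std::vector<InstrSketch> &Block) {
  std::vector<std::pair<int, unsigned> > TrackedLoads; // (register, position)
  unsigned LoadPosition = 0;
  unsigned LastFlushedPosition = 0;
  for (std::vector<InstrSketch>::size_type I = 0; I < Block.size(); ++I) {
    const InstrSketch &MI = Block[I];
    for (std::vector<std::pair<int, unsigned> >::size_type T = 0;
         T < TrackedLoads.size(); ++T) {
      int Reg = TrackedLoads[T].first;
      unsigned Position = TrackedLoads[T].second;
      if (Position <= LastFlushedPosition)
        continue; // already flushed by an earlier wait
      // A later load into the same register counts as an overwrite.
      bool Hits = (MI.LoadDst == Reg);
      for (std::vector<int>::size_type R = 0; R < MI.Touched.size(); ++R)
        if (MI.Touched[R] == Reg)
          Hits = true;
      // The wait before this use/overwrite completes every earlier DS load
      // too, because DS loads complete in order.
      if (Hits)
        LastFlushedPosition = Position;
    }
    if (MI.LoadDst >= 0)
      TrackedLoads.push_back(std::make_pair(MI.LoadDst, ++LoadPosition));
  }
  return LoadPosition - LastFlushedPosition;
}
```

With three loads where only the first result is consumed in the same iteration, the first load is treated as flushed and the remaining two are the candidate prefetch loads.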

>From 238a970d621ed4b0758d8042ec30ed89895c4c3c Mon Sep 17 00:00:00 2001
From: Hideki Saito 
Date: Thu, 11 Dec 2025 21:56:17 -0500
Subject: [PATCH] [AMDGPU] DS loop wait relaxation -- more test cases and
 improvements to handle them (4/4)

Add handling for same-iteration use/overwrite of DS load results:
- Track DS load destinations and detect when results are used or
  overwritten within the same iteration
- Compute FloorWaitCount for WMMAs that only use flushed loads
Add bailout for tensor_load_to_lds and LDS DMA writes after barrier
Add negative test based on profitability criteria

Assisted-by: Cursor / claude-4.5-opus-high
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   |  99 +++-
 .../AMDGPU/waitcnt-loop-ds-opt-eligible.mir   |   2 +-
 .../waitcnt-loop-ds-opt-no-improvement.mir| 109 +
 ...aitcnt-loop-ds-opt-same-iter-overwrite.mir | 111 ++
 .../waitcnt-loop-ds-opt-same-iter-use.mir | 107 +
 .../waitcnt-loop-ds-opt-tensor-load.mir   |  97 +++
 6 files changed, 518 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-no-improvement.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-overwrite.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-use.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-tensor-load.mir

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 28bc57ed2db4e..55c0d72c125af 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -468,6 +468,11 @@ class SIInsertWaitcnts {
 mutable bool RelaxationApplied = false;
 // Pointer to the last barrier in the loop (found during eligibility check)
 const MachineInstr *LastBarrier = nullptr;
+// The wait count "floor" established by same-iteration uses/overwrites.
+// When a DS load result is used in the same iteration, the baseline inserts
+// a wait. This floor indicates the expected counter state after that wait.
+// WMMAs that only use flushed loads can rely on this floor.
+unsigned FloorWaitCount = 0;
   };
 
   // Cache of loop DS wait optimization info, keyed by loop header MBB.
@@ -2775,9 +2780,21 @@ void SIInsertWaitcnts::analyzeSingleBBLoopDSLoads(MachineLoop *ML) {
   // if one exists. LastBarrier was already found during eligibility check.
   // These are likely to be prefetch loads whose results are used in the next
   // iteration.
+  //
+  // If a load result is used or overwritten within the same iteration, the
+  // baseline will insert a wait before that instruction. Since DS loads
+  // complete in FIFO order, that wait also completes all earlier loads. So we
+  // can drop those "flushed" loads from our tracking and only consider
+  // subsequent loads as true prefetch loads. Overwrites also require the load
+  // to complete first to avoid write-after-write races.
   const MachineInstr *LastBarrier = Info.LastBarrier;
 
+  // Single pass: track DS load destinations, handle uses (which flush prior
+  // loads) and detect overwrites (which invalidate our analysis).
+  // TrackedLoads: (Register, Position) pairs for checking uses/overwrites
+  SmallVector<std::pair<Register, unsigned>, 64> TrackedLoads;
   unsigned LoadPosition = 0;
+  unsigned LastFlushedPosition = 0; // Loads up to this position will be flushed
   bool AfterLastBarrier = (LastBarrier == nullptr); // If no barrier, track all
 
   for (const MachineInstr &MI : *MBB) {
@@ -2789,6 +2806,42 @@ void SIInsertWaitcnts::analyzeSingleBBLoopDSLoads(MachineLoop *ML) {
 if (!AfterLastBarrier)
   continue;
 
+// Check for instructions that write to LDS through DMA (global_load_lds,
+// etc). These write to LDS but aren't DS instructions.
+// Bail out if any appear after the barrier.
+if (SIInstrInfo::mayWriteLDSThroughDMA(MI)) {
+  LLVM_DEBUG(
+  dbgs() << "Loop DS Wait Opt: LDS DMA write after last barrier, "
+ << "skipping\n");
+  Info.Valid = false;
+  return;
+}
+
+// Check for tensor_load_to_lds instructions (MIMG, not caught by above)
+if (MI.getOpcode() == AMDGPU::TENSOR_LOAD_TO_LDS ||
+MI.getOpcode() == AMDGPU::TENSOR_LOAD_TO_LDS_D2) {
+  LLVM_DEBUG(dbgs() << "Loop DS Wait Opt: tensor_load_to_lds after la

[llvm-branch-commits] [llvm] [AMDGPU] DS loop wait relaxation -- more test cases and improvements … (PR #171952)

2025-12-11 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: None (hidekisaito)


Changes

…to handle them (4/4)

Add handling for same-iteration use/overwrite of DS load results:
- Track DS load destinations and detect when results are used or overwritten
  within the same iteration
- Compute FloorWaitCount for WMMAs that only use flushed loads

Add bailout for tensor_load_to_lds and LDS DMA writes after barrier.
Add negative test based on profitability criteria.

Assisted-by: Cursor / claude-4.5-opus-high

Depends on https://github.com/llvm/llvm-project/pull/171948

---

Patch is 41.94 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/171952.diff


6 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp (+93-6) 
- (modified) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir (+1-1) 
- (added) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-no-improvement.mir (+109) 
- (added) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-overwrite.mir (+111) 
- (added) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-use.mir (+107) 
- (added) llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-tensor-load.mir (+97) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 28bc57ed2db4e..55c0d72c125af 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -468,6 +468,11 @@ class SIInsertWaitcnts {
 mutable bool RelaxationApplied = false;
 // Pointer to the last barrier in the loop (found during eligibility check)
 const MachineInstr *LastBarrier = nullptr;
+// The wait count "floor" established by same-iteration uses/overwrites.
+// When a DS load result is used in the same iteration, the baseline inserts
+// a wait. This floor indicates the expected counter state after that wait.
+// WMMAs that only use flushed loads can rely on this floor.
+unsigned FloorWaitCount = 0;
   };
 
   // Cache of loop DS wait optimization info, keyed by loop header MBB.
@@ -2775,9 +2780,21 @@ void SIInsertWaitcnts::analyzeSingleBBLoopDSLoads(MachineLoop *ML) {
   // if one exists. LastBarrier was already found during eligibility check.
   // These are likely to be prefetch loads whose results are used in the next
   // iteration.
+  //
+  // If a load result is used or overwritten within the same iteration, the
+  // baseline will insert a wait before that instruction. Since DS loads
+  // complete in FIFO order, that wait also completes all earlier loads. So we
+  // can drop those "flushed" loads from our tracking and only consider
+  // subsequent loads as true prefetch loads. Overwrites also require the load
+  // to complete first to avoid write-after-write races.
   const MachineInstr *LastBarrier = Info.LastBarrier;
 
+  // Single pass: track DS load destinations, handle uses (which flush prior
+  // loads) and detect overwrites (which invalidate our analysis).
+  // TrackedLoads: (Register, Position) pairs for checking uses/overwrites
+  SmallVector<std::pair<Register, unsigned>, 64> TrackedLoads;
   unsigned LoadPosition = 0;
+  unsigned LastFlushedPosition = 0; // Loads up to this position will be flushed
   bool AfterLastBarrier = (LastBarrier == nullptr); // If no barrier, track all
 
   for (const MachineInstr &MI : *MBB) {
@@ -2789,6 +2806,42 @@ void SIInsertWaitcnts::analyzeSingleBBLoopDSLoads(MachineLoop *ML) {
 if (!AfterLastBarrier)
   continue;
 
+// Check for instructions that write to LDS through DMA (global_load_lds,
+// etc). These write to LDS but aren't DS instructions.
+// Bail out if any appear after the barrier.
+if (SIInstrInfo::mayWriteLDSThroughDMA(MI)) {
+  LLVM_DEBUG(
+  dbgs() << "Loop DS Wait Opt: LDS DMA write after last barrier, "
+ << "skipping\n");
+  Info.Valid = false;
+  return;
+}
+
+// Check for tensor_load_to_lds instructions (MIMG, not caught by above)
+if (MI.getOpcode() == AMDGPU::TENSOR_LOAD_TO_LDS ||
+MI.getOpcode() == AMDGPU::TENSOR_LOAD_TO_LDS_D2) {
+  LLVM_DEBUG(dbgs() << "Loop DS Wait Opt: tensor_load_to_lds after last "
+<< "barrier, skipping\n");
+  Info.Valid = false;
+  return;
+}
+
+// Check if this instruction uses or overwrites any tracked DS load
+// destination. If so, baseline will have inserted a wait that flushes
+// all loads up to that position (since DS loads complete in order).
+// Overwrites also require the load to complete first to avoid races.
+for (auto &[Reg, Position] : TrackedLoads) {
+  if (Position <= LastFlushedPosition)
+continue; // Already flushed
+
+  if (MI.readsRegister(Reg, TRI) || MI.modifiesRegister(Reg, TRI)) {
+LLVM_DEBUG(dbgs() << "Loop DS Wait Opt: DS load at position "
+  << Position << " used/overwritten in same iteration, "
+  << "f
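
As a rough illustration of the tracking pass described in the comments above, the sketch below models instructions as plain register lists. `Instr`, `touches`, `unflushedLoads`, and the 1-based load positions are assumptions made for this example, not names from the patch. The diff is cut off before the point where `LastFlushedPosition` is updated, so the update step here is a guess at the intent rather than the patch's code.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction model: registers are plain ints.
struct Instr {
  bool isDSLoad;            // true if this instruction is a DS load
  std::vector<int> reads;   // registers read
  std::vector<int> writes;  // registers written (load destinations for loads)
};

// True if the instruction reads or writes the given register.
static bool touches(const Instr &mi, int reg) {
  return std::find(mi.reads.begin(), mi.reads.end(), reg) != mi.reads.end() ||
         std::find(mi.writes.begin(), mi.writes.end(), reg) != mi.writes.end();
}

// Returns the 1-based positions of DS loads that no later instruction in the
// block forces to complete, i.e. the candidate prefetch loads.
std::vector<unsigned> unflushedLoads(const std::vector<Instr> &block) {
  std::vector<std::pair<int, unsigned>> tracked; // (register, load position)
  unsigned loadPosition = 0;
  unsigned lastFlushed = 0;

  for (const Instr &mi : block) {
    // A use or overwrite of a tracked destination implies a wait, and FIFO
    // completion means that wait also covers every earlier load.
    for (const auto &entry : tracked) {
      if (entry.second <= lastFlushed)
        continue; // already flushed
      if (touches(mi, entry.first))
        lastFlushed = std::max(lastFlushed, entry.second); // assumed update
    }
    if (mi.isDSLoad) {
      ++loadPosition;
      for (int reg : mi.writes)
        tracked.emplace_back(reg, loadPosition);
    }
  }

  std::vector<unsigned> result;
  for (const auto &entry : tracked)
    if (entry.second > lastFlushed &&
        (result.empty() || result.back() != entry.second))
      result.push_back(entry.second);
  return result;
}
```

For a block that loads r1, loads r2, uses r2, and then loads r3, only the third load is left as a prefetch candidate, because the use of r2 also flushes the earlier load of r1.
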

[llvm-branch-commits] [TableGen] Support RegClassByHwMode in CompressPat (PR #171061)

2025-12-11 Thread Alexander Richardson via llvm-branch-commits


arichardson wrote:

Tested using my current draft MC-level support for RVY 
(https://github.com/arichardson/upstream-llvm-project/pull/1)

https://github.com/llvm/llvm-project/pull/171061

[llvm-branch-commits] [NFCI][ELF][AArch64][PAC] Teach addRelativeReloc to emit R_AARCH64_AUTH_RELATIVE (PR #171180)

2025-12-11 Thread Daniil Kovalev via llvm-branch-commits



@@ -704,8 +704,10 @@ static void addRelativeReloc(Ctx &ctx, InputSectionBase &isec,
  uint64_t offsetInSec, Symbol &sym, int64_t addend,
  RelExpr expr, RelType type) {
   Partition &part = isec.getPartition(ctx);
+  bool isAArch64Auth =
+  ctx.arg.emachine == EM_AARCH64 && type == R_AARCH64_AUTH_ABS64;
 
-  if (sym.isTagged()) {
+  if (sym.isTagged() && !isAArch64Auth) {

kovdan01 wrote:

> Is the existing implementation that uses .relr.auth.dyn and/or no offset to 
> the start of the symbol for AUTH_RELATIVE relocations against tagged symbols 
> correct?

@jrtc27 When I initially implemented support for `.relr.auth.dyn`, I did not 
account for tagged symbols. See the initial PR for relr auth support in case 
it helps (things might have changed since then, though): #96496.

For my part, I unfortunately can't tell you for sure whether the existing 
implementation is correct when memtag and pauth are used at the same time, 
since I have never used that combination.

But given the description in the docs and the explanations in the comments 
earlier in this thread (that these two features use different bits and do not 
overlap), I suppose that the following is correct:

> it should be doing the "obvious" composition of the two.

https://github.com/llvm/llvm-project/pull/171180
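
The guard in the hunk above can be read as a small truth table. The sketch below is not lld code: `Machine`, `RelocType`, `SymbolInfo`, and `takesTaggedBranch` are hypothetical stand-ins introduced for illustration, and only the boolean structure mirrors the diff.

```cpp
// Hypothetical stand-ins for lld's machine and relocation identifiers.
enum class Machine { AArch64, Other };
enum class RelocType { AuthAbs64, Abs64 };

// Hypothetical stand-in for the part of Symbol the guard looks at.
struct SymbolInfo {
  bool tagged;
};

// Mirrors `ctx.arg.emachine == EM_AARCH64 && type == R_AARCH64_AUTH_ABS64`.
inline bool isAArch64Auth(Machine m, RelocType t) {
  return m == Machine::AArch64 && t == RelocType::AuthAbs64;
}

// True when the tagged-symbol branch would still be taken after the change,
// i.e. `sym.isTagged() && !isAArch64Auth`.
inline bool takesTaggedBranch(const SymbolInfo &sym, Machine m, RelocType t) {
  return sym.tagged && !isAArch64Auth(m, t);
}
```

Under these assumptions, a tagged symbol with an AArch64 auth relocation skips the tagged-symbol branch, while any other tagged symbol still takes it.
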

[llvm-branch-commits] [TableGen] Support RegClassByHwMode in CompressPat (PR #171061)

2025-12-11 Thread Alexander Richardson via llvm-branch-commits


https://github.com/arichardson updated 
https://github.com/llvm/llvm-project/pull/171061



[llvm-branch-commits] [TableGen] Support RegClassByHwMode in CompressPat (PR #171061)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :window: Windows x64 Test Results

* 128590 tests passed
* 2806 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### LLVM

LLVM.TableGen/RegClassByHwModeCompressPat.td

```
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
c:\_work\llvm-project\llvm-project\build\bin\llvm-tblgen.exe 
--gen-compress-inst-emitter -I 
C:\_work\llvm-project\llvm-project\llvm\test\TableGen/../../include -I 
C:\_work\llvm-project\llvm-project\llvm\test\TableGen 
C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td
 -o - | c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe 
C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td
# executed command: 
'c:\_work\llvm-project\llvm-project\build\bin\llvm-tblgen.exe' 
--gen-compress-inst-emitter -I 
'C:\_work\llvm-project\llvm-project\llvm\test\TableGen/../../include' -I 
'C:\_work\llvm-project\llvm-project\llvm\test\TableGen' 
'C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td'
 -o -
# note: command had no output on stdout or stderr
# executed command: 
'c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe' 
'C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td'
# .---command stderr
# | C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td:94:16: error: CHECK-NEXT: expected string not found in input
# | // CHECK-NEXT: MyTargetMCRegisterClasses[STI.getInstrInfo().getOpRegClassID(MI.getDesc().operands()[1])].contains(MI.getOperand(1).getReg())) {
# |^
# | <stdin>:22:29: note: scanning from here
# |  MI.getOperand(1).isReg() &&
# | ^
# | <stdin>:63:2: note: possible intended match here
# |  MyTargetMCRegisterClasses[MyTarget::XRegsRegClassID].contains(MI.getOperand(1).getReg())) {
# |  ^
# | 
# | Input file: <stdin>
# | Check file: C:\_work\llvm-project\llvm-project\llvm\test\TableGen\RegClassByHwModeCompressPat.td
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<
# |.
# |.
# |.
# |   17:  [[maybe_unused]] unsigned HwModeId = 
STI.getHwMode(MCSubtargetInfo::HwMode_RegInfo); switch (MI.getOpcode()) { 
# |   18:  default: return false; 
# |   19:  case MyTarget::PTR_MOV: { 
# |   20:  if (MI.getOperand(1).isReg() && MI.getOperand(0).isReg() && 
# |   21:  (MI.getOperand(1).getReg() == MI.getOperand(0).getReg()) && 
# |   22:  MI.getOperand(1).isReg() && 
# | next:94'0 X error: no match found
# |   23:  
MyTargetMCRegisterClasses[MyTargetRegClassByHwModeTables[HwModeId][MyTarget::PtrRC]].contains(MI.getOperand(1).getReg()))
 { 
# | next:94'0 
~
# |   24:  // ptr_mov.tied $dst, $src 
# | next:94'0 
# |   25:  OutInst.setOpcode(MyTarget::PTR_MOV_TIED); 
# | next:94'0 
# |   26:  // Operand: dst 
# | next:94'0 ~
# |   27:  OutInst.addOperand(MI.getOperand(1)); 
# | next:94'0 ~~~
# |.
# |.
# |.
# |   58:  return true; 
# | next:94'0 ~~
# |   59:  } // if 
# | next:94'0 ~
# |   60:  if (MI.getOperand(1).isReg() && MI.getOperand(0).isReg() && 
# | next:94'0 ~
# |   61:  (MI.getOperand(1).getReg() == MI.getOperand(0).getReg()) && 
# | next:94'0 ~
# |   62:  MI.getOperand(1).isReg() && 
# | next:94'0 ~
# |   63:  
MyTargetMCRegisterClasses[MyTarget::XRegsRegClassID].contains(MI.getOperand(1).getReg()))
 { 
# | next:94'0 
~
# | next:94'1  ?
possible intended match
# |   64:  // x_mov.tied $dst, $src 
# | next:94'0 ~~
# |   65:  OutInst.setOpcode(MyTarget::X_MOV_TIED); 
# | next:94'0 ~~
# |   66:  // Operand: dst 
# | next:94'0 ~
# |   67:  OutInst.addOperand(MI.getOperand(1)); 
# | next:94'0 ~~~
# |   68:  // Operand: src 
# | next:94'0 ~
# |.
# |.
# |.
# | >>
# `-
# error: command failed with exit status: 1

--

```


If these failures are u

[llvm-branch-commits] [TableGen] Support RegClassByHwMode in CompressPat (PR #171061)

2025-12-11 Thread via llvm-branch-commits


github-actions[bot] wrote:


# :penguin: Linux x64 Test Results

* 167083 tests passed
* 2938 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### LLVM

LLVM.TableGen/RegClassByHwModeCompressPat.td

```
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llvm-tblgen 
--gen-compress-inst-emitter -I 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/../../include
 -I /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td
 -o - | 
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td
# executed command: 
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llvm-tblgen 
--gen-compress-inst-emitter -I 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/../../include
 -I /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td
 -o -
# note: command had no output on stdout or stderr
# executed command: 
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck 
/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td
# .---command stderr
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td:94:16: error: CHECK-NEXT: expected string not found in input
# | // CHECK-NEXT: MyTargetMCRegisterClasses[STI.getInstrInfo().getOpRegClassID(MI.getDesc().operands()[1])].contains(MI.getOperand(1).getReg())) {
# |^
# | <stdin>:22:29: note: scanning from here
# |  MI.getOperand(1).isReg() &&
# | ^
# | <stdin>:63:2: note: possible intended match here
# |  MyTargetMCRegisterClasses[MyTarget::XRegsRegClassID].contains(MI.getOperand(1).getReg())) {
# |  ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/TableGen/RegClassByHwModeCompressPat.td
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<
# |.
# |.
# |.
# |   17:  [[maybe_unused]] unsigned HwModeId = 
STI.getHwMode(MCSubtargetInfo::HwMode_RegInfo); switch (MI.getOpcode()) { 
# |   18:  default: return false; 
# |   19:  case MyTarget::PTR_MOV: { 
# |   20:  if (MI.getOperand(1).isReg() && MI.getOperand(0).isReg() && 
# |   21:  (MI.getOperand(1).getReg() == MI.getOperand(0).getReg()) && 
# |   22:  MI.getOperand(1).isReg() && 
# | next:94'0 X error: no match found
# |   23:  
MyTargetMCRegisterClasses[MyTargetRegClassByHwModeTables[HwModeId][MyTarget::PtrRC]].contains(MI.getOperand(1).getReg()))
 { 
# | next:94'0 
~
# |   24:  // ptr_mov.tied $dst, $src 
# | next:94'0 
# |   25:  OutInst.setOpcode(MyTarget::PTR_MOV_TIED); 
# | next:94'0 
# |   26:  // Operand: dst 
# | next:94'0 ~
# |   27:  OutInst.addOperand(MI.getOperand(1)); 
# | next:94'0 ~~~
# |.
# |.
# |.
# |   58:  return true; 
# | next:94'0 ~~
# |   59:  } // if 
# | next:94'0 ~
# |   60:  if (MI.getOperand(1).isReg() && MI.getOperand(0).isReg() && 
# | next:94'0 ~
# |   61:  (MI.getOperand(1).getReg() == MI.getOperand(0).getReg()) && 
# | next:94'0 ~
# |   62:  MI.getOperand(1).isReg() && 
# | next:94'0 ~
# |   63:  
MyTargetMCRegisterClasses[MyTarget::XRegsRegClassID].contains(MI.getOperand(1).getReg()))
 { 
# | next:94'0 
~
# | next:94'1  ?
possible intended match
# |   64:  // x_mov.tied $dst, $src 
# | next:94'0 ~~
# |   65:  OutInst.setOpcode(MyTarget::X_MOV_TIED); 
# | next:94'0 ~~
# |   66:  // Operand: dst 
# | next:94'0 ~
# |   67:  OutInst.addOperand(MI.getOperand(1)); 
# | next:94'0 ~

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add functions to namespace template (PR #171938)

2025-12-11 Thread Petr Hosek via llvm-branch-commits


https://github.com/petrhosek approved this pull request.


https://github.com/llvm/llvm-project/pull/171938

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add class template to HTML (PR #171937)

2025-12-11 Thread Petr Hosek via llvm-branch-commits


https://github.com/petrhosek approved this pull request.


https://github.com/llvm/llvm-project/pull/171937
