[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Thank you for those improved comments and the test case, that makes a lot more 
sense to me now!

One thing I'm still wondering about before resolving this review thread, is if 
you have any thoughts about whether any instruction patterns that guarantee the 
program will terminate on an authentication fault might not be detected by 
either of the 2 versions of `getAuthCheckedReg`?

It makes sense to update the documentation in a separate PR, but it might be 
useful to record the answer to the above question in a review comment to later 
help write up in the documentation what classes of false positive/negatives the 
user may still expect to see?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

atrosinenko wrote:

I expect all the check methods listed in 
[`AArch64PointerAuth.h`](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64PointerAuth.h)
 to be detected by one of these two methods. On the other hand, the 
`getAuthCheckedReg(BB)` method has a closed list of hard-coded patterns to 
recognize - if some other compilers or some other versions of LLVM would use 
other instruction sequences, it should be possible to update this method as 
appropriate. For example, LLVM used to emit an "inverted" variant of the checks 
in the past: the fall-through basic block was executed on success and on 
failure a jump was performed to a basic block containing a BRK instruction. Not 
sure if these variants should be supported, as later the mainline backend was 
aligned with the original variants of checks emitted by the prototype from the 
Ahmed Bougacha's branch (these are the patterns detected in this PR), but if I 
would have to support the inverted variants, the only change needed would 
probably be accepting TBNZ in addition to TBZ and B.NE in addition to B.EQ.

Considering the `getAuthCheckedReg(Inst, bool)` method, it could obviously be 
improved by handling store instructions (though, I doubt it would change 
anything for `pauthtest` ABI). False positives (not recognizing some load 
instruction as "checking") are technically possible, but I would expect them to 
be due to the existing implementation of `AArch64MCPlusBuilder::mayLoad()` 
which hardcodes a number of opcodes.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] Backport to 20.x "[clang][analyzer] Fix error path of builtin overflow (#136345)" (PR #136589)

2025-04-22 Thread Pavel Skripkin via llvm-branch-commits

pskrgag wrote:

> Then issue the "git cherry-pick -x HASH" command and force push to your 
> branch.

I've done that locally (+ fixed merge conflicts) and commit message is the same 
as in original patch. I've only changed the PR title

https://github.com/llvm/llvm-project/pull/136589
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lldb] [lldb] Fix SBTarget::ReadInstruction with flavor (PR #136034)

2025-04-22 Thread Ebuka Ezike via llvm-branch-commits

da-viper wrote:

@JDevlieghere  I do not have the right permission to merge this to the 20.X 
branch as it is protected. 

https://github.com/llvm/llvm-project/pull/136034
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)

2025-04-22 Thread Paschalis Mpeis via llvm-branch-commits

paschalis-mpeis wrote:

Thanks a lot both! In case there's some delay in resolving this edge case, may 
I suggest temporarily disabling this test on AArch64 until a more consistent 
workaround is in place?

https://github.com/llvm/llvm-project/pull/135867
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpMemoryOrderType enumeration in FAIL clause (PR #136313)

2025-04-22 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/136313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpMemoryOrderType enumeration in FAIL clause (PR #136313)

2025-04-22 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

LGTM, thanks

https://github.com/llvm/llvm-project/pull/136313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpMemoryOrderType enumeration in FAIL clause (PR #136313)

2025-04-22 Thread Tom Eccles via llvm-branch-commits


@@ -4242,9 +4242,8 @@ struct OmpDeviceTypeClause {
 // OMP 5.2 15.8.3 extended-atomic, fail-clause ->
 //FAIL(memory-order)
 struct OmpFailClause {
-  WRAPPER_CLASS_BOILERPLATE(
-  OmpFailClause, common::Indirection);
-  CharBlock source;

tblah wrote:

Why did you remove `source`?

https://github.com/llvm/llvm-project/pull/136313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Handle scalarized reductions in getArithmeticReductionCost (PR #136688)

2025-04-22 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/136688

Backport 053451cb3502144564b4d0b30a9046045d1820d4

Requested by: @hvdijk

>From e8ffc64c5c9ce8b10218f2dac9b801e94b3a577d Mon Sep 17 00:00:00 2001
From: Luke Lau 
Date: Mon, 21 Apr 2025 16:04:25 +0800
Subject: [PATCH] [RISCV] Handle scalarized reductions in
 getArithmeticReductionCost

This fixes a crash reported at
https://github.com/llvm/llvm-project/pull/114250#issuecomment-2813686061

If the vector type isn't legal at all, e.g. bfloat with +zvfbfmin,
then the legalized type will be scalarized. So use getScalarType()
instead of getVectorElement() when checking for f16/bf16.

(cherry picked from commit 053451cb3502144564b4d0b30a9046045d1820d4)
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |   5 +-
 .../Analysis/CostModel/RISCV/reduce-fadd.ll   | 167 ++
 2 files changed, 137 insertions(+), 35 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index add82dc80c429..8f1094413a756 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1658,9 +1658,8 @@ RISCVTTIImpl::getArithmeticReductionCost(unsigned Opcode, 
VectorType *Ty,
 break;
   case ISD::FADD:
 // We can't promote f16/bf16 fadd reductions.
-if ((LT.second.getVectorElementType() == MVT::f16 &&
- !ST->hasVInstructionsF16()) ||
-LT.second.getVectorElementType() == MVT::bf16)
+if ((LT.second.getScalarType() == MVT::f16 && !ST->hasVInstructionsF16()) 
||
+LT.second.getScalarType() == MVT::bf16)
   return BaseT::getArithmeticReductionCost(Opcode, Ty, FMF, CostKind);
 if (TTI::requiresOrderedReduction(FMF)) {
   Opcodes.push_back(RISCV::VFMV_S_F);
diff --git a/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll 
b/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
index 1762f701a9b2d..71685b4acc822 100644
--- a/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
@@ -1,25 +1,60 @@
 ; NOTE: Assertions have been autogenerated by 
utils/update_analyze_test_checks.py
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfh,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=throughput 2>&1 -disable-output | 
FileCheck %s --check-prefixes=FP-REDUCE,FP-REDUCE-ZVFH
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfhmin,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=throughput 2>&1 -disable-output | 
FileCheck %s --check-prefixes=FP-REDUCE,FP-REDUCE-ZVFHMIN
+; RUN: opt < %s -mtriple=riscv64 -mattr=+v -passes="print" 
-cost-kind=throughput 2>&1 -disable-output | FileCheck %s 
--check-prefixes=FP-REDUCE,FP-REDUCE-NO-ZFHMIN-NO-ZFBFMIN
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfh,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=code-size 2>&1 -disable-output | 
FileCheck %s  --check-prefix=SIZE
 
 define void @reduce_fadd_bfloat() {
-; FP-REDUCE-LABEL: 'reduce_fadd_bfloat'
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: 
%V1 = call fast bfloat @llvm.vector.reduce.fadd.v1bf16(bfloat 0xR, <1 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: 
%V2 = call fast bfloat @llvm.vector.reduce.fadd.v2bf16(bfloat 0xR, <2 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: 
%V4 = call fast bfloat @llvm.vector.reduce.fadd.v4bf16(bfloat 0xR, <4 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: 
%V8 = call fast bfloat @llvm.vector.reduce.fadd.v8bf16(bfloat 0xR, <8 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 73 for instruction: 
%V16 = call fast bfloat @llvm.vector.reduce.fadd.v16bf16(bfloat 0xR, <16 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 211 for instruction: 
%v32 = call fast bfloat @llvm.vector.reduce.fadd.v32bf16(bfloat 0xR, <32 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 541 for instruction: 
%V64 = call fast bfloat @llvm.vector.reduce.fadd.v64bf16(bfloat 0xR, <64 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 573 for instruction: 
%V128 = call fast bfloat @llvm.vector.reduce.fadd.v128bf16(bfloat 0xR, <128 
x bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV1 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv1bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV2 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv2bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV4 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv4bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV8 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv8bf16(bfloat 0xR,  
unde

[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits


@@ -253,38 +253,38 @@ define i64 @not_dotp_i8_to_i64_has_neon_dotprod(ptr 
readonly %a, ptr readonly %b
 ; CHECK-MAXBW-SAME: ptr readonly [[A:%.*]], ptr readonly [[B:%.*]]) 
#[[ATTR1:[0-9]+]] {
 ; CHECK-MAXBW-NEXT:  entry:
 ; CHECK-MAXBW-NEXT:[[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP1:%.*]] = mul i64 [[TMP0]], 8
+; CHECK-MAXBW-NEXT:[[TMP1:%.*]] = mul i64 [[TMP0]], 16
 ; CHECK-MAXBW-NEXT:br i1 false, label [[SCALAR_PH:%.*]], label 
[[VECTOR_PH:%.*]]
 ; CHECK-MAXBW:   vector.ph:
 ; CHECK-MAXBW-NEXT:[[TMP2:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP3:%.*]] = mul i64 [[TMP2]], 8
+; CHECK-MAXBW-NEXT:[[TMP3:%.*]] = mul i64 [[TMP2]], 16
 ; CHECK-MAXBW-NEXT:[[N_MOD_VF:%.*]] = urem i64 1024, [[TMP3]]
 ; CHECK-MAXBW-NEXT:[[N_VEC:%.*]] = sub i64 1024, [[N_MOD_VF]]
 ; CHECK-MAXBW-NEXT:[[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP5:%.*]] = mul i64 [[TMP4]], 8
+; CHECK-MAXBW-NEXT:[[TMP5:%.*]] = mul i64 [[TMP4]], 16
 ; CHECK-MAXBW-NEXT:[[TMP6:%.*]] = getelementptr i8, ptr [[A]], i64 
[[N_VEC]]
 ; CHECK-MAXBW-NEXT:[[TMP7:%.*]] = getelementptr i8, ptr [[B]], i64 
[[N_VEC]]
 ; CHECK-MAXBW-NEXT:br label [[VECTOR_BODY:%.*]]
 ; CHECK-MAXBW:   vector.body:
 ; CHECK-MAXBW-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ 
[[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-MAXBW-NEXT:[[VEC_PHI:%.*]] = phi  [ 
zeroinitializer, [[VECTOR_PH]] ], [ [[TMP15:%.*]], [[VECTOR_BODY]] ]
+; CHECK-MAXBW-NEXT:[[VEC_PHI:%.*]] = phi  [ 
zeroinitializer, [[VECTOR_PH]] ], [ [[PARTIAL_REDUCE:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-MAXBW-NEXT:[[TMP8:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:[[NEXT_GEP:%.*]] = getelementptr i8, ptr [[A]], i64 
[[TMP8]]
 ; CHECK-MAXBW-NEXT:[[TMP9:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:[[NEXT_GEP1:%.*]] = getelementptr i8, ptr [[B]], i64 
[[TMP9]]
 ; CHECK-MAXBW-NEXT:[[TMP10:%.*]] = getelementptr i8, ptr [[NEXT_GEP]], i32 0
-; CHECK-MAXBW-NEXT:[[WIDE_LOAD:%.*]] = load , ptr 
[[TMP10]], align 1
-; CHECK-MAXBW-NEXT:[[TMP11:%.*]] = zext  [[WIDE_LOAD]] to 

+; CHECK-MAXBW-NEXT:[[WIDE_LOAD:%.*]] = load , ptr 
[[TMP10]], align 1
 ; CHECK-MAXBW-NEXT:[[TMP12:%.*]] = getelementptr i8, ptr [[NEXT_GEP1]], 
i32 0
-; CHECK-MAXBW-NEXT:[[WIDE_LOAD2:%.*]] = load , ptr 
[[TMP12]], align 1
-; CHECK-MAXBW-NEXT:[[TMP13:%.*]] = zext  [[WIDE_LOAD2]] 
to 
-; CHECK-MAXBW-NEXT:[[TMP14:%.*]] = mul nuw nsw  
[[TMP13]], [[TMP11]]
-; CHECK-MAXBW-NEXT:[[TMP15]] = add  [[TMP14]], 
[[VEC_PHI]]
+; CHECK-MAXBW-NEXT:[[WIDE_LOAD2:%.*]] = load , ptr 
[[TMP12]], align 1
+; CHECK-MAXBW-NEXT:[[TMP15:%.*]] = zext  [[WIDE_LOAD2]] 
to 
+; CHECK-MAXBW-NEXT:[[TMP13:%.*]] = zext  [[WIDE_LOAD]] 
to 
+; CHECK-MAXBW-NEXT:[[TMP14:%.*]] = mul nuw nsw  
[[TMP15]], [[TMP13]]
+; CHECK-MAXBW-NEXT:[[PARTIAL_REDUCE]] = call  
@llvm.experimental.vector.partial.reduce.add.nxv2i64.nxv16i64( [[VEC_PHI]],  [[TMP14]])

MacDue wrote:

This test is called "not_dotp" but now looks like it's dotp 
:slightly_smiling_face: IIRC this won't map directly a dot product instruction 
(as `nxv16i64` to `nxv2i64` is not supported at the moment). 

https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Handle scalarized reductions in getArithmeticReductionCost (PR #136688)

2025-04-22 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/136688
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits


@@ -2376,6 +2327,59 @@ class VPReductionRecipe : public VPRecipeWithIRFlags {
   }
 };
 
+/// A recipe for forming partial reductions. In the loop, an accumulator and
+/// vector operand are added together and passed to the next iteration as the
+/// next accumulator. After the loop body, the accumulator is reduced to a
+/// scalar value.
+class VPPartialReductionRecipe : public VPReductionRecipe {

MacDue wrote:

Should the `classof` for `VPReductionRecipe` now include 
`VPPartialReductionRecipe`? 

https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue edited 
https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Handle scalarized reductions in getArithmeticReductionCost (PR #136688)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)


Changes

Backport 053451cb3502144564b4d0b30a9046045d1820d4

Requested by: @hvdijk

---

Patch is 29.03 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/136688.diff


2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+2-3) 
- (modified) llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll (+135-32) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index add82dc80c429..8f1094413a756 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1658,9 +1658,8 @@ RISCVTTIImpl::getArithmeticReductionCost(unsigned Opcode, 
VectorType *Ty,
 break;
   case ISD::FADD:
 // We can't promote f16/bf16 fadd reductions.
-if ((LT.second.getVectorElementType() == MVT::f16 &&
- !ST->hasVInstructionsF16()) ||
-LT.second.getVectorElementType() == MVT::bf16)
+if ((LT.second.getScalarType() == MVT::f16 && !ST->hasVInstructionsF16()) 
||
+LT.second.getScalarType() == MVT::bf16)
   return BaseT::getArithmeticReductionCost(Opcode, Ty, FMF, CostKind);
 if (TTI::requiresOrderedReduction(FMF)) {
   Opcodes.push_back(RISCV::VFMV_S_F);
diff --git a/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll 
b/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
index 1762f701a9b2d..71685b4acc822 100644
--- a/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/reduce-fadd.ll
@@ -1,25 +1,60 @@
 ; NOTE: Assertions have been autogenerated by 
utils/update_analyze_test_checks.py
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfh,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=throughput 2>&1 -disable-output | 
FileCheck %s --check-prefixes=FP-REDUCE,FP-REDUCE-ZVFH
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfhmin,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=throughput 2>&1 -disable-output | 
FileCheck %s --check-prefixes=FP-REDUCE,FP-REDUCE-ZVFHMIN
+; RUN: opt < %s -mtriple=riscv64 -mattr=+v -passes="print" 
-cost-kind=throughput 2>&1 -disable-output | FileCheck %s 
--check-prefixes=FP-REDUCE,FP-REDUCE-NO-ZFHMIN-NO-ZFBFMIN
 ; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zfh,+zvfh,+zfbfmin,+zvfbfmin 
-passes="print" -cost-kind=code-size 2>&1 -disable-output | 
FileCheck %s  --check-prefix=SIZE
 
 define void @reduce_fadd_bfloat() {
-; FP-REDUCE-LABEL: 'reduce_fadd_bfloat'
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: 
%V1 = call fast bfloat @llvm.vector.reduce.fadd.v1bf16(bfloat 0xR, <1 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: 
%V2 = call fast bfloat @llvm.vector.reduce.fadd.v2bf16(bfloat 0xR, <2 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: 
%V4 = call fast bfloat @llvm.vector.reduce.fadd.v4bf16(bfloat 0xR, <4 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 28 for instruction: 
%V8 = call fast bfloat @llvm.vector.reduce.fadd.v8bf16(bfloat 0xR, <8 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 73 for instruction: 
%V16 = call fast bfloat @llvm.vector.reduce.fadd.v16bf16(bfloat 0xR, <16 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 211 for instruction: 
%v32 = call fast bfloat @llvm.vector.reduce.fadd.v32bf16(bfloat 0xR, <32 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 541 for instruction: 
%V64 = call fast bfloat @llvm.vector.reduce.fadd.v64bf16(bfloat 0xR, <64 x 
bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 573 for instruction: 
%V128 = call fast bfloat @llvm.vector.reduce.fadd.v128bf16(bfloat 0xR, <128 
x bfloat> undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV1 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv1bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV2 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv2bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV4 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv4bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV8 = call fast 
bfloat @llvm.vector.reduce.fadd.nxv8bf16(bfloat 0xR,  
undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV16 = call 
fast bfloat @llvm.vector.reduce.fadd.nxv16bf16(bfloat 0xR,  undef)
-; FP-REDUCE-NEXT:  Cost Model: Invalid cost for instruction: %NXV32 = call 
fast bfloat @llvm.vector.reduce.fadd.nxv32bf16(bfloat 0xR,  undef)
-; FP-REDUCE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: 
ret void
+; FP-REDUCE-ZVFH-LABEL: 'reduce_fadd_bfloat'
+; FP-REDUCE-ZVFH-NE

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad approved this pull request.

Patch looks OK to me, unless you are still worried about the global_inv loadcnt 
decrement ordering thing.

Removing unnecessary waits at a function call boundary can be done as a 
separate optimization.

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad edited 
https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits


@@ -19,7 +19,7 @@ body: |
 ; GFX12-NEXT: {{  $}}
 ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable 
$sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1)
 ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec
-; GFX12-NEXT: S_WAIT_LOADCNT 0
+; GFX12-NEXT: S_WAIT_LOADCNT 1

jayfoad wrote:

Normally loadcnt is incremented and decremented in order within a wave. Are you 
saying global_inv might be an exception to that? Can you get confirmation one 
way or the other?

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits


@@ -2130,13 +2140,14 @@ void 
SIInsertWaitcnts::updateEventWaitcntAfter(MachineInstr &Inst,
   ScoreBrackets->updateByEvent(TII, TRI, MRI, LDS_ACCESS, Inst);
 }
   } else if (TII->isFLAT(Inst)) {
-// TODO: Track this properly.
-if (isCacheInvOrWBInst(Inst))
+if (isGFX12CacheInvOrWBInst(Inst)) {
+  ScoreBrackets->updateByEvent(TII, TRI, MRI, getVmemWaitEventType(Inst),
+   Inst);
   return;
-
-assert(Inst.mayLoadOrStore());
+}
 
 int FlatASCount = 0;
+assert(Inst.mayLoadOrStore());

jayfoad wrote:

Nit: why move this?

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Pierre van Houtryve via llvm-branch-commits


@@ -19,7 +19,7 @@ body: |
 ; GFX12-NEXT: {{  $}}
 ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable 
$sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1)
 ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec
-; GFX12-NEXT: S_WAIT_LOADCNT 0
+; GFX12-NEXT: S_WAIT_LOADCNT 1

Pierre-vh wrote:

Loads and stores are executed in order, but the data return & counter decrement 
can be done out of order.

Though, this doesn't look like an issue specific to this patch at all. The INV 
is correctly marked as a load (VMEM_READ) so this may be a separate issue, or 
maybe there is a reason why we can assume things return in order.

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [CIR] Let ConstantOp builder infer its type automatically (PR #136606)

2025-04-22 Thread Henrich Lauko via llvm-branch-commits

https://github.com/xlauko updated 
https://github.com/llvm/llvm-project/pull/136606

>From 166bf478dde94b6dac144a43a7ececa6cbcb8223 Mon Sep 17 00:00:00 2001
From: xlauko 
Date: Mon, 21 Apr 2025 22:23:40 +0200
Subject: [PATCH] [CIR] Let ConstantOp builder infer its type automatically

---
 clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h | 4 ++--
 clang/include/clang/CIR/Dialect/IR/CIROps.td | 9 -
 clang/lib/CIR/CodeGen/CIRGenBuilder.h| 2 +-
 clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp   | 4 ++--
 clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp   | 2 +-
 5 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h 
b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
index b303aa07838ee..0385b4f476c3b 100644
--- a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+++ b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
@@ -61,11 +61,11 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
 
   mlir::Value getConstAPInt(mlir::Location loc, mlir::Type typ,
 const llvm::APInt &val) {
-return create(loc, typ, getAttr(typ, val));
+return create(loc, getAttr(typ, val));
   }
 
   cir::ConstantOp getConstant(mlir::Location loc, mlir::TypedAttr attr) {
-return create(loc, attr.getType(), attr);
+return create(loc, attr);
   }
 
   cir::ConstantOp getConstantInt(mlir::Location loc, mlir::Type ty,
diff --git a/clang/include/clang/CIR/Dialect/IR/CIROps.td 
b/clang/include/clang/CIR/Dialect/IR/CIROps.td
index b526d077a910c..577cb8db41f57 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIROps.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -288,18 +288,9 @@ def ConstantOp : CIR_Op<"const",
 ```
   }];
 
-  // The constant operation takes an attribute as the only input.
   let arguments = (ins TypedAttrInterface:$value);
-
-  // The constant operation returns a single value of CIR_AnyType.
   let results = (outs CIR_AnyType:$res);
 
-  let builders = [
-OpBuilder<(ins "cir::BoolAttr":$value), [{
-  build($_builder, $_state, value.getType(), value);
-}]>
-  ];
-
   let assemblyFormat = "attr-dict $value";
 
   let hasVerifier = 1;
diff --git a/clang/lib/CIR/CodeGen/CIRGenBuilder.h 
b/clang/lib/CIR/CodeGen/CIRGenBuilder.h
index 4f7ff5128d914..7d9988cc52c5a 100644
--- a/clang/lib/CIR/CodeGen/CIRGenBuilder.h
+++ b/clang/lib/CIR/CodeGen/CIRGenBuilder.h
@@ -185,7 +185,7 @@ class CIRGenBuilderTy : public cir::CIRBaseBuilderTy {
   // Creates constant nullptr for pointer type ty.
   cir::ConstantOp getNullPtr(mlir::Type ty, mlir::Location loc) {
 assert(!cir::MissingFeatures::targetCodeGenInfoGetNullPointer());
-return create(loc, ty, getConstPtrAttr(ty, 0));
+return create(loc, getConstPtrAttr(ty, 0));
   }
 
   mlir::Value createNeg(mlir::Value value) {
diff --git a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp 
b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
index f1561d1b26fc0..1e69ecae831e9 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
@@ -138,7 +138,7 @@ class ScalarExprEmitter : public 
StmtVisitor {
   mlir::Value VisitIntegerLiteral(const IntegerLiteral *e) {
 mlir::Type type = cgf.convertType(e->getType());
 return builder.create(
-cgf.getLoc(e->getExprLoc()), type,
+cgf.getLoc(e->getExprLoc()),
 builder.getAttr(type, e->getValue()));
   }
 
@@ -147,7 +147,7 @@ class ScalarExprEmitter : public 
StmtVisitor {
 assert(mlir::isa(type) &&
"expect floating-point type");
 return builder.create(
-cgf.getLoc(e->getExprLoc()), type,
+cgf.getLoc(e->getExprLoc()),
 builder.getAttr(type, e->getValue()));
   }
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp 
b/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
index fe8a5e7428a81..20b086ffdd850 100644
--- a/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
@@ -34,7 +34,7 @@ llvm::SmallVector 
cir::AllocaOp::getPromotableSlots() {
 
 Value cir::AllocaOp::getDefaultValue(const MemorySlot &slot,
  OpBuilder &builder) {
-  return builder.create(getLoc(), slot.elemType,
+  return builder.create(getLoc(),
  cir::UndefAttr::get(slot.elemType));
 }
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [CIR] Let ConstantOp builder infer its type automatically (PR #136606)

2025-04-22 Thread Henrich Lauko via llvm-branch-commits

https://github.com/xlauko updated 
https://github.com/llvm/llvm-project/pull/136606

>From 7a8311eddc12d1119338261721de7561297037cb Mon Sep 17 00:00:00 2001
From: xlauko 
Date: Mon, 21 Apr 2025 22:23:40 +0200
Subject: [PATCH] [CIR] Let ConstantOp builder infer its type automatically

---
 clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h | 4 ++--
 clang/include/clang/CIR/Dialect/IR/CIROps.td | 9 -
 clang/lib/CIR/CodeGen/CIRGenBuilder.h| 2 +-
 clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp   | 4 ++--
 clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp   | 2 +-
 5 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h 
b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
index b303aa07838ee..0385b4f476c3b 100644
--- a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+++ b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
@@ -61,11 +61,11 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
 
   mlir::Value getConstAPInt(mlir::Location loc, mlir::Type typ,
 const llvm::APInt &val) {
-return create(loc, typ, getAttr(typ, val));
+return create(loc, getAttr(typ, val));
   }
 
   cir::ConstantOp getConstant(mlir::Location loc, mlir::TypedAttr attr) {
-return create(loc, attr.getType(), attr);
+return create(loc, attr);
   }
 
   cir::ConstantOp getConstantInt(mlir::Location loc, mlir::Type ty,
diff --git a/clang/include/clang/CIR/Dialect/IR/CIROps.td 
b/clang/include/clang/CIR/Dialect/IR/CIROps.td
index b526d077a910c..577cb8db41f57 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIROps.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -288,18 +288,9 @@ def ConstantOp : CIR_Op<"const",
 ```
   }];
 
-  // The constant operation takes an attribute as the only input.
   let arguments = (ins TypedAttrInterface:$value);
-
-  // The constant operation returns a single value of CIR_AnyType.
   let results = (outs CIR_AnyType:$res);
 
-  let builders = [
-OpBuilder<(ins "cir::BoolAttr":$value), [{
-  build($_builder, $_state, value.getType(), value);
-}]>
-  ];
-
   let assemblyFormat = "attr-dict $value";
 
   let hasVerifier = 1;
diff --git a/clang/lib/CIR/CodeGen/CIRGenBuilder.h 
b/clang/lib/CIR/CodeGen/CIRGenBuilder.h
index 4f7ff5128d914..7d9988cc52c5a 100644
--- a/clang/lib/CIR/CodeGen/CIRGenBuilder.h
+++ b/clang/lib/CIR/CodeGen/CIRGenBuilder.h
@@ -185,7 +185,7 @@ class CIRGenBuilderTy : public cir::CIRBaseBuilderTy {
   // Creates constant nullptr for pointer type ty.
   cir::ConstantOp getNullPtr(mlir::Type ty, mlir::Location loc) {
 assert(!cir::MissingFeatures::targetCodeGenInfoGetNullPointer());
-return create(loc, ty, getConstPtrAttr(ty, 0));
+return create(loc, getConstPtrAttr(ty, 0));
   }
 
   mlir::Value createNeg(mlir::Value value) {
diff --git a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp 
b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
index f1561d1b26fc0..1e69ecae831e9 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
@@ -138,7 +138,7 @@ class ScalarExprEmitter : public 
StmtVisitor {
   mlir::Value VisitIntegerLiteral(const IntegerLiteral *e) {
 mlir::Type type = cgf.convertType(e->getType());
 return builder.create(
-cgf.getLoc(e->getExprLoc()), type,
+cgf.getLoc(e->getExprLoc()),
 builder.getAttr(type, e->getValue()));
   }
 
@@ -147,7 +147,7 @@ class ScalarExprEmitter : public 
StmtVisitor {
 assert(mlir::isa(type) &&
"expect floating-point type");
 return builder.create(
-cgf.getLoc(e->getExprLoc()), type,
+cgf.getLoc(e->getExprLoc()),
 builder.getAttr(type, e->getValue()));
   }
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp 
b/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
index fe8a5e7428a81..20b086ffdd850 100644
--- a/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRMemorySlot.cpp
@@ -34,7 +34,7 @@ llvm::SmallVector 
cir::AllocaOp::getPromotableSlots() {
 
 Value cir::AllocaOp::getDefaultValue(const MemorySlot &slot,
  OpBuilder &builder) {
-  return builder.create(getLoc(), slot.elemType,
+  return builder.create(getLoc(),
  cir::UndefAttr::get(slot.elemType));
 }
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lldb] [lldb] Fix SBTarget::ReadInstruction with flavor (PR #136034)

2025-04-22 Thread David Spickett via llvm-branch-commits

DavidSpickett wrote:

One of the release managers will do that for us at some point.

https://github.com/llvm/llvm-project/pull/136034
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [RISCV] Handle scalarized reductions in getArithmeticReductionCost (PR #136688)

2025-04-22 Thread Luke Lau via llvm-branch-commits

https://github.com/lukel97 approved this pull request.


https://github.com/llvm/llvm-project/pull/136688
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits


@@ -19,7 +19,7 @@ body: |
 ; GFX12-NEXT: {{  $}}
 ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable 
$sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1)
 ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec
-; GFX12-NEXT: S_WAIT_LOADCNT 0
+; GFX12-NEXT: S_WAIT_LOADCNT 1

jayfoad wrote:

> the data return & counter decrement can be done out of order.

No, generally the counter decrement *is* in order, otherwise it would never be 
useful to wait with a non-zero value. There are some specific cases where the 
decrement is not in order, which are handled by `counterOutOfOrder`.

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpMemoryOrderType enumeration in FAIL clause (PR #136313)

2025-04-22 Thread Krzysztof Parzyszek via llvm-branch-commits


@@ -4242,9 +4242,8 @@ struct OmpDeviceTypeClause {
 // OMP 5.2 15.8.3 extended-atomic, fail-clause ->
 //FAIL(memory-order)
 struct OmpFailClause {
-  WRAPPER_CLASS_BOILERPLATE(
-  OmpFailClause, common::Indirection);
-  CharBlock source;

kparzysz wrote:

`OmpClause` has `source` in it.

https://github.com/llvm/llvm-project/pull/136313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Pierre van Houtryve via llvm-branch-commits

Pierre-vh wrote:

### Merge activity

* **Apr 22, 8:43 AM EDT**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/135340).


https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Pierre van Houtryve via llvm-branch-commits

Pierre-vh wrote:

> Patch looks OK to me, unless you are still worried about the global_inv 
> loadcnt decrement ordering thing.

It's a bit concerning but in any case it's not the fault of this patch, so I'll 
land and track that separately. Same for the optimization

https://github.com/llvm/llvm-project/pull/135340
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Sam Tebbs via llvm-branch-commits


@@ -253,38 +253,38 @@ define i64 @not_dotp_i8_to_i64_has_neon_dotprod(ptr 
readonly %a, ptr readonly %b
 ; CHECK-MAXBW-SAME: ptr readonly [[A:%.*]], ptr readonly [[B:%.*]]) 
#[[ATTR1:[0-9]+]] {
 ; CHECK-MAXBW-NEXT:  entry:
 ; CHECK-MAXBW-NEXT:[[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP1:%.*]] = mul i64 [[TMP0]], 8
+; CHECK-MAXBW-NEXT:[[TMP1:%.*]] = mul i64 [[TMP0]], 16
 ; CHECK-MAXBW-NEXT:br i1 false, label [[SCALAR_PH:%.*]], label 
[[VECTOR_PH:%.*]]
 ; CHECK-MAXBW:   vector.ph:
 ; CHECK-MAXBW-NEXT:[[TMP2:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP3:%.*]] = mul i64 [[TMP2]], 8
+; CHECK-MAXBW-NEXT:[[TMP3:%.*]] = mul i64 [[TMP2]], 16
 ; CHECK-MAXBW-NEXT:[[N_MOD_VF:%.*]] = urem i64 1024, [[TMP3]]
 ; CHECK-MAXBW-NEXT:[[N_VEC:%.*]] = sub i64 1024, [[N_MOD_VF]]
 ; CHECK-MAXBW-NEXT:[[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:[[TMP5:%.*]] = mul i64 [[TMP4]], 8
+; CHECK-MAXBW-NEXT:[[TMP5:%.*]] = mul i64 [[TMP4]], 16
 ; CHECK-MAXBW-NEXT:[[TMP6:%.*]] = getelementptr i8, ptr [[A]], i64 
[[N_VEC]]
 ; CHECK-MAXBW-NEXT:[[TMP7:%.*]] = getelementptr i8, ptr [[B]], i64 
[[N_VEC]]
 ; CHECK-MAXBW-NEXT:br label [[VECTOR_BODY:%.*]]
 ; CHECK-MAXBW:   vector.body:
 ; CHECK-MAXBW-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ 
[[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-MAXBW-NEXT:[[VEC_PHI:%.*]] = phi  [ 
zeroinitializer, [[VECTOR_PH]] ], [ [[TMP15:%.*]], [[VECTOR_BODY]] ]
+; CHECK-MAXBW-NEXT:[[VEC_PHI:%.*]] = phi  [ 
zeroinitializer, [[VECTOR_PH]] ], [ [[PARTIAL_REDUCE:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-MAXBW-NEXT:[[TMP8:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:[[NEXT_GEP:%.*]] = getelementptr i8, ptr [[A]], i64 
[[TMP8]]
 ; CHECK-MAXBW-NEXT:[[TMP9:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:[[NEXT_GEP1:%.*]] = getelementptr i8, ptr [[B]], i64 
[[TMP9]]
 ; CHECK-MAXBW-NEXT:[[TMP10:%.*]] = getelementptr i8, ptr [[NEXT_GEP]], i32 0
-; CHECK-MAXBW-NEXT:[[WIDE_LOAD:%.*]] = load , ptr 
[[TMP10]], align 1
-; CHECK-MAXBW-NEXT:[[TMP11:%.*]] = zext  [[WIDE_LOAD]] to 

+; CHECK-MAXBW-NEXT:[[WIDE_LOAD:%.*]] = load , ptr 
[[TMP10]], align 1
 ; CHECK-MAXBW-NEXT:[[TMP12:%.*]] = getelementptr i8, ptr [[NEXT_GEP1]], 
i32 0
-; CHECK-MAXBW-NEXT:[[WIDE_LOAD2:%.*]] = load , ptr 
[[TMP12]], align 1
-; CHECK-MAXBW-NEXT:[[TMP13:%.*]] = zext  [[WIDE_LOAD2]] 
to 
-; CHECK-MAXBW-NEXT:[[TMP14:%.*]] = mul nuw nsw  
[[TMP13]], [[TMP11]]
-; CHECK-MAXBW-NEXT:[[TMP15]] = add  [[TMP14]], 
[[VEC_PHI]]
+; CHECK-MAXBW-NEXT:[[WIDE_LOAD2:%.*]] = load , ptr 
[[TMP12]], align 1
+; CHECK-MAXBW-NEXT:[[TMP15:%.*]] = zext  [[WIDE_LOAD2]] 
to 
+; CHECK-MAXBW-NEXT:[[TMP13:%.*]] = zext  [[WIDE_LOAD]] 
to 
+; CHECK-MAXBW-NEXT:[[TMP14:%.*]] = mul nuw nsw  
[[TMP15]], [[TMP13]]
+; CHECK-MAXBW-NEXT:[[PARTIAL_REDUCE]] = call  
@llvm.experimental.vector.partial.reduce.add.nxv2i64.nxv16i64( [[VEC_PHI]],  [[TMP14]])

SamTebbs33 wrote:

Ah it looks like what was previously too high a cost for it to choose a 16i8 -> 
2i64 partial reduction isn't sufficiently high now that the extend cost is 
hidden. I've made this permutation invalid.

https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Sam Tebbs via llvm-branch-commits


@@ -2376,6 +2327,59 @@ class VPReductionRecipe : public VPRecipeWithIRFlags {
   }
 };
 
+/// A recipe for forming partial reductions. In the loop, an accumulator and
+/// vector operand are added together and passed to the next iteration as the
+/// next accumulator. After the loop body, the accumulator is reduced to a
+/// scalar value.
+class VPPartialReductionRecipe : public VPReductionRecipe {

SamTebbs33 wrote:

Done.

https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [SPARC] Use op-then-neg instructions when we have VIS3 (PR #135717)

2025-04-22 Thread via llvm-branch-commits


@@ -316,4 +316,17 @@ def : Pat<(i64 (sext (i32 (bitconvert f32:$src, 
(MOVSTOSW $src)>;
 def : Pat<(f32 (bitconvert i32:$src)), (MOVWTOS $src)>;
 def : Pat<(i64 (bitconvert f64:$src)), (MOVDTOX $src)>;
 def : Pat<(f64 (bitconvert i64:$src)), (MOVXTOD $src)>;
+
+// OP-then-neg FP operations.
+def : Pat<(f32 (fneg (fadd f32:$rs1, f32:$rs2))), (FNADDS $rs1, $rs2)>;
+def : Pat<(f64 (fneg (fadd f64:$rs1, f64:$rs2))), (FNADDD $rs1, $rs2)>;
+def : Pat<(f32 (fneg (fmul f32:$rs1, f32:$rs2))), (FNMULS $rs1, $rs2)>;
+def : Pat<(f32 (fmul (fneg f32:$rs1), f32:$rs2)), (FNMULS $rs1, $rs2)>;
+def : Pat<(f32 (fmul f32:$rs1, (fneg f32:$rs2))), (FNMULS $rs1, $rs2)>;
+def : Pat<(f64 (fneg (fmul f64:$rs1, f64:$rs2))), (FNMULD $rs1, $rs2)>;
+def : Pat<(f64 (fmul (fneg f64:$rs1), f64:$rs2)), (FNMULD $rs1, $rs2)>;
+def : Pat<(f64 (fmul f64:$rs1, (fneg f64:$rs2))), (FNMULD $rs1, $rs2)>;
+def : Pat<(f64 (fneg (fmul (fpextend f32:$rs1), (fpextend f32:$rs2, 
(FNSMULD $rs1, $rs2)>;
+def : Pat<(f64 (fmul (fneg (fpextend f32:$rs1)), (fpextend f32:$rs2))), 
(FNSMULD $rs1, $rs2)>;
+def : Pat<(f64 (fmul (fpextend f32:$rs1), (fneg (fpextend f32:$rs2, 
(FNSMULD $rs1, $rs2)>;

koachan wrote:

Hmm, I see that isFNegFree only passes the type of the value, I'm not sure how 
to check for it.
In this case I care less about types (it works for f32 and f64 equally well) 
and more about expressions (it's only free in the very specific cases of 
`-(a+b)` and `-(a*b)`). Is it still possible to do that?

https://github.com/llvm/llvm-project/pull/135717
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -307,8 +340,10 @@ class SrcSafetyAnalysis {
 
   SrcState createEntryState() {
 SrcState S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-  S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+  S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+  S.SafeToDerefRegs = S.TrustedRegs;
+}

kbeyls wrote:

Nit pick: I guess
`S.SafeToDerefRegs = S.TrustedRegs` could be move to after the loop, rather 
than inside the loop?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Just adding a few more comments as I'm reading through the patch.
I haven't yet managed to read the full patch in detail yet

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135663

>From 32f9acc0f34cb69f532f7f2c5c6808d6e19aeb60 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Sat, 5 Apr 2025 14:54:01 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect authentication oracles

Implement the detection of authentication instructions whose results can
be inspected by an attacker to know whether authentication succeeded.

As the properties of output registers of authentication instructions are
inspected, add a second set of analysis-related classes to iterate over
the instructions in reverse order.
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  12 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 542 +
 .../AArch64/gs-pauth-authentication-oracles.s | 723 ++
 .../AArch64/gs-pauth-debug-output.s   |  78 ++
 4 files changed, 1355 insertions(+)
 create mode 100644 
bolt/test/binary-analysis/AArch64/gs-pauth-authentication-oracles.s

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 9a4e9598c9f98..7c51ba42c3294 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -260,6 +260,15 @@ class ClobberingInfo : public ExtraInfo {
   void print(raw_ostream &OS, const MCInstReference Location) const override;
 };
 
+class LeakageInfo : public ExtraInfo {
+  SmallVector LeakingInstrs;
+
+public:
+  LeakageInfo(const ArrayRef Instrs) : LeakingInstrs(Instrs) 
{}
+
+  void print(raw_ostream &OS, const MCInstReference Location) const override;
+};
+
 /// A brief version of a report that can be further augmented with the details.
 ///
 /// It is common for a particular type of gadget detector to be tied to some
@@ -301,6 +310,9 @@ class FunctionAnalysis {
   void findUnsafeUses(SmallVector> &Reports);
   void augmentUnsafeUseReports(ArrayRef> Reports);
 
+  void findUnsafeDefs(SmallVector> &Reports);
+  void augmentUnsafeDefReports(const ArrayRef> Reports);
+
   /// Process the reports which do not have to be augmented, and remove them
   /// from Reports.
   void handleSimpleReports(SmallVector> &Reports);
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 0bd9a1ed5941c..fff52de8b9c73 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -712,6 +712,459 @@ SrcSafetyAnalysis::create(BinaryFunction &BF,
RegsToTrackInstsFor);
 }
 
+/// A state representing which registers are safe to be used as the destination
+/// operand of an authentication instruction.
+///
+/// Similar to SrcState, it is the analysis that should take register aliasing
+/// into account.
+///
+/// Depending on the implementation, it may be possible that an authentication
+/// instruction returns an invalid pointer on failure instead of terminating
+/// the program immediately (assuming the program will crash as soon as that
+/// pointer is dereferenced). To prevent brute-forcing the correct signature,
+/// it should be impossible for an attacker to test if a pointer is correctly
+/// signed - either the program should be terminated on authentication failure
+/// or it should be impossible to tell whether authentication succeeded or not.
+///
+/// For that reason, a restricted set of operations is allowed on any register
+/// containing a value derived from the result of an authentication instruction
+/// until that register is either wiped or checked not to contain a result of a
+/// failed authentication.
+///
+/// Specifically, the safety property for a register is computed by iterating
+/// the instructions in backward order: the source register Xn of an 
instruction
+/// Inst is safe if at least one of the following is true:
+/// * Inst checks if Xn contains the result of a successful authentication and
+///   terminates the program on failure. Note that Inst can either naturally
+///   dereference Xn (load, branch, return, etc. instructions) or be the first
+///   instruction of an explicit checking sequence.
+/// * Inst performs safe address arithmetic AND both source and result
+///   registers, as well as any temporary registers, must be safe after
+///   execution of Inst (temporaries are not used on AArch64 and thus not
+///   currently supported/allowed).
+///   See MCPlusBuilder::analyzeAddressArithmeticsForPtrAuth for the details.
+/// * Inst fully overwrites Xn with an unrelated value.
+struct DstState {
+  /// The set of registers whose values cannot be inspected by an attacker in
+  /// a way usable as an authentication oracle. The results of authentication
+  /// instructions should be written to such registers.
+  BitVector CannotEscapeUnchecked;
+
+  std::vector> FirstInstLeakingReg;
+
+  /// Construct an empty state.
+  DstState() {}
+
+  DstState(unsigned NumRegs, unsigned NumRegsToTrack)
+  : CannotE

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136183

>From 629a4239c9d88f49ccf723ab0b7a13e5a9ad0144 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Thu, 17 Apr 2025 20:51:16 +0300
Subject: [PATCH] [BOLT] Gadget scanner: improve handling of unreachable basic
 blocks

Instead of refusing to analyze an instruction completely, when it is
unreachable according to the CFG reconstructed by BOLT, pessimistically
assume all registers to be unsafe at the start of basic blocks without
any predecessors. Nevertheless, unreachable basic blocks found in
optimized code likely means imprecise CFG reconstruction, thus report a
warning once per basic block without predecessors.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 46 ++-
 .../AArch64/gs-pacret-autiasp.s   |  7 ++-
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 57 +++
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index f7ac0b67d00da..0ce9f51c44af4 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -344,6 +344,12 @@ class SrcSafetyAnalysis {
 return S;
   }
 
+  /// Creates a state with all registers marked unsafe (not to be confused
+  /// with empty state).
+  SrcState createUnsafeState() const {
+return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+  }
+
   BitVector getClobberedRegs(const MCInst &Point) const {
 BitVector Clobbered(NumRegs);
 // Assume a call can clobber all registers, including callee-saved
@@ -581,6 +587,13 @@ class DataflowSrcSafetyAnalysis
 if (BB.isEntryPoint())
   return createEntryState();
 
+// If a basic block without any predecessors is found in an optimized code,
+// this likely means that some CFG edges were not detected. Pessimistically
+// assume all registers to be unsafe before this basic block and warn about
+// this fact in FunctionAnalysis::findUnsafeUses().
+if (BB.pred_empty())
+  return createUnsafeState();
+
 return SrcState();
   }
 
@@ -654,12 +667,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -1328,19 +1335,30 @@ void FunctionAnalysis::findUnsafeUses(
 BF.dump();
   });
 
+  if (BF.hasCFG()) {
+// Warn on basic blocks being unreachable according to BOLT, as this
+// likely means CFG is imprecise.
+for (BinaryBasicBlock &BB : BF) {
+  if (!BB.pred_empty() || BB.isEntryPoint())
+continue;
+  // Arbitrarily attach the report to the first instruction of BB.
+  MCInst *InstToReport = BB.getFirstNonPseudoInstr();
+  if (!InstToReport)
+continue; // BB has no real instructions
+
+  Reports.push_back(
+  make_generic_report(MCInstReference::get(InstToReport, BF),
+  "Warning: no predecessor basic blocks detected "
+  "(possibly incomplete CFG)"));
+}
+  }
+
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
 if (BC.MIB->isCFI(Inst))
   return;
 
 const SrcState &S = Analysis->getStateBefore(Inst);
-
-// If non-empty state was never propagated from the entry basic block
-// to Inst, assume it to be unreachable and report a warning.
-if (S.empty()) {
-  Reports.push_back(
-  make_generic_report(Inst, "Warning: unreachable instruction found"));
-  return;
-}
+assert(!S.empty() && "Instruction has no associated state");
 
 if (auto Report = shouldReportReturnGadget(BC, Inst, S))
   Reports.push_back(*Report);
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s 
b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 284f0bea607a5..6559ba336e8de 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -215,12 +215,17 @@ f_callclobbered_calleesaved:
 .globl  f_unreachable_instruction
 .type   f_unreachable_instruction,@function
 f_unreachable_instruction:
-// CHECK-LABEL: GS-PAUTH: Warning: unreachable instruction found in function 
f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address
+// CHECK-LABEL: GS-PAUTH: Warning: no predecessor basic blocks detected 
(possibly incomplete CFG) in function f_unreachable_instruction, basic block 
{{[0-9a-zA-Z.]+}}, at address
 // CHECK-NEXT:The instruction is {{[0-9a-f]+}}:   add x0, x1, 
x2
 // CHECK-NOT:   instructions that wr

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136183

>From 629a4239c9d88f49ccf723ab0b7a13e5a9ad0144 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Thu, 17 Apr 2025 20:51:16 +0300
Subject: [PATCH] [BOLT] Gadget scanner: improve handling of unreachable basic
 blocks

Instead of refusing to analyze an instruction completely, when it is
unreachable according to the CFG reconstructed by BOLT, pessimistically
assume all registers to be unsafe at the start of basic blocks without
any predecessors. Nevertheless, unreachable basic blocks found in
optimized code likely means imprecise CFG reconstruction, thus report a
warning once per basic block without predecessors.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 46 ++-
 .../AArch64/gs-pacret-autiasp.s   |  7 ++-
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 57 +++
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index f7ac0b67d00da..0ce9f51c44af4 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -344,6 +344,12 @@ class SrcSafetyAnalysis {
 return S;
   }
 
+  /// Creates a state with all registers marked unsafe (not to be confused
+  /// with empty state).
+  SrcState createUnsafeState() const {
+return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+  }
+
   BitVector getClobberedRegs(const MCInst &Point) const {
 BitVector Clobbered(NumRegs);
 // Assume a call can clobber all registers, including callee-saved
@@ -581,6 +587,13 @@ class DataflowSrcSafetyAnalysis
 if (BB.isEntryPoint())
   return createEntryState();
 
+// If a basic block without any predecessors is found in an optimized code,
+// this likely means that some CFG edges were not detected. Pessimistically
+// assume all registers to be unsafe before this basic block and warn about
+// this fact in FunctionAnalysis::findUnsafeUses().
+if (BB.pred_empty())
+  return createUnsafeState();
+
 return SrcState();
   }
 
@@ -654,12 +667,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -1328,19 +1335,30 @@ void FunctionAnalysis::findUnsafeUses(
 BF.dump();
   });
 
+  if (BF.hasCFG()) {
+// Warn on basic blocks being unreachable according to BOLT, as this
+// likely means CFG is imprecise.
+for (BinaryBasicBlock &BB : BF) {
+  if (!BB.pred_empty() || BB.isEntryPoint())
+continue;
+  // Arbitrarily attach the report to the first instruction of BB.
+  MCInst *InstToReport = BB.getFirstNonPseudoInstr();
+  if (!InstToReport)
+continue; // BB has no real instructions
+
+  Reports.push_back(
+  make_generic_report(MCInstReference::get(InstToReport, BF),
+  "Warning: no predecessor basic blocks detected "
+  "(possibly incomplete CFG)"));
+}
+  }
+
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
 if (BC.MIB->isCFI(Inst))
   return;
 
 const SrcState &S = Analysis->getStateBefore(Inst);
-
-// If non-empty state was never propagated from the entry basic block
-// to Inst, assume it to be unreachable and report a warning.
-if (S.empty()) {
-  Reports.push_back(
-  make_generic_report(Inst, "Warning: unreachable instruction found"));
-  return;
-}
+assert(!S.empty() && "Instruction has no associated state");
 
 if (auto Report = shouldReportReturnGadget(BC, Inst, S))
   Reports.push_back(*Report);
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s 
b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 284f0bea607a5..6559ba336e8de 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -215,12 +215,17 @@ f_callclobbered_calleesaved:
 .globl  f_unreachable_instruction
 .type   f_unreachable_instruction,@function
 f_unreachable_instruction:
-// CHECK-LABEL: GS-PAUTH: Warning: unreachable instruction found in function 
f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address
+// CHECK-LABEL: GS-PAUTH: Warning: no predecessor basic blocks detected 
(possibly incomplete CFG) in function f_unreachable_instruction, basic block 
{{[0-9a-zA-Z.]+}}, at address
 // CHECK-NEXT:The instruction is {{[0-9a-f]+}}:   add x0, x1, 
x2
 // CHECK-NOT:   instructions that wr

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (PR #136147)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136147

>From fa8766c5a9a5bf8c8859bc2f4b9e9e7446ec0015 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Thu, 17 Apr 2025 15:40:05 +0300
Subject: [PATCH] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks
 interface

Clarify the semantics of `getAuthenticatedReg` and remove a redundant
`isAuthenticationOfReg` method, as combined auth+something instructions
(such as `retaa` on AArch64) should be handled carefully, especially
when searching for authentication oracles: usually, such instructions
cannot be authentication oracles and only some of them actually write
an authenticated pointer to a register (such as "ldra x0, [x1]!").

Use `std::optional` returned type instead of plain MCPhysReg
and returning `getNoRegister()` as a "not applicable" indication.

Document a few existing methods, add information about preconditions.
---
 bolt/include/bolt/Core/MCPlusBuilder.h| 61 ++-
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 64 +---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 76 ---
 .../AArch64/gs-pauth-debug-output.s   |  3 -
 .../AArch64/gs-pauth-signing-oracles.s| 20 +
 5 files changed, 130 insertions(+), 94 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 132d58f3f9f79..83ad70ea97076 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -562,30 +562,50 @@ class MCPlusBuilder {
 return {};
   }
 
-  virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
-llvm_unreachable("not implemented");
-return getNoRegister();
-  }
-
-  virtual bool isAuthenticationOfReg(const MCInst &Inst,
- MCPhysReg AuthenticatedReg) const {
+  /// Returns the register where an authenticated pointer is written to by 
Inst,
+  /// or std::nullopt if not authenticating any register.
+  ///
+  /// Sets IsChecked if the instruction always checks authenticated pointer,
+  /// i.e. it either returns a successfully authenticated pointer or terminates
+  /// the program abnormally (such as "ldra x0, [x1]!" on AArch64, which 
crashes
+  /// on authentication failure even if FEAT_FPAC is not implemented).
+  virtual std::optional
+  getWrittenAuthenticatedReg(const MCInst &Inst, bool &IsChecked) const {
 llvm_unreachable("not implemented");
-return false;
+return std::nullopt;
   }
 
-  virtual MCPhysReg getSignedReg(const MCInst &Inst) const {
+  /// Returns the register signed by Inst, or std::nullopt if not signing any
+  /// register.
+  ///
+  /// The returned register is assumed to be both input and output operand,
+  /// as it is done on AArch64.
+  virtual std::optional getSignedReg(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
-  virtual ErrorOr getRegUsedAsRetDest(const MCInst &Inst) const {
+  /// Returns the register used as a return address. Returns std::nullopt if
+  /// not applicable, such as reading the return address from a system register
+  /// or from the stack.
+  ///
+  /// Sets IsAuthenticatedInternally if the instruction accepts a signed
+  /// pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called when isReturn(Inst) is true.
+  virtual std::optional
+  getRegUsedAsRetDest(const MCInst &Inst,
+  bool &IsAuthenticatedInternally) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
   /// Returns the register used as the destination of an indirect branch or 
call
   /// instruction. Sets IsAuthenticatedInternally if the instruction accepts
   /// a signed pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called if isIndirectCall(Inst) or isIndirectBranch(Inst)
+  /// returns true.
   virtual MCPhysReg
   getRegUsedAsIndirectBranchDest(const MCInst &Inst,
  bool &IsAuthenticatedInternally) const {
@@ -602,14 +622,14 @@ class MCPlusBuilder {
   ///controlled, under the Pointer Authentication threat model.
   ///
   /// If the instruction does not write to any register satisfying the above
-  /// two conditions, NoRegister is returned.
+  /// two conditions, std::nullopt is returned.
   ///
   /// The Pointer Authentication threat model assumes an attacker is able to
   /// modify any writable memory, but not executable code (due to W^X).
-  virtual MCPhysReg
+  virtual std::optional
   getMaterializedAddressRegForPtrAuth(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
   /// Analyzes if this instruction can safely perform address arithmetics
@@ -622,10 +642,13 @@ class MCPlusBuilder {
   /// controlled, provided InReg and executa

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (PR #136147)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136147

>From fa8766c5a9a5bf8c8859bc2f4b9e9e7446ec0015 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Thu, 17 Apr 2025 15:40:05 +0300
Subject: [PATCH] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks
 interface

Clarify the semantics of `getAuthenticatedReg` and remove a redundant
`isAuthenticationOfReg` method, as combined auth+something instructions
(such as `retaa` on AArch64) should be handled carefully, especially
when searching for authentication oracles: usually, such instructions
cannot be authentication oracles and only some of them actually write
an authenticated pointer to a register (such as "ldra x0, [x1]!").

Use `std::optional` returned type instead of plain MCPhysReg
and returning `getNoRegister()` as a "not applicable" indication.

Document a few existing methods, add information about preconditions.
---
 bolt/include/bolt/Core/MCPlusBuilder.h| 61 ++-
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 64 +---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 76 ---
 .../AArch64/gs-pauth-debug-output.s   |  3 -
 .../AArch64/gs-pauth-signing-oracles.s| 20 +
 5 files changed, 130 insertions(+), 94 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 132d58f3f9f79..83ad70ea97076 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -562,30 +562,50 @@ class MCPlusBuilder {
 return {};
   }
 
-  virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
-llvm_unreachable("not implemented");
-return getNoRegister();
-  }
-
-  virtual bool isAuthenticationOfReg(const MCInst &Inst,
- MCPhysReg AuthenticatedReg) const {
+  /// Returns the register where an authenticated pointer is written to by 
Inst,
+  /// or std::nullopt if not authenticating any register.
+  ///
+  /// Sets IsChecked if the instruction always checks authenticated pointer,
+  /// i.e. it either returns a successfully authenticated pointer or terminates
+  /// the program abnormally (such as "ldra x0, [x1]!" on AArch64, which 
crashes
+  /// on authentication failure even if FEAT_FPAC is not implemented).
+  virtual std::optional
+  getWrittenAuthenticatedReg(const MCInst &Inst, bool &IsChecked) const {
 llvm_unreachable("not implemented");
-return false;
+return std::nullopt;
   }
 
-  virtual MCPhysReg getSignedReg(const MCInst &Inst) const {
+  /// Returns the register signed by Inst, or std::nullopt if not signing any
+  /// register.
+  ///
+  /// The returned register is assumed to be both input and output operand,
+  /// as it is done on AArch64.
+  virtual std::optional getSignedReg(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
-  virtual ErrorOr getRegUsedAsRetDest(const MCInst &Inst) const {
+  /// Returns the register used as a return address. Returns std::nullopt if
+  /// not applicable, such as reading the return address from a system register
+  /// or from the stack.
+  ///
+  /// Sets IsAuthenticatedInternally if the instruction accepts a signed
+  /// pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called when isReturn(Inst) is true.
+  virtual std::optional
+  getRegUsedAsRetDest(const MCInst &Inst,
+  bool &IsAuthenticatedInternally) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
   /// Returns the register used as the destination of an indirect branch or 
call
   /// instruction. Sets IsAuthenticatedInternally if the instruction accepts
   /// a signed pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called if isIndirectCall(Inst) or isIndirectBranch(Inst)
+  /// returns true.
   virtual MCPhysReg
   getRegUsedAsIndirectBranchDest(const MCInst &Inst,
  bool &IsAuthenticatedInternally) const {
@@ -602,14 +622,14 @@ class MCPlusBuilder {
   ///controlled, under the Pointer Authentication threat model.
   ///
   /// If the instruction does not write to any register satisfying the above
-  /// two conditions, NoRegister is returned.
+  /// two conditions, std::nullopt is returned.
   ///
   /// The Pointer Authentication threat model assumes an attacker is able to
   /// modify any writable memory, but not executable code (due to W^X).
-  virtual MCPhysReg
+  virtual std::optional
   getMaterializedAddressRegForPtrAuth(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
   /// Analyzes if this instruction can safely perform address arithmetics
@@ -622,10 +642,13 @@ class MCPlusBuilder {
   /// controlled, provided InReg and executa

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -307,8 +340,10 @@ class SrcSafetyAnalysis {
 
   SrcState createEntryState() {
 SrcState S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-  S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+  S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+  S.SafeToDerefRegs = S.TrustedRegs;
+}

atrosinenko wrote:

Fixed, thank you!

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135662

>From 1419bfe2de741c5832de0ceb1b1765ba7b92022d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 14 Apr 2025 15:08:54 +0300
Subject: [PATCH 1/3] [BOLT] Gadget scanner: refactor issue reporting

Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from
the base `Report` class. Instead, make `Report` always represent the
brief version of the report. When an issue is detected on the first run
of the analysis, return an optional request for extra details to attach
to the report on the second run.
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 102 ++---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 200 ++
 .../AArch64/gs-pauth-debug-output.s   |   8 +-
 3 files changed, 187 insertions(+), 123 deletions(-)

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 4c1bef3d2265f..3b6c1f6af94a0 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -219,11 +219,6 @@ struct Report {
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const = 0;
 
-  // The two methods below are called by Analysis::computeDetailedInfo when
-  // iterating over the reports.
-  virtual ArrayRef getAffectedRegisters() const { return {}; }
-  virtual void setOverwritingInstrs(ArrayRef Instrs) {}
-
   void printBasicInfo(raw_ostream &OS, const BinaryContext &BC,
   StringRef IssueKind) const;
 };
@@ -231,27 +226,11 @@ struct Report {
 struct GadgetReport : public Report {
   // The particular kind of gadget that is detected.
   const GadgetKind &Kind;
-  // The set of registers related to this gadget report (possibly empty).
-  SmallVector AffectedRegisters;
-  // The instructions that clobber the affected registers.
-  // There is no one-to-one correspondence with AffectedRegisters: for example,
-  // the same register can be overwritten by different instructions in 
different
-  // preceding basic blocks.
-  SmallVector OverwritingInstrs;
-
-  GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   MCPhysReg AffectedRegister)
-  : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
-
-  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
-  ArrayRef getAffectedRegisters() const override {
-return AffectedRegisters;
-  }
+  GadgetReport(const GadgetKind &Kind, MCInstReference Location)
+  : Report(Location), Kind(Kind) {}
 
-  void setOverwritingInstrs(ArrayRef Instrs) override {
-OverwritingInstrs.assign(Instrs.begin(), Instrs.end());
-  }
+  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 };
 
 /// Report with a free-form message attached.
@@ -263,8 +242,75 @@ struct GenericReport : public Report {
   const BinaryContext &BC) const override;
 };
 
+/// An information about an issue collected on the slower, detailed,
+/// run of an analysis.
+class ExtraInfo {
+public:
+  virtual void print(raw_ostream &OS, const MCInstReference Location) const = 
0;
+
+  virtual ~ExtraInfo() {}
+};
+
+class ClobberingInfo : public ExtraInfo {
+  SmallVector ClobberingInstrs;
+
+public:
+  ClobberingInfo(const ArrayRef Instrs)
+  : ClobberingInstrs(Instrs) {}
+
+  void print(raw_ostream &OS, const MCInstReference Location) const override;
+};
+
+/// A brief version of a report that can be further augmented with the details.
+///
+/// It is common for a particular type of gadget detector to be tied to some
+/// specific kind of analysis. If an issue is returned by that detector, it may
+/// be further augmented with the detailed info in an analysis-specific way,
+/// or just be left as-is (f.e. if a free-form warning was reported).
+template  struct BriefReport {
+  BriefReport(std::shared_ptr Issue,
+  const std::optional RequestedDetails)
+  : Issue(Issue), RequestedDetails(RequestedDetails) {}
+
+  std::shared_ptr Issue;
+  std::optional RequestedDetails;
+};
+
+/// A detailed version of a report.
+struct DetailedReport {
+  DetailedReport(std::shared_ptr Issue,
+ std::shared_ptr Details)
+  : Issue(Issue), Details(Details) {}
+
+  std::shared_ptr Issue;
+  std::shared_ptr Details;
+};
+
 struct FunctionAnalysisResult {
-  std::vector> Diagnostics;
+  std::vector Diagnostics;
+};
+
+/// A helper class storing per-function context to be instantiated by Analysis.
+class FunctionAnalysis {
+  BinaryContext &BC;
+  BinaryFunction &BF;
+  MCPlusBuilder::AllocatorIdTy AllocatorId;
+  FunctionAnalysisResult Result;
+
+  bool PacRetGadgetsOnly;
+
+  void findUnsafeUses(SmallVector> &Reports);
+  void augmentUnsafeUseReports(const ArrayRef> Reports);
+
+public:
+  FunctionAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy 
AllocatorId,
+ 

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135663

>From 32f9acc0f34cb69f532f7f2c5c6808d6e19aeb60 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Sat, 5 Apr 2025 14:54:01 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect authentication oracles

Implement the detection of authentication instructions whose results can
be inspected by an attacker to know whether authentication succeeded.

As the properties of output registers of authentication instructions are
inspected, add a second set of analysis-related classes to iterate over
the instructions in reverse order.
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  12 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 542 +
 .../AArch64/gs-pauth-authentication-oracles.s | 723 ++
 .../AArch64/gs-pauth-debug-output.s   |  78 ++
 4 files changed, 1355 insertions(+)
 create mode 100644 
bolt/test/binary-analysis/AArch64/gs-pauth-authentication-oracles.s

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 9a4e9598c9f98..7c51ba42c3294 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -260,6 +260,15 @@ class ClobberingInfo : public ExtraInfo {
   void print(raw_ostream &OS, const MCInstReference Location) const override;
 };
 
+class LeakageInfo : public ExtraInfo {
+  SmallVector LeakingInstrs;
+
+public:
+  LeakageInfo(const ArrayRef Instrs) : LeakingInstrs(Instrs) 
{}
+
+  void print(raw_ostream &OS, const MCInstReference Location) const override;
+};
+
 /// A brief version of a report that can be further augmented with the details.
 ///
 /// It is common for a particular type of gadget detector to be tied to some
@@ -301,6 +310,9 @@ class FunctionAnalysis {
   void findUnsafeUses(SmallVector> &Reports);
   void augmentUnsafeUseReports(ArrayRef> Reports);
 
+  void findUnsafeDefs(SmallVector> &Reports);
+  void augmentUnsafeDefReports(const ArrayRef> Reports);
+
   /// Process the reports which do not have to be augmented, and remove them
   /// from Reports.
   void handleSimpleReports(SmallVector> &Reports);
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 0bd9a1ed5941c..fff52de8b9c73 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -712,6 +712,459 @@ SrcSafetyAnalysis::create(BinaryFunction &BF,
RegsToTrackInstsFor);
 }
 
+/// A state representing which registers are safe to be used as the destination
+/// operand of an authentication instruction.
+///
+/// Similar to SrcState, it is the analysis that should take register aliasing
+/// into account.
+///
+/// Depending on the implementation, it may be possible that an authentication
+/// instruction returns an invalid pointer on failure instead of terminating
+/// the program immediately (assuming the program will crash as soon as that
+/// pointer is dereferenced). To prevent brute-forcing the correct signature,
+/// it should be impossible for an attacker to test if a pointer is correctly
+/// signed - either the program should be terminated on authentication failure
+/// or it should be impossible to tell whether authentication succeeded or not.
+///
+/// For that reason, a restricted set of operations is allowed on any register
+/// containing a value derived from the result of an authentication instruction
+/// until that register is either wiped or checked not to contain a result of a
+/// failed authentication.
+///
+/// Specifically, the safety property for a register is computed by iterating
+/// the instructions in backward order: the source register Xn of an 
instruction
+/// Inst is safe if at least one of the following is true:
+/// * Inst checks if Xn contains the result of a successful authentication and
+///   terminates the program on failure. Note that Inst can either naturally
+///   dereference Xn (load, branch, return, etc. instructions) or be the first
+///   instruction of an explicit checking sequence.
+/// * Inst performs safe address arithmetic AND both source and result
+///   registers, as well as any temporary registers, must be safe after
+///   execution of Inst (temporaries are not used on AArch64 and thus not
+///   currently supported/allowed).
+///   See MCPlusBuilder::analyzeAddressArithmeticsForPtrAuth for the details.
+/// * Inst fully overwrites Xn with an unrelated value.
+struct DstState {
+  /// The set of registers whose values cannot be inspected by an attacker in
+  /// a way usable as an authentication oracle. The results of authentication
+  /// instructions should be written to such registers.
+  BitVector CannotEscapeUnchecked;
+
+  std::vector> FirstInstLeakingReg;
+
+  /// Construct an empty state.
+  DstState() {}
+
+  DstState(unsigned NumRegs, unsigned NumRegsToTrack)
+  : CannotE

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (PR #136151)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136151

>From 323acbd89b97976a4553e781bee42d6e4560908d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 15 Apr 2025 21:47:18 +0300
Subject: [PATCH] [BOLT] Gadget scanner: do not crash on debug-printing CFI
 instructions

Some instruction-printing code used under LLVM_DEBUG does not handle CFI
instructions well. While CFI instructions seem to be harmless for the
correctness of the analysis results, they do not convey any useful
information to the analysis either, so skip them early.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 16 ++
 .../AArch64/gs-pauth-debug-output.s   | 32 +++
 2 files changed, 48 insertions(+)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 849272cac73d2..f7ac0b67d00da 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -431,6 +431,9 @@ class SrcSafetyAnalysis {
   }
 
   SrcState computeNext(const MCInst &Point, const SrcState &Cur) {
+if (BC.MIB->isCFI(Point))
+  return Cur;
+
 SrcStatePrinter P(BC);
 LLVM_DEBUG({
   dbgs() << "  SrcSafetyAnalysis::ComputeNext(";
@@ -670,6 +673,8 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
+  if (BC.MIB->isCFI(Inst))
+continue;
 
   // If there is a label before this instruction, it is possible that it
   // can be jumped-to, thus conservatively resetting S. As an exception,
@@ -947,6 +952,9 @@ class DstSafetyAnalysis {
   }
 
   DstState computeNext(const MCInst &Point, const DstState &Cur) {
+if (BC.MIB->isCFI(Point))
+  return Cur;
+
 DstStatePrinter P(BC);
 LLVM_DEBUG({
   dbgs() << "  DstSafetyAnalysis::ComputeNext(";
@@ -1123,6 +1131,8 @@ class CFGUnawareDstSafetyAnalysis : public 
DstSafetyAnalysis {
 DstState S = createUnsafeState();
 for (auto &I : llvm::reverse(BF.instrs())) {
   MCInst &Inst = I.second;
+  if (BC.MIB->isCFI(Inst))
+continue;
 
   // If Inst can change the control flow, we cannot be sure that the next
   // instruction (to be executed in analyzed program) is the one processed
@@ -1319,6 +1329,9 @@ void FunctionAnalysis::findUnsafeUses(
   });
 
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
+if (BC.MIB->isCFI(Inst))
+  return;
+
 const SrcState &S = Analysis->getStateBefore(Inst);
 
 // If non-empty state was never propagated from the entry basic block
@@ -1382,6 +1395,9 @@ void FunctionAnalysis::findUnsafeDefs(
   });
 
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
+if (BC.MIB->isCFI(Inst))
+  return;
+
 const DstState &S = Analysis->getStateAfter(Inst);
 
 if (auto Report = shouldReportAuthOracle(BC, Inst, S))
diff --git a/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s 
b/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
index fd55880921d06..07b61bea77e94 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
@@ -329,6 +329,38 @@ auth_oracle:
 // PAUTH-EMPTY:
 // PAUTH-NEXT:   Attaching leakage info to: :  autia   x0, x1 
# DataflowDstSafetyAnalysis: dst-state
 
+// Gadget scanner should not crash on CFI instructions, including when 
debug-printing them.
+// Note that the particular debug output is not checked, but BOLT should be
+// compiled with assertions enabled to support -debug-only argument.
+
+.globl  cfi_inst_df
+.type   cfi_inst_df,@function
+cfi_inst_df:
+.cfi_startproc
+sub sp, sp, #16
+.cfi_def_cfa_offset 16
+add sp, sp, #16
+.cfi_def_cfa_offset 0
+ret
+.size   cfi_inst_df, .-cfi_inst_df
+.cfi_endproc
+
+.globl  cfi_inst_nocfg
+.type   cfi_inst_nocfg,@function
+cfi_inst_nocfg:
+.cfi_startproc
+sub sp, sp, #16
+.cfi_def_cfa_offset 16
+
+adr x0, 1f
+br  x0
+1:
+add sp, sp, #16
+.cfi_def_cfa_offset 0
+ret
+.size   cfi_inst_nocfg, .-cfi_inst_nocfg
+.cfi_endproc
+
 // CHECK-LABEL:Analyzing function main, AllocatorId = 1
 .globl  main
 .type   main,@function

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135661

>From b811de675aa768e8a7c944b7a25d137eb4b21985 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 14 Apr 2025 14:35:56 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: use more appropriate types (NFC)

* use more flexible `const ArrayRef` and `StringRef` types instead of
  `const std::vector &` and `const std::string &`, correspondingly,
  for function arguments
* return plain `const SrcState &` instead of `ErrorOr`
  from `SrcSafetyAnalysis::getStateBefore`, as absent state is not
  handled gracefully by any caller
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  8 +---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 39 ---
 2 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 6765e2aff414f..3e39b64e59e0f 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -12,7 +12,6 @@
 #include "bolt/Core/BinaryContext.h"
 #include "bolt/Core/BinaryFunction.h"
 #include "bolt/Passes/BinaryPasses.h"
-#include "llvm/ADT/SmallSet.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
@@ -199,9 +198,6 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
-class SrcSafetyAnalysis;
-struct SrcState;
-
 /// Description of a gadget kind that can be detected. Intended to be
 /// statically allocated to be attached to reports by reference.
 class GadgetKind {
@@ -210,7 +206,7 @@ class GadgetKind {
 public:
   GadgetKind(const char *Description) : Description(Description) {}
 
-  const StringRef getDescription() const { return Description; }
+  StringRef getDescription() const { return Description; }
 };
 
 /// Base report located at some instruction, without any additional 
information.
@@ -261,7 +257,7 @@ struct GadgetReport : public Report {
 /// Report with a free-form message attached.
 struct GenericReport : public Report {
   std::string Text;
-  GenericReport(MCInstReference Location, const std::string &Text)
+  GenericReport(MCInstReference Location, StringRef Text)
   : Report(Location), Text(Text) {}
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const override;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 4f601558dec4e..339673b600765 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -91,14 +91,14 @@ class TrackedRegisters {
   const std::vector Registers;
   std::vector RegToIndexMapping;
 
-  static size_t getMappingSize(const std::vector &RegsToTrack) {
+  static size_t getMappingSize(const ArrayRef RegsToTrack) {
 if (RegsToTrack.empty())
   return 0;
 return 1 + *llvm::max_element(RegsToTrack);
   }
 
 public:
-  TrackedRegisters(const std::vector &RegsToTrack)
+  TrackedRegisters(const ArrayRef RegsToTrack)
   : Registers(RegsToTrack),
 RegToIndexMapping(getMappingSize(RegsToTrack), NoIndex) {
 for (unsigned I = 0; I < RegsToTrack.size(); ++I)
@@ -234,7 +234,7 @@ struct SrcState {
 
 static void printLastInsts(
 raw_ostream &OS,
-const std::vector> &LastInstWritingReg) {
+const ArrayRef> LastInstWritingReg) {
   OS << "Insts: ";
   for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) {
 auto &Set = LastInstWritingReg[I];
@@ -295,7 +295,7 @@ void SrcStatePrinter::print(raw_ostream &OS, const SrcState 
&S) const {
 class SrcSafetyAnalysis {
 public:
   SrcSafetyAnalysis(BinaryFunction &BF,
-const std::vector &RegsToTrackInstsFor)
+const ArrayRef RegsToTrackInstsFor)
   : BC(BF.getBinaryContext()), NumRegs(BC.MRI->getNumRegs()),
 RegsToTrackInstsFor(RegsToTrackInstsFor) {}
 
@@ -303,11 +303,10 @@ class SrcSafetyAnalysis {
 
   static std::shared_ptr
   create(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId,
- const std::vector &RegsToTrackInstsFor);
+ const ArrayRef RegsToTrackInstsFor);
 
   virtual void run() = 0;
-  virtual ErrorOr
-  getStateBefore(const MCInst &Inst) const = 0;
+  virtual const SrcState &getStateBefore(const MCInst &Inst) const = 0;
 
 protected:
   BinaryContext &BC;
@@ -347,7 +346,7 @@ class SrcSafetyAnalysis {
   }
 
   BitVector getClobberedRegs(const MCInst &Point) const {
-BitVector Clobbered(NumRegs, false);
+BitVector Clobbered(NumRegs);
 // Assume a call can clobber all registers, including callee-saved
 // registers. There's a good chance that callee-saved registers will be
 // saved on the stack at some point during execution of the callee.
@@ -408,8 +407,7 @@ class SrcSafetyAnalysis {
 
   // FirstCheckerInst should belong to the same basic block, meaning
   // it was deterministically processed a few steps before this 
ins

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (PR #136151)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136151

>From 323acbd89b97976a4553e781bee42d6e4560908d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 15 Apr 2025 21:47:18 +0300
Subject: [PATCH] [BOLT] Gadget scanner: do not crash on debug-printing CFI
 instructions

Some instruction-printing code used under LLVM_DEBUG does not handle CFI
instructions well. While CFI instructions seem to be harmless for the
correctness of the analysis results, they do not convey any useful
information to the analysis either, so skip them early.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 16 ++
 .../AArch64/gs-pauth-debug-output.s   | 32 +++
 2 files changed, 48 insertions(+)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 849272cac73d2..f7ac0b67d00da 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -431,6 +431,9 @@ class SrcSafetyAnalysis {
   }
 
   SrcState computeNext(const MCInst &Point, const SrcState &Cur) {
+if (BC.MIB->isCFI(Point))
+  return Cur;
+
 SrcStatePrinter P(BC);
 LLVM_DEBUG({
   dbgs() << "  SrcSafetyAnalysis::ComputeNext(";
@@ -670,6 +673,8 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
+  if (BC.MIB->isCFI(Inst))
+continue;
 
   // If there is a label before this instruction, it is possible that it
   // can be jumped-to, thus conservatively resetting S. As an exception,
@@ -947,6 +952,9 @@ class DstSafetyAnalysis {
   }
 
   DstState computeNext(const MCInst &Point, const DstState &Cur) {
+if (BC.MIB->isCFI(Point))
+  return Cur;
+
 DstStatePrinter P(BC);
 LLVM_DEBUG({
   dbgs() << "  DstSafetyAnalysis::ComputeNext(";
@@ -1123,6 +1131,8 @@ class CFGUnawareDstSafetyAnalysis : public 
DstSafetyAnalysis {
 DstState S = createUnsafeState();
 for (auto &I : llvm::reverse(BF.instrs())) {
   MCInst &Inst = I.second;
+  if (BC.MIB->isCFI(Inst))
+continue;
 
   // If Inst can change the control flow, we cannot be sure that the next
   // instruction (to be executed in analyzed program) is the one processed
@@ -1319,6 +1329,9 @@ void FunctionAnalysis::findUnsafeUses(
   });
 
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
+if (BC.MIB->isCFI(Inst))
+  return;
+
 const SrcState &S = Analysis->getStateBefore(Inst);
 
 // If non-empty state was never propagated from the entry basic block
@@ -1382,6 +1395,9 @@ void FunctionAnalysis::findUnsafeDefs(
   });
 
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
+if (BC.MIB->isCFI(Inst))
+  return;
+
 const DstState &S = Analysis->getStateAfter(Inst);
 
 if (auto Report = shouldReportAuthOracle(BC, Inst, S))
diff --git a/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s 
b/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
index fd55880921d06..07b61bea77e94 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s
@@ -329,6 +329,38 @@ auth_oracle:
 // PAUTH-EMPTY:
 // PAUTH-NEXT:   Attaching leakage info to: :  autia   x0, x1 
# DataflowDstSafetyAnalysis: dst-state
 
+// Gadget scanner should not crash on CFI instructions, including when 
debug-printing them.
+// Note that the particular debug output is not checked, but BOLT should be
+// compiled with assertions enabled to support -debug-only argument.
+
+.globl  cfi_inst_df
+.type   cfi_inst_df,@function
+cfi_inst_df:
+.cfi_startproc
+sub sp, sp, #16
+.cfi_def_cfa_offset 16
+add sp, sp, #16
+.cfi_def_cfa_offset 0
+ret
+.size   cfi_inst_df, .-cfi_inst_df
+.cfi_endproc
+
+.globl  cfi_inst_nocfg
+.type   cfi_inst_nocfg,@function
+cfi_inst_nocfg:
+.cfi_startproc
+sub sp, sp, #16
+.cfi_def_cfa_offset 16
+
+adr x0, 1f
+br  x0
+1:
+add sp, sp, #16
+.cfi_def_cfa_offset 0
+ret
+.size   cfi_inst_nocfg, .-cfi_inst_nocfg
+.cfi_endproc
+
 // CHECK-LABEL:Analyzing function main, AllocatorId = 1
 .globl  main
 .type   main,@function

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135661

>From b811de675aa768e8a7c944b7a25d137eb4b21985 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 14 Apr 2025 14:35:56 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: use more appropriate types (NFC)

* use more flexible `const ArrayRef` and `StringRef` types instead of
  `const std::vector &` and `const std::string &`, correspondingly,
  for function arguments
* return plain `const SrcState &` instead of `ErrorOr`
  from `SrcSafetyAnalysis::getStateBefore`, as absent state is not
  handled gracefully by any caller
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h |  8 +---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 39 ---
 2 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 6765e2aff414f..3e39b64e59e0f 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -12,7 +12,6 @@
 #include "bolt/Core/BinaryContext.h"
 #include "bolt/Core/BinaryFunction.h"
 #include "bolt/Passes/BinaryPasses.h"
-#include "llvm/ADT/SmallSet.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
@@ -199,9 +198,6 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
-class SrcSafetyAnalysis;
-struct SrcState;
-
 /// Description of a gadget kind that can be detected. Intended to be
 /// statically allocated to be attached to reports by reference.
 class GadgetKind {
@@ -210,7 +206,7 @@ class GadgetKind {
 public:
   GadgetKind(const char *Description) : Description(Description) {}
 
-  const StringRef getDescription() const { return Description; }
+  StringRef getDescription() const { return Description; }
 };
 
 /// Base report located at some instruction, without any additional 
information.
@@ -261,7 +257,7 @@ struct GadgetReport : public Report {
 /// Report with a free-form message attached.
 struct GenericReport : public Report {
   std::string Text;
-  GenericReport(MCInstReference Location, const std::string &Text)
+  GenericReport(MCInstReference Location, StringRef Text)
   : Report(Location), Text(Text) {}
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const override;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 4f601558dec4e..339673b600765 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -91,14 +91,14 @@ class TrackedRegisters {
   const std::vector Registers;
   std::vector RegToIndexMapping;
 
-  static size_t getMappingSize(const std::vector &RegsToTrack) {
+  static size_t getMappingSize(const ArrayRef RegsToTrack) {
 if (RegsToTrack.empty())
   return 0;
 return 1 + *llvm::max_element(RegsToTrack);
   }
 
 public:
-  TrackedRegisters(const std::vector &RegsToTrack)
+  TrackedRegisters(const ArrayRef RegsToTrack)
   : Registers(RegsToTrack),
 RegToIndexMapping(getMappingSize(RegsToTrack), NoIndex) {
 for (unsigned I = 0; I < RegsToTrack.size(); ++I)
@@ -234,7 +234,7 @@ struct SrcState {
 
 static void printLastInsts(
 raw_ostream &OS,
-const std::vector> &LastInstWritingReg) {
+const ArrayRef> LastInstWritingReg) {
   OS << "Insts: ";
   for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) {
 auto &Set = LastInstWritingReg[I];
@@ -295,7 +295,7 @@ void SrcStatePrinter::print(raw_ostream &OS, const SrcState 
&S) const {
 class SrcSafetyAnalysis {
 public:
   SrcSafetyAnalysis(BinaryFunction &BF,
-const std::vector &RegsToTrackInstsFor)
+const ArrayRef RegsToTrackInstsFor)
   : BC(BF.getBinaryContext()), NumRegs(BC.MRI->getNumRegs()),
 RegsToTrackInstsFor(RegsToTrackInstsFor) {}
 
@@ -303,11 +303,10 @@ class SrcSafetyAnalysis {
 
   static std::shared_ptr
   create(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId,
- const std::vector &RegsToTrackInstsFor);
+ const ArrayRef RegsToTrackInstsFor);
 
   virtual void run() = 0;
-  virtual ErrorOr
-  getStateBefore(const MCInst &Inst) const = 0;
+  virtual const SrcState &getStateBefore(const MCInst &Inst) const = 0;
 
 protected:
   BinaryContext &BC;
@@ -347,7 +346,7 @@ class SrcSafetyAnalysis {
   }
 
   BitVector getClobberedRegs(const MCInst &Point) const {
-BitVector Clobbered(NumRegs, false);
+BitVector Clobbered(NumRegs);
 // Assume a call can clobber all registers, including callee-saved
 // registers. There's a good chance that callee-saved registers will be
 // saved on the stack at some point during execution of the callee.
@@ -408,8 +407,7 @@ class SrcSafetyAnalysis {
 
   // FirstCheckerInst should belong to the same basic block, meaning
   // it was deterministically processed a few steps before this 
ins

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/135662

>From 1419bfe2de741c5832de0ceb1b1765ba7b92022d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 14 Apr 2025 15:08:54 +0300
Subject: [PATCH 1/3] [BOLT] Gadget scanner: refactor issue reporting

Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from
the base `Report` class. Instead, make `Report` always represent the
brief version of the report. When an issue is detected on the first run
of the analysis, return an optional request for extra details to attach
to the report on the second run.
---
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 102 ++---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 200 ++
 .../AArch64/gs-pauth-debug-output.s   |   8 +-
 3 files changed, 187 insertions(+), 123 deletions(-)

diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h 
b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 4c1bef3d2265f..3b6c1f6af94a0 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -219,11 +219,6 @@ struct Report {
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const = 0;
 
-  // The two methods below are called by Analysis::computeDetailedInfo when
-  // iterating over the reports.
-  virtual ArrayRef getAffectedRegisters() const { return {}; }
-  virtual void setOverwritingInstrs(ArrayRef Instrs) {}
-
   void printBasicInfo(raw_ostream &OS, const BinaryContext &BC,
   StringRef IssueKind) const;
 };
@@ -231,27 +226,11 @@ struct Report {
 struct GadgetReport : public Report {
   // The particular kind of gadget that is detected.
   const GadgetKind &Kind;
-  // The set of registers related to this gadget report (possibly empty).
-  SmallVector AffectedRegisters;
-  // The instructions that clobber the affected registers.
-  // There is no one-to-one correspondence with AffectedRegisters: for example,
-  // the same register can be overwritten by different instructions in 
different
-  // preceding basic blocks.
-  SmallVector OverwritingInstrs;
-
-  GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   MCPhysReg AffectedRegister)
-  : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
-
-  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
-  ArrayRef getAffectedRegisters() const override {
-return AffectedRegisters;
-  }
+  GadgetReport(const GadgetKind &Kind, MCInstReference Location)
+  : Report(Location), Kind(Kind) {}
 
-  void setOverwritingInstrs(ArrayRef Instrs) override {
-OverwritingInstrs.assign(Instrs.begin(), Instrs.end());
-  }
+  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 };
 
 /// Report with a free-form message attached.
@@ -263,8 +242,75 @@ struct GenericReport : public Report {
   const BinaryContext &BC) const override;
 };
 
+/// An information about an issue collected on the slower, detailed,
+/// run of an analysis.
+class ExtraInfo {
+public:
+  virtual void print(raw_ostream &OS, const MCInstReference Location) const = 
0;
+
+  virtual ~ExtraInfo() {}
+};
+
+class ClobberingInfo : public ExtraInfo {
+  SmallVector ClobberingInstrs;
+
+public:
+  ClobberingInfo(const ArrayRef Instrs)
+  : ClobberingInstrs(Instrs) {}
+
+  void print(raw_ostream &OS, const MCInstReference Location) const override;
+};
+
+/// A brief version of a report that can be further augmented with the details.
+///
+/// It is common for a particular type of gadget detector to be tied to some
+/// specific kind of analysis. If an issue is returned by that detector, it may
+/// be further augmented with the detailed info in an analysis-specific way,
+/// or just be left as-is (f.e. if a free-form warning was reported).
+template  struct BriefReport {
+  BriefReport(std::shared_ptr Issue,
+  const std::optional RequestedDetails)
+  : Issue(Issue), RequestedDetails(RequestedDetails) {}
+
+  std::shared_ptr Issue;
+  std::optional RequestedDetails;
+};
+
+/// A detailed version of a report.
+struct DetailedReport {
+  DetailedReport(std::shared_ptr Issue,
+ std::shared_ptr Details)
+  : Issue(Issue), Details(Details) {}
+
+  std::shared_ptr Issue;
+  std::shared_ptr Details;
+};
+
 struct FunctionAnalysisResult {
-  std::vector> Diagnostics;
+  std::vector Diagnostics;
+};
+
+/// A helper class storing per-function context to be instantiated by Analysis.
+class FunctionAnalysis {
+  BinaryContext &BC;
+  BinaryFunction &BF;
+  MCPlusBuilder::AllocatorIdTy AllocatorId;
+  FunctionAnalysisResult Result;
+
+  bool PacRetGadgetsOnly;
+
+  void findUnsafeUses(SmallVector> &Reports);
+  void augmentUnsafeUseReports(const ArrayRef> Reports);
+
+public:
+  FunctionAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy 
AllocatorId,
+ 

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] Extend common::AtomicDefaultMemOrderType enumeration (PR #136312)

2025-04-22 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

LGTM, thanks

https://github.com/llvm/llvm-project/pull/136312
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)

2025-04-22 Thread Ikhlas Ajbar via llvm-branch-commits

iajbar wrote:

LGTM.


https://github.com/llvm/llvm-project/pull/135657
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/136732

>From 156975528eefad5be199165fcfce6398f6a79ab8 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Fri, 18 Apr 2025 00:41:37 +
Subject: [PATCH 1/4] adding support for root descriptor

---
 llvm/include/llvm/BinaryFormat/DXContainer.h  | 25 
 .../BinaryFormat/DXContainerConstants.def | 15 +
 .../llvm/MC/DXContainerRootSignature.h|  2 +
 llvm/include/llvm/Object/DXContainer.h| 39 
 .../include/llvm/ObjectYAML/DXContainerYAML.h | 36 ++-
 llvm/lib/MC/DXContainerRootSignature.cpp  | 26 
 llvm/lib/ObjectYAML/DXContainerEmitter.cpp| 19 +-
 llvm/lib/ObjectYAML/DXContainerYAML.cpp   | 63 +--
 .../RootSignature-MultipleParameters.yaml | 20 --
 9 files changed, 233 insertions(+), 12 deletions(-)

diff --git a/llvm/include/llvm/BinaryFormat/DXContainer.h 
b/llvm/include/llvm/BinaryFormat/DXContainer.h
index 455657980bf40..d6e585c94fed1 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainer.h
+++ b/llvm/include/llvm/BinaryFormat/DXContainer.h
@@ -18,6 +18,7 @@
 #include "llvm/Support/SwapByteOrder.h"
 #include "llvm/TargetParser/Triple.h"
 
+#include 
 #include 
 
 namespace llvm {
@@ -158,6 +159,11 @@ enum class RootElementFlag : uint32_t {
 #include "DXContainerConstants.def"
 };
 
+#define ROOT_DESCRIPTOR_FLAG(Num, Val) Val = 1ull << Num,
+enum class RootDescriptorFlag : uint32_t {
+#include "DXContainerConstants.def"
+};
+
 #define ROOT_PARAMETER(Val, Enum) Enum = Val,
 enum class RootParameterType : uint32_t {
 #include "DXContainerConstants.def"
@@ -594,6 +600,25 @@ struct RootConstants {
 sys::swapByteOrder(Num32BitValues);
   }
 };
+struct RootDescriptor_V1_0 {
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+  void swapBytes() {
+sys::swapByteOrder(ShaderRegister);
+sys::swapByteOrder(RegisterSpace);
+  }
+};
+
+struct RootDescriptor_V1_1 {
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+  uint32_t Flags;
+  void swapBytes() {
+sys::swapByteOrder(ShaderRegister);
+sys::swapByteOrder(RegisterSpace);
+sys::swapByteOrder(Flags);
+  }
+};
 
 struct RootParameterHeader {
   uint32_t ParameterType;
diff --git a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def 
b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
index 590ded5e8c899..6840901460ced 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
+++ b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
@@ -72,9 +72,24 @@ ROOT_ELEMENT_FLAG(11, SamplerHeapDirectlyIndexed)
 #undef ROOT_ELEMENT_FLAG
 #endif // ROOT_ELEMENT_FLAG
 
+ 
+// ROOT_ELEMENT_FLAG(bit offset for the flag, name).
+#ifdef ROOT_DESCRIPTOR_FLAG
+
+ROOT_DESCRIPTOR_FLAG(0, NONE)
+ROOT_DESCRIPTOR_FLAG(2, DATA_VOLATILE)
+ROOT_DESCRIPTOR_FLAG(4, DATA_STATIC_WHILE_SET_AT_EXECUTE)
+ROOT_DESCRIPTOR_FLAG(8, DATA_STATIC)
+#undef ROOT_DESCRIPTOR_FLAG
+#endif // ROOT_DESCRIPTOR_FLAG
+
+
 #ifdef ROOT_PARAMETER
 
 ROOT_PARAMETER(1, Constants32Bit)
+ROOT_PARAMETER(2, CBV)
+ROOT_PARAMETER(3, SRV)
+ROOT_PARAMETER(4, UAV)
 #undef ROOT_PARAMETER
 #endif // ROOT_PARAMETER
 
diff --git a/llvm/include/llvm/MC/DXContainerRootSignature.h 
b/llvm/include/llvm/MC/DXContainerRootSignature.h
index 6d3329a2c6ce9..5e3bc9b873f82 100644
--- a/llvm/include/llvm/MC/DXContainerRootSignature.h
+++ b/llvm/include/llvm/MC/DXContainerRootSignature.h
@@ -19,6 +19,8 @@ struct RootParameter {
   dxbc::RootParameterHeader Header;
   union {
 dxbc::RootConstants Constants;
+dxbc::RootDescriptor_V1_0 Descriptor_V10;
+dxbc::RootDescriptor_V1_1 Descriptor_V11;
   };
 };
 struct RootSignatureDesc {
diff --git a/llvm/include/llvm/Object/DXContainer.h 
b/llvm/include/llvm/Object/DXContainer.h
index e8287ce078365..7812906700fe3 100644
--- a/llvm/include/llvm/Object/DXContainer.h
+++ b/llvm/include/llvm/Object/DXContainer.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_OBJECT_DXCONTAINER_H
 #define LLVM_OBJECT_DXCONTAINER_H
 
+#include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/BinaryFormat/DXContainer.h"
@@ -149,6 +150,36 @@ struct RootConstantView : RootParameterView {
   }
 };
 
+struct RootDescriptorView_V1_0 : RootParameterView {
+  static bool classof(const RootParameterView *V) {
+return (V->Header.ParameterType ==
+llvm::to_underlying(dxbc::RootParameterType::CBV) ||
+V->Header.ParameterType ==
+llvm::to_underlying(dxbc::RootParameterType::SRV) ||
+V->Header.ParameterType ==
+llvm::to_underlying(dxbc::RootParameterType::UAV));
+  }
+
+  llvm::Expected read() {
+return readParameter();
+  }
+};
+
+struct RootDescriptorView_V1_1 : RootParameterView {
+  static bool classof(const RootParameterView *V) {
+return (V->Header.ParameterType ==
+llvm::to_underlying(dxbc::RootParameterType::CBV) ||
+V->Header.Par

[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits

https://github.com/joaosaffran edited 
https://github.com/llvm/llvm-project/pull/136732
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits

https://github.com/joaosaffran edited 
https://github.com/llvm/llvm-project/pull/136732
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-backend-directx

@llvm/pr-subscribers-mc

Author: None (joaosaffran)


Changes



---

Patch is 29.79 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/136732.diff


13 Files Affected:

- (modified) llvm/include/llvm/BinaryFormat/DXContainer.h (+24) 
- (modified) llvm/include/llvm/BinaryFormat/DXContainerConstants.def (+15) 
- (modified) llvm/include/llvm/MC/DXContainerRootSignature.h (+2) 
- (modified) llvm/include/llvm/Object/DXContainer.h (+44-3) 
- (modified) llvm/include/llvm/ObjectYAML/DXContainerYAML.h (+32-2) 
- (modified) llvm/lib/MC/DXContainerRootSignature.cpp (+25) 
- (modified) llvm/lib/ObjectYAML/DXContainerEmitter.cpp (+17) 
- (modified) llvm/lib/ObjectYAML/DXContainerYAML.cpp (+48-4) 
- (added) llvm/test/ObjectYAML/DXContainer/RootSignature-Descriptor1.0.yaml 
(+45) 
- (added) llvm/test/ObjectYAML/DXContainer/RootSignature-Descriptor1.1.yaml 
(+47) 
- (modified) 
llvm/test/ObjectYAML/DXContainer/RootSignature-MultipleParameters.yaml (+16-4) 
- (modified) llvm/unittests/Object/DXContainerTest.cpp (+91) 
- (modified) llvm/unittests/ObjectYAML/DXContainerYAMLTest.cpp (+103) 


``diff
diff --git a/llvm/include/llvm/BinaryFormat/DXContainer.h 
b/llvm/include/llvm/BinaryFormat/DXContainer.h
index 455657980bf40..2e9b3089a6da5 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainer.h
+++ b/llvm/include/llvm/BinaryFormat/DXContainer.h
@@ -158,6 +158,11 @@ enum class RootElementFlag : uint32_t {
 #include "DXContainerConstants.def"
 };
 
+#define ROOT_DESCRIPTOR_FLAG(Num, Val) Val = 1ull << Num,
+enum class RootDescriptorFlag : uint32_t {
+#include "DXContainerConstants.def"
+};
+
 #define ROOT_PARAMETER(Val, Enum) Enum = Val,
 enum class RootParameterType : uint32_t {
 #include "DXContainerConstants.def"
@@ -594,6 +599,25 @@ struct RootConstants {
 sys::swapByteOrder(Num32BitValues);
   }
 };
+struct RootDescriptor_V1_0 {
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+  void swapBytes() {
+sys::swapByteOrder(ShaderRegister);
+sys::swapByteOrder(RegisterSpace);
+  }
+};
+
+struct RootDescriptor_V1_1 {
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+  uint32_t Flags;
+  void swapBytes() {
+sys::swapByteOrder(ShaderRegister);
+sys::swapByteOrder(RegisterSpace);
+sys::swapByteOrder(Flags);
+  }
+};
 
 struct RootParameterHeader {
   uint32_t ParameterType;
diff --git a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def 
b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
index 590ded5e8c899..bd9bd760547dc 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
+++ b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
@@ -72,9 +72,24 @@ ROOT_ELEMENT_FLAG(11, SamplerHeapDirectlyIndexed)
 #undef ROOT_ELEMENT_FLAG
 #endif // ROOT_ELEMENT_FLAG
 
+ 
+// ROOT_ELEMENT_FLAG(bit offset for the flag, name).
+#ifdef ROOT_DESCRIPTOR_FLAG
+
+ROOT_DESCRIPTOR_FLAG(0, NONE)
+ROOT_DESCRIPTOR_FLAG(1, DATA_VOLATILE)
+ROOT_DESCRIPTOR_FLAG(2, DATA_STATIC_WHILE_SET_AT_EXECUTE)
+ROOT_DESCRIPTOR_FLAG(3, DATA_STATIC)
+#undef ROOT_DESCRIPTOR_FLAG
+#endif // ROOT_DESCRIPTOR_FLAG
+
+
 #ifdef ROOT_PARAMETER
 
 ROOT_PARAMETER(1, Constants32Bit)
+ROOT_PARAMETER(2, CBV)
+ROOT_PARAMETER(3, SRV)
+ROOT_PARAMETER(4, UAV)
 #undef ROOT_PARAMETER
 #endif // ROOT_PARAMETER
 
diff --git a/llvm/include/llvm/MC/DXContainerRootSignature.h 
b/llvm/include/llvm/MC/DXContainerRootSignature.h
index 6d3329a2c6ce9..5e3bc9b873f82 100644
--- a/llvm/include/llvm/MC/DXContainerRootSignature.h
+++ b/llvm/include/llvm/MC/DXContainerRootSignature.h
@@ -19,6 +19,8 @@ struct RootParameter {
   dxbc::RootParameterHeader Header;
   union {
 dxbc::RootConstants Constants;
+dxbc::RootDescriptor_V1_0 Descriptor_V10;
+dxbc::RootDescriptor_V1_1 Descriptor_V11;
   };
 };
 struct RootSignatureDesc {
diff --git a/llvm/include/llvm/Object/DXContainer.h 
b/llvm/include/llvm/Object/DXContainer.h
index e8287ce078365..7b64c8ee15ae6 100644
--- a/llvm/include/llvm/Object/DXContainer.h
+++ b/llvm/include/llvm/Object/DXContainer.h
@@ -120,9 +120,10 @@ template  struct ViewArray {
 namespace DirectX {
 struct RootParameterView {
   const dxbc::RootParameterHeader &Header;
+  uint32_t Version;
   StringRef ParamData;
-  RootParameterView(const dxbc::RootParameterHeader &H, StringRef P)
-  : Header(H), ParamData(P) {}
+  RootParameterView(uint32_t V, const dxbc::RootParameterHeader &H, StringRef 
P)
+  : Header(H), Version(V), ParamData(P) {}
 
   template  Expected readParameter() {
 T Struct;
@@ -149,6 +150,38 @@ struct RootConstantView : RootParameterView {
   }
 };
 
+struct RootDescriptorView_V1_0 : RootParameterView {
+  static bool classof(const RootParameterView *V) {
+return (V->Version == 1 &&
+(V->Header.ParameterType ==
+ llvm::to_underlying(dxbc::RootParameterType::CBV) ||
+ V->Header.ParameterType ==
+ llvm::to_underlying(dxbc::Roo

[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits

https://github.com/joaosaffran ready_for_review 
https://github.com/llvm/llvm-project/pull/136732
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DirectX] Adding support for Root Descriptor in Obj2yaml/Yaml2Obj (PR #136732)

2025-04-22 Thread via llvm-branch-commits


@@ -73,24 +75,50 @@ struct ShaderHash {
   std::vector Digest;
 };
 
-#define ROOT_ELEMENT_FLAG(Num, Val) bool Val = false;
-
 struct RootConstantsYaml {
   uint32_t ShaderRegister;
   uint32_t RegisterSpace;
   uint32_t Num32BitValues;
 };
 
+#define ROOT_DESCRIPTOR_FLAG(Num, Val) bool Val = false;
+struct RootDescriptorYaml {
+  RootDescriptorYaml() = default;
+
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+
+  uint32_t getEncodedFlags();
+
+#include "llvm/BinaryFormat/DXContainerConstants.def"
+};
+
 struct RootParameterYamlDesc {
   uint32_t Type;
   uint32_t Visibility;
   uint32_t Offset;
+  RootParameterYamlDesc(){};
+  RootParameterYamlDesc(uint32_t T) : Type(T) {

joaosaffran wrote:

This constructor is being added to allow proper initialization of Root 
Parameter data in the union below.

https://github.com/llvm/llvm-project/pull/136732
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -2873,14 +2890,33 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!HasExistingGeneralizedTypeMD(F))

Prabhuk wrote:

This was my previous implementation. I noticed that the 
CreateFunctionTypeMetadataForIcall was called twice and changed the 
implementation.

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] LiveRangeShrink: Early exit when encountering a code motion barrier. (PR #136806)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-regalloc

Author: Peter Collingbourne (pcc)


Changes

Without this, we end up with quadratic behavior affecting functions with
large numbers of code motion barriers, such as CFI jump tables.


---
Full diff: https://github.com/llvm/llvm-project/pull/136806.diff


1 Files Affected:

- (modified) llvm/lib/CodeGen/LiveRangeShrink.cpp (+12-3) 


``diff
diff --git a/llvm/lib/CodeGen/LiveRangeShrink.cpp 
b/llvm/lib/CodeGen/LiveRangeShrink.cpp
index dae7e14e54aae..46e837c086c5d 100644
--- a/llvm/lib/CodeGen/LiveRangeShrink.cpp
+++ b/llvm/lib/CodeGen/LiveRangeShrink.cpp
@@ -95,14 +95,24 @@ static MachineInstr *FindDominatedInstruction(MachineInstr 
&New,
   return Old;
 }
 
+static bool isCodeMotionBarrier(MachineInstr &MI) {
+  return MI.hasUnmodeledSideEffects() && !MI.isPseudoProbe();
+}
+
 /// Builds Instruction to its dominating order number map \p M by traversing
 /// from instruction \p Start.
 static void BuildInstOrderMap(MachineBasicBlock::iterator Start,
   InstOrderMap &M) {
   M.clear();
   unsigned i = 0;
-  for (MachineInstr &I : make_range(Start, Start->getParent()->end()))
+  bool SawStore = false;
+  for (MachineInstr &I : make_range(Start, Start->getParent()->end())) {
+if (I.mayStore())
+  SawStore = true;
+if (!I.isSafeToMove(SawStore) && isCodeMotionBarrier(I))
+  break;
 M[&I] = i++;
+  }
 }
 
 bool LiveRangeShrink::runOnMachineFunction(MachineFunction &MF) {
@@ -166,8 +176,7 @@ bool LiveRangeShrink::runOnMachineFunction(MachineFunction 
&MF) {
 // If MI has side effects, it should become a barrier for code motion.
 // IOM is rebuild from the next instruction to prevent later
 // instructions from being moved before this MI.
-if (MI.hasUnmodeledSideEffects() && !MI.isPseudoProbe() &&
-Next != MBB.end()) {
+if (isCodeMotionBarrier(MI) && Next != MBB.end()) {
   BuildInstOrderMap(Next, IOM);
   SawStore = false;
 }

``




https://github.com/llvm/llvm-project/pull/136806
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] LiveRangeShrink: Early exit when encountering a code motion barrier. (PR #136806)

2025-04-22 Thread Peter Collingbourne via llvm-branch-commits

https://github.com/pcc created https://github.com/llvm/llvm-project/pull/136806

Without this, we end up with quadratic behavior affecting functions with
large numbers of code motion barriers, such as CFI jump tables.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -2873,14 +2890,33 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!HasExistingGeneralizedTypeMD(F))
+F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
 if (auto CrossDsoTypeId = CreateCrossDsoCfiTypeId(MD))
   F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+void CodeGenModule::CreateCalleeTypeMetadataForIcall(const QualType &QT,
+ llvm::CallBase *CB) {
+  // Only if needed for call graph section and only for indirect calls.
+  if (!CodeGenOpts.CallGraphSection || !CB->isIndirectCall())
+return;
+
+  llvm::Metadata *TypeIdMD = CreateMetadataIdentifierGeneralized(QT);
+  llvm::MDTuple *TypeTuple = llvm::MDTuple::get(
+  getLLVMContext(), {llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(
+ llvm::Type::getInt64Ty(getLLVMContext()), 0)),
+ TypeIdMD});
+  SmallVector TypeIdMDList;
+  TypeIdMDList.push_back(TypeTuple);
+  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), TypeIdMDList);

Prabhuk wrote:

That worked! Thank you.

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -2813,7 +2814,17 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const 
NamedDecl *ND) {
 
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  bool EmittedMDIdGeneralized = false;
+  if (CodeGenOpts.CallGraphSection &&
+  (!F->hasLocalLinkage() ||
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/true,
+/*IgnoreAssumeLikeCalls=*/true,
+/*IgnoreLLVMUsed=*/false))) {
+F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+EmittedMDIdGeneralized = true;

Prabhuk wrote:

This code has been refactored. Marking this as resolved.

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -1622,6 +1622,9 @@ class CodeGenModule : public CodeGenTypeCache {
   void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   llvm::Function *F);
 
+  /// Create and attach type metadata to the given call.
+  void CreateCalleeTypeMetadataForIcall(const QualType &QT, llvm::CallBase 
*CB);

Prabhuk wrote:

Ah the capitalization! nevermind the previous comment. I'll fix the API names.

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/117036


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/117036


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 82f8060383d42a33ef1a2786ca72fe92cbdfa2d8 
11c09133406aacae0579f3454cb306c0f316007b --extensions c,cpp,h -- 
clang/test/CodeGen/call-graph-section-templates.cpp 
clang/test/CodeGen/call-graph-section-virtual-methods.cpp 
clang/test/CodeGen/call-graph-section.c 
clang/test/CodeGen/call-graph-section.cpp clang/lib/CodeGen/CGCall.cpp 
clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 43cd240557..a28e1700bd 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2909,7 +2909,7 @@ void 
CodeGenModule::CreateCalleeTypeMetadataForIcall(const QualType &QT,
   getLLVMContext(), {llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(
  llvm::Type::getInt64Ty(getLLVMContext()), 0)),
  TypeIdMD});
-  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), { TypeTuple });
+  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), {TypeTuple});
   CB->setMetadata(llvm::LLVMContext::MD_callee_type, MDN);
 }
 

``




https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [clang-format] Correctly annotate kw_operator in using decls (#136545) (PR #136808)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-format

Author: None (llvmbot)


Changes

Backport 037657de7e5ccd4a37054829874a209b82fb8be7

Requested by: @owenca

---
Full diff: https://github.com/llvm/llvm-project/pull/136808.diff


2 Files Affected:

- (modified) clang/lib/Format/TokenAnnotator.cpp (+4-2) 
- (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+5) 


``diff
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index 44580d8624684..11b941c5a0411 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -3961,8 +3961,10 @@ void 
TokenAnnotator::calculateFormattingInformation(AnnotatedLine &Line) const {
   FormatToken *AfterLastAttribute = nullptr;
   FormatToken *ClosingParen = nullptr;
 
-  for (auto *Tok = FirstNonComment ? FirstNonComment->Next : nullptr; Tok;
-   Tok = Tok->Next) {
+  for (auto *Tok = FirstNonComment && FirstNonComment->isNot(tok::kw_using)
+   ? FirstNonComment->Next
+   : nullptr;
+   Tok; Tok = Tok->Next) {
 if (Tok->is(TT_StartOfName))
   SeenName = true;
 if (Tok->Previous->EndsCppAttributeGroup)
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp 
b/clang/unittests/Format/TokenAnnotatorTest.cpp
index b7b8a21b726b6..757db66c3e298 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1073,6 +1073,11 @@ TEST_F(TokenAnnotatorTest, 
UnderstandsOverloadedOperators) {
   ASSERT_EQ(Tokens.size(), 11u) << Tokens;
   EXPECT_TOKEN(Tokens[3], tok::identifier, TT_FunctionDeclarationName);
   EXPECT_TOKEN(Tokens[7], tok::l_paren, TT_OverloadedOperatorLParen);
+
+  Tokens = annotate("using std::operator==;");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  // Not TT_FunctionDeclarationName.
+  EXPECT_TOKEN(Tokens[3], tok::kw_operator, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, OverloadedOperatorInTemplate) {

``




https://github.com/llvm/llvm-project/pull/136808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [clang-format] Correctly annotate kw_operator in using decls (#136545) (PR #136808)

2025-04-22 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/136808

Backport 037657de7e5ccd4a37054829874a209b82fb8be7

Requested by: @owenca

>From 38c4df051750c677518f61ae9b072a0b015be2b9 Mon Sep 17 00:00:00 2001
From: Owen Pan 
Date: Tue, 22 Apr 2025 21:08:56 -0700
Subject: [PATCH] [clang-format] Correctly annotate kw_operator in using decls
 (#136545)

Fix #136541

(cherry picked from commit 037657de7e5ccd4a37054829874a209b82fb8be7)
---
 clang/lib/Format/TokenAnnotator.cpp   | 6 --
 clang/unittests/Format/TokenAnnotatorTest.cpp | 5 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index 44580d8624684..11b941c5a0411 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -3961,8 +3961,10 @@ void 
TokenAnnotator::calculateFormattingInformation(AnnotatedLine &Line) const {
   FormatToken *AfterLastAttribute = nullptr;
   FormatToken *ClosingParen = nullptr;
 
-  for (auto *Tok = FirstNonComment ? FirstNonComment->Next : nullptr; Tok;
-   Tok = Tok->Next) {
+  for (auto *Tok = FirstNonComment && FirstNonComment->isNot(tok::kw_using)
+   ? FirstNonComment->Next
+   : nullptr;
+   Tok; Tok = Tok->Next) {
 if (Tok->is(TT_StartOfName))
   SeenName = true;
 if (Tok->Previous->EndsCppAttributeGroup)
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp 
b/clang/unittests/Format/TokenAnnotatorTest.cpp
index b7b8a21b726b6..757db66c3e298 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1073,6 +1073,11 @@ TEST_F(TokenAnnotatorTest, 
UnderstandsOverloadedOperators) {
   ASSERT_EQ(Tokens.size(), 11u) << Tokens;
   EXPECT_TOKEN(Tokens[3], tok::identifier, TT_FunctionDeclarationName);
   EXPECT_TOKEN(Tokens[7], tok::l_paren, TT_OverloadedOperatorLParen);
+
+  Tokens = annotate("using std::operator==;");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  // Not TT_FunctionDeclarationName.
+  EXPECT_TOKEN(Tokens[3], tok::kw_operator, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, OverloadedOperatorInTemplate) {

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [clang-format] Correctly annotate kw_operator in using decls (#136545) (PR #136808)

2025-04-22 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/136808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [clang-format] Correctly annotate kw_operator in using decls (#136545) (PR #136808)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:

@HazardyKnusperkeks What do you think about merging this PR to the release 
branch?

https://github.com/llvm/llvm-project/pull/136808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [llvm] Introduce callee_type metadata (PR #87573)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87573

>From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 01/10] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f58..ff1586d2fa8ab 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
 return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);

>From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 02/10] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8ab..5635a87d2358a 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
   (!F->hasLocalLinkage() ||
-   F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-/* IgnoreAssumeLikeCalls */ true,
-/* IgnoreLLVMUsed */ false))) {
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+/*IgnoreAssumeLikeCalls=*/ true,
+/*IgnoreLLVMUsed=*/ false))) {
 F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 EmittedMDIdGeneralized = true;
   }

>From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 03/10] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff83..45033ced1d834 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
   if (const FunctionDecl *FD = dyn_cast_or_null(TargetDecl)) 
{
 // Type id metadata is set only for C/C++ contexts.
-if (dyn_cast(FD) || dyn_cast(FD) ||
-dyn_cast(FD)) {
+if (isa(FD) || isa(FD) ||
+isa(FD)) {
   CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
 }
   }

>From 24882b15939b781bcf28d87fdf4f6e8834b6cfde Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Tue, 10 Dec 2024 14:54:27 -0800
Subject: [PATCH 04/10] Address review comments. Break llvm and clang patches.

Created using spr 1.3.6-beta.1
---
 llvm/lib/IR/Verifier.cpp  | 7 +++
 llvm/test/Verifier/operand-bundles.ll | 4 ++--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 0ad7ba555bfad..b72672e7b8e56 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -3707,10 +3707,9 @@ void Verifier::visitCallBase(CallBase &Call) {
 if (Intrinsic::ID ID = (Intrinsic::ID)F->getIntrinsicID())
   visitIntrinsicCall(ID, Call);
 
-  // Verify that a callsite has at most one "deopt", at most one "funclet", at
-  // most one "gc-transition", at most one "cfguardtarget", at most one "type",
-  // at most one "preallocated" operand bundle, and at most one "ptrauth"
-  // operand bundle.
+  // Verify that a callsite has at most one operand bundle for each of the
+  // following: "deopt", "funclet", "gc-transition", "cfguardtarget", "type",
+  // "preallocated", and "ptrauth".
   bool FoundDeoptBundle = false, FoundFuncletBundle = false,
FoundGCTransitionBundle = false, FoundCFGuardTargetBundle = false,
FoundPreallocatedBundle = false, FoundGCLiveBundle = false,
diff --git a/llvm/test/Verifier/operand-bundles.ll 
b/llvm/t

[llvm-branch-commits] [llvm] [llvm] Add option to emit `callgraph` section (PR #87574)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574

>From 1d7ee612e408ee7e64e984eb08e6d7089a435d09 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Sun, 2 Feb 2025 00:58:49 +
Subject: [PATCH 1/4] Simplify MIR test.

Created using spr 1.3.6-beta.1
---
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 21 ++-
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir 
b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
index 5ab797bfcc18f..a99ee50a608fb 100644
--- a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
+++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
@@ -8,11 +8,6 @@
 # CHECK-NEXT: 123456789 }
 
 --- |
-  ; ModuleID = 'test.ll'
-  source_filename = "test.ll"
-  target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-  target triple = "x86_64-unknown-linux-gnu"
-  
   define dso_local void @foo(i8 signext %a) {
   entry:
 ret void
@@ -21,10 +16,10 @@
   define dso_local i32 @main() {
   entry:
 %retval = alloca i32, align 4
-%fp = alloca void (i8)*, align 8
-store i32 0, i32* %retval, align 4
-store void (i8)* @foo, void (i8)** %fp, align 8
-%0 = load void (i8)*, void (i8)** %fp, align 8
+%fp = alloca ptr, align 8
+store i32 0, ptr %retval, align 4
+store ptr @foo, ptr %fp, align 8
+%0 = load ptr, ptr %fp, align 8
 call void %0(i8 signext 97)
 ret i32 0
   }
@@ -42,12 +37,8 @@ body: |
 name:main
 tracksRegLiveness: true
 stack:
-  - { id: 0, name: retval, type: default, offset: 0, size: 4, alignment: 4, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
-  - { id: 1, name: fp, type: default, offset: 0, size: 8, alignment: 8, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
+  - { id: 0, name: retval, size: 4, alignment: 4 }
+  - { id: 1, name: fp, size: 8, alignment: 8 }
 callSites:
   - { bb: 0, offset: 6, fwdArgRegs: [], typeId: 
 123456789 }

>From 86e2c9dc37170499252ed50c6bbef2931e106fbb Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Thu, 13 Mar 2025 01:03:40 +
Subject: [PATCH 2/4] Add requested tests part 1.

Created using spr 1.3.6-beta.1
---
 ...te-info-ambiguous-indirect-call-typeid.mir | 145 ++
 .../call-site-info-direct-calls-typeid.mir| 145 ++
 2 files changed, 290 insertions(+)
 create mode 100644 
llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
 create mode 100644 
llvm/test/CodeGen/MIR/X86/call-site-info-direct-calls-typeid.mir

diff --git 
a/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir 
b/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
new file mode 100644
index 0..9d1b099cc9093
--- /dev/null
+++ 
b/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
@@ -0,0 +1,145 @@
+# Test MIR printer and parser for type id field in callSites. It is used
+# for propogating call site type identifiers to emit in the call graph section.
+
+# RUN: llc --call-graph-section %s -run-pass=none -o - | FileCheck %s
+# CHECK: name: main
+# CHECK: callSites:
+# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: []
+# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+# CHECK-NEXT: 1234567890 }
+
+--- |  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef i32 @_Z3addii(i32 noundef %a, i32 noundef %b) #0 
!type !6 !type !6 {
+  entry:
+%a.addr = alloca i32, align 4
+%b.addr = alloca i32, align 4
+store i32 %a, ptr %a.addr, align 4
+store i32 %b, ptr %b.addr, align 4
+%0 = load i32, ptr %a.addr, align 4
+%1 = load i32, ptr %b.addr, align 4
+%add = add nsw i32 %0, %1
+ret i32 %add
+  }
+  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef i32 @_Z8multiplyii(i32 noundef %a, i32 noundef %b) 
#0 !type !6 !type !6 {
+  entry:
+%a.addr = alloca i32, align 4
+%b.addr = alloca i32, align 4
+store i32 %a, ptr %a.addr, align 4
+store i32 %b, ptr %b.addr, align 4
+%0 = load i32, ptr %a.addr, align 4
+%1 = load i32, ptr %b.addr, align 4
+%mul = mul nsw i32 %0, %1
+ret i32 %mul
+  }
+  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef ptr @_Z13get_operationb(i1 noundef zeroext 
%is_addition) #0 !type !7 !type !7 {
+  entry:
+%is_addition.addr = alloca i8, align 1
+%storedv = zext i1 %is_addition to i8
+store i8 %storedv, ptr %is_addition.addr, align 1
+%0 = load i8, ptr %is_addition.addr, align 1
+%loadedv = trunc i8 %0 to i1
+br i1 %loade

[llvm-branch-commits] [llvm] [llvm] Add option to emit `callgraph` section (PR #87574)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574

>From 1d7ee612e408ee7e64e984eb08e6d7089a435d09 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Sun, 2 Feb 2025 00:58:49 +
Subject: [PATCH 1/4] Simplify MIR test.

Created using spr 1.3.6-beta.1
---
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 21 ++-
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir 
b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
index 5ab797bfcc18f..a99ee50a608fb 100644
--- a/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
+++ b/llvm/test/CodeGen/MIR/X86/call-site-info-typeid.mir
@@ -8,11 +8,6 @@
 # CHECK-NEXT: 123456789 }
 
 --- |
-  ; ModuleID = 'test.ll'
-  source_filename = "test.ll"
-  target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-  target triple = "x86_64-unknown-linux-gnu"
-  
   define dso_local void @foo(i8 signext %a) {
   entry:
 ret void
@@ -21,10 +16,10 @@
   define dso_local i32 @main() {
   entry:
 %retval = alloca i32, align 4
-%fp = alloca void (i8)*, align 8
-store i32 0, i32* %retval, align 4
-store void (i8)* @foo, void (i8)** %fp, align 8
-%0 = load void (i8)*, void (i8)** %fp, align 8
+%fp = alloca ptr, align 8
+store i32 0, ptr %retval, align 4
+store ptr @foo, ptr %fp, align 8
+%0 = load ptr, ptr %fp, align 8
 call void %0(i8 signext 97)
 ret i32 0
   }
@@ -42,12 +37,8 @@ body: |
 name:main
 tracksRegLiveness: true
 stack:
-  - { id: 0, name: retval, type: default, offset: 0, size: 4, alignment: 4, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
-  - { id: 1, name: fp, type: default, offset: 0, size: 8, alignment: 8, 
-  stack-id: default, callee-saved-register: '', callee-saved-restored: 
true, 
-  debug-info-variable: '', debug-info-expression: '', debug-info-location: 
'' }
+  - { id: 0, name: retval, size: 4, alignment: 4 }
+  - { id: 1, name: fp, size: 8, alignment: 8 }
 callSites:
   - { bb: 0, offset: 6, fwdArgRegs: [], typeId: 
 123456789 }

>From 86e2c9dc37170499252ed50c6bbef2931e106fbb Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Thu, 13 Mar 2025 01:03:40 +
Subject: [PATCH 2/4] Add requested tests part 1.

Created using spr 1.3.6-beta.1
---
 ...te-info-ambiguous-indirect-call-typeid.mir | 145 ++
 .../call-site-info-direct-calls-typeid.mir| 145 ++
 2 files changed, 290 insertions(+)
 create mode 100644 
llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
 create mode 100644 
llvm/test/CodeGen/MIR/X86/call-site-info-direct-calls-typeid.mir

diff --git 
a/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir 
b/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
new file mode 100644
index 0..9d1b099cc9093
--- /dev/null
+++ 
b/llvm/test/CodeGen/MIR/X86/call-site-info-ambiguous-indirect-call-typeid.mir
@@ -0,0 +1,145 @@
+# Test MIR printer and parser for type id field in callSites. It is used
+# for propogating call site type identifiers to emit in the call graph section.
+
+# RUN: llc --call-graph-section %s -run-pass=none -o - | FileCheck %s
+# CHECK: name: main
+# CHECK: callSites:
+# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: []
+# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
+# CHECK-NEXT: 1234567890 }
+
+--- |  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef i32 @_Z3addii(i32 noundef %a, i32 noundef %b) #0 
!type !6 !type !6 {
+  entry:
+%a.addr = alloca i32, align 4
+%b.addr = alloca i32, align 4
+store i32 %a, ptr %a.addr, align 4
+store i32 %b, ptr %b.addr, align 4
+%0 = load i32, ptr %a.addr, align 4
+%1 = load i32, ptr %b.addr, align 4
+%add = add nsw i32 %0, %1
+ret i32 %add
+  }
+  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef i32 @_Z8multiplyii(i32 noundef %a, i32 noundef %b) 
#0 !type !6 !type !6 {
+  entry:
+%a.addr = alloca i32, align 4
+%b.addr = alloca i32, align 4
+store i32 %a, ptr %a.addr, align 4
+store i32 %b, ptr %b.addr, align 4
+%0 = load i32, ptr %a.addr, align 4
+%1 = load i32, ptr %b.addr, align 4
+%mul = mul nsw i32 %0, %1
+ret i32 %mul
+  }
+  
+  ; Function Attrs: mustprogress noinline nounwind optnone uwtable
+  define dso_local noundef ptr @_Z13get_operationb(i1 noundef zeroext 
%is_addition) #0 !type !7 !type !7 {
+  entry:
+%is_addition.addr = alloca i8, align 1
+%storedv = zext i1 %is_addition to i8
+store i8 %storedv, ptr %is_addition.addr, align 1
+%0 = load i8, ptr %is_addition.addr, align 1
+%loadedv = trunc i8 %0 to i1
+br i1 %loade

[llvm-branch-commits] [llvm] [llvm] Extract and propagate indirect call type id (PR #87575)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87575

>From 1a8d810d352fbe84c0521c7614689b60ade693c8 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Tue, 19 Nov 2024 15:25:34 -0800
Subject: [PATCH 1/5] Fixed the tests and addressed most of the review
 comments.

Created using spr 1.3.6-beta.1
---
 llvm/include/llvm/CodeGen/MachineFunction.h   | 15 +++--
 .../CodeGen/AArch64/call-site-info-typeid.ll  | 28 +++--
 .../test/CodeGen/ARM/call-site-info-typeid.ll | 28 +++--
 .../CodeGen/MIR/X86/call-site-info-typeid.ll  | 58 ---
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 13 ++---
 .../CodeGen/Mips/call-site-info-typeid.ll | 28 +++--
 .../test/CodeGen/X86/call-site-info-typeid.ll | 28 +++--
 7 files changed, 71 insertions(+), 127 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h 
b/llvm/include/llvm/CodeGen/MachineFunction.h
index bb0b87a3a04a3..44633df38a651 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -493,7 +493,7 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
 /// Callee type id.
 ConstantInt *TypeId = nullptr;
 
-CallSiteInfo() {}
+CallSiteInfo() = default;
 
 /// Extracts the numeric type id from the CallBase's type operand bundle,
 /// and sets TypeId. This is used as type id for the indirect call in the
@@ -503,12 +503,11 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   if (!CB.isIndirectCall())
 return;
 
-  auto Opt = CB.getOperandBundle(LLVMContext::OB_type);
-  if (!Opt.has_value()) {
-errs() << "warning: cannot find indirect call type operand bundle for  
"
-  "call graph section\n";
+  std::optional Opt =
+  CB.getOperandBundle(LLVMContext::OB_type);
+  // Return if the operand bundle for call graph section cannot be found.
+  if (!Opt.has_value())
 return;
-  }
 
   // Get generalized type id string
   auto OB = Opt.value();
@@ -520,9 +519,9 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
  "invalid type identifier");
 
   // Compute numeric type id from generalized type id string
-  uint64_t TypeIdVal = llvm::MD5Hash(TypeIdStr->getString());
+  uint64_t TypeIdVal = MD5Hash(TypeIdStr->getString());
   IntegerType *Int64Ty = Type::getInt64Ty(CB.getContext());
-  TypeId = llvm::ConstantInt::get(Int64Ty, TypeIdVal, /*IsSigned=*/false);
+  TypeId = ConstantInt::get(Int64Ty, TypeIdVal, /*IsSigned=*/false);
 }
   };
 
diff --git a/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll 
b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
index f0a6b44755c5c..f3b98c2c7a395 100644
--- a/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
+++ b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
@@ -1,14 +1,9 @@
-; Tests that call site type ids can be extracted and set from type operand
-; bundles.
+;; Tests that call site type ids can be extracted and set from type operand
+;; bundles.
 
-; Verify the exact typeId value to ensure it is not garbage but the value
-; computed as the type id from the type operand bundle.
-; RUN: llc --call-graph-section -mtriple aarch64-linux-gnu %s 
-stop-before=finalize-isel -o - | FileCheck %s
-
-; ModuleID = 'test.c'
-source_filename = "test.c"
-target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
-target triple = "aarch64-unknown-linux-gnu"
+;; Verify the exact typeId value to ensure it is not garbage but the value
+;; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple aarch64-linux-gnu < %s 
-stop-before=finalize-isel -o - | FileCheck %s
 
 define dso_local void @foo(i8 signext %a) !type !3 {
 entry:
@@ -19,10 +14,10 @@ entry:
 define dso_local i32 @main() !type !4 {
 entry:
   %retval = alloca i32, align 4
-  %fp = alloca void (i8)*, align 8
-  store i32 0, i32* %retval, align 4
-  store void (i8)* @foo, void (i8)** %fp, align 8
-  %0 = load void (i8)*, void (i8)** %fp, align 8
+  %fp = alloca ptr, align 8
+  store i32 0, ptr %retval, align 4
+  store ptr @foo, ptr %fp, align 8
+  %0 = load ptr, ptr %fp, align 8
   ; CHECK: callSites:
   ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
   ; CHECK-NEXT: 7854600665770582568 }
@@ -30,10 +25,5 @@ entry:
   ret i32 0
 }
 
-!llvm.module.flags = !{!0, !1, !2}
-
-!0 = !{i32 1, !"wchar_size", i32 4}
-!1 = !{i32 7, !"uwtable", i32 1}
-!2 = !{i32 7, !"frame-pointer", i32 2}
 !3 = !{i64 0, !"_ZTSFvcE.generalized"}
 !4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/ARM/call-site-info-typeid.ll 
b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
index ec7f8a425051b..9feeef9a564cc 100644
--- a/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
+++ b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
@@ -1,14 +1,9 @@
-; Tests that call site type ids can be extracted and set from type operand
-; bundles.
+;; Tests that ca

[llvm-branch-commits] [llvm] [llvm] Extract and propagate indirect call type id (PR #87575)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87575

>From 1a8d810d352fbe84c0521c7614689b60ade693c8 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Tue, 19 Nov 2024 15:25:34 -0800
Subject: [PATCH 1/5] Fixed the tests and addressed most of the review
 comments.

Created using spr 1.3.6-beta.1
---
 llvm/include/llvm/CodeGen/MachineFunction.h   | 15 +++--
 .../CodeGen/AArch64/call-site-info-typeid.ll  | 28 +++--
 .../test/CodeGen/ARM/call-site-info-typeid.ll | 28 +++--
 .../CodeGen/MIR/X86/call-site-info-typeid.ll  | 58 ---
 .../CodeGen/MIR/X86/call-site-info-typeid.mir | 13 ++---
 .../CodeGen/Mips/call-site-info-typeid.ll | 28 +++--
 .../test/CodeGen/X86/call-site-info-typeid.ll | 28 +++--
 7 files changed, 71 insertions(+), 127 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h 
b/llvm/include/llvm/CodeGen/MachineFunction.h
index bb0b87a3a04a3..44633df38a651 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -493,7 +493,7 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
 /// Callee type id.
 ConstantInt *TypeId = nullptr;
 
-CallSiteInfo() {}
+CallSiteInfo() = default;
 
 /// Extracts the numeric type id from the CallBase's type operand bundle,
 /// and sets TypeId. This is used as type id for the indirect call in the
@@ -503,12 +503,11 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   if (!CB.isIndirectCall())
 return;
 
-  auto Opt = CB.getOperandBundle(LLVMContext::OB_type);
-  if (!Opt.has_value()) {
-errs() << "warning: cannot find indirect call type operand bundle for  
"
-  "call graph section\n";
+  std::optional Opt =
+  CB.getOperandBundle(LLVMContext::OB_type);
+  // Return if the operand bundle for call graph section cannot be found.
+  if (!Opt.has_value())
 return;
-  }
 
   // Get generalized type id string
   auto OB = Opt.value();
@@ -520,9 +519,9 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
  "invalid type identifier");
 
   // Compute numeric type id from generalized type id string
-  uint64_t TypeIdVal = llvm::MD5Hash(TypeIdStr->getString());
+  uint64_t TypeIdVal = MD5Hash(TypeIdStr->getString());
   IntegerType *Int64Ty = Type::getInt64Ty(CB.getContext());
-  TypeId = llvm::ConstantInt::get(Int64Ty, TypeIdVal, /*IsSigned=*/false);
+  TypeId = ConstantInt::get(Int64Ty, TypeIdVal, /*IsSigned=*/false);
 }
   };
 
diff --git a/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll 
b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
index f0a6b44755c5c..f3b98c2c7a395 100644
--- a/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
+++ b/llvm/test/CodeGen/AArch64/call-site-info-typeid.ll
@@ -1,14 +1,9 @@
-; Tests that call site type ids can be extracted and set from type operand
-; bundles.
+;; Tests that call site type ids can be extracted and set from type operand
+;; bundles.
 
-; Verify the exact typeId value to ensure it is not garbage but the value
-; computed as the type id from the type operand bundle.
-; RUN: llc --call-graph-section -mtriple aarch64-linux-gnu %s 
-stop-before=finalize-isel -o - | FileCheck %s
-
-; ModuleID = 'test.c'
-source_filename = "test.c"
-target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
-target triple = "aarch64-unknown-linux-gnu"
+;; Verify the exact typeId value to ensure it is not garbage but the value
+;; computed as the type id from the type operand bundle.
+; RUN: llc --call-graph-section -mtriple aarch64-linux-gnu < %s 
-stop-before=finalize-isel -o - | FileCheck %s
 
 define dso_local void @foo(i8 signext %a) !type !3 {
 entry:
@@ -19,10 +14,10 @@ entry:
 define dso_local i32 @main() !type !4 {
 entry:
   %retval = alloca i32, align 4
-  %fp = alloca void (i8)*, align 8
-  store i32 0, i32* %retval, align 4
-  store void (i8)* @foo, void (i8)** %fp, align 8
-  %0 = load void (i8)*, void (i8)** %fp, align 8
+  %fp = alloca ptr, align 8
+  store i32 0, ptr %retval, align 4
+  store ptr @foo, ptr %fp, align 8
+  %0 = load ptr, ptr %fp, align 8
   ; CHECK: callSites:
   ; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: [], typeId:
   ; CHECK-NEXT: 7854600665770582568 }
@@ -30,10 +25,5 @@ entry:
   ret i32 0
 }
 
-!llvm.module.flags = !{!0, !1, !2}
-
-!0 = !{i32 1, !"wchar_size", i32 4}
-!1 = !{i32 7, !"uwtable", i32 1}
-!2 = !{i32 7, !"frame-pointer", i32 2}
 !3 = !{i64 0, !"_ZTSFvcE.generalized"}
 !4 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/ARM/call-site-info-typeid.ll 
b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
index ec7f8a425051b..9feeef9a564cc 100644
--- a/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
+++ b/llvm/test/CodeGen/ARM/call-site-info-typeid.ll
@@ -1,14 +1,9 @@
-; Tests that call site type ids can be extracted and set from type operand
-; bundles.
+;; Tests that ca

[llvm-branch-commits] [clang] [clang] Introduce CallGraphSection option (PR #117037)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/117037

>From 6a12be2c5b60a95a06875b0b2c4f14228d1fa882 Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Wed, 12 Mar 2025 23:30:01 +
Subject: [PATCH] Fix EOF newlines.

Created using spr 1.3.6-beta.1
---
 clang/test/Driver/call-graph-section.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/test/Driver/call-graph-section.c 
b/clang/test/Driver/call-graph-section.c
index 108446729d857..5832aa6754137 100644
--- a/clang/test/Driver/call-graph-section.c
+++ b/clang/test/Driver/call-graph-section.c
@@ -2,4 +2,4 @@
 // RUN: %clang -### -S -fcall-graph-section -fno-call-graph-section %s 2>&1 | 
FileCheck --check-prefix=NO-CALL-GRAPH-SECTION %s
 
 // CALL-GRAPH-SECTION: "-fcall-graph-section"
-// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"
\ No newline at end of file
+// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang] Introduce CallGraphSection option (PR #117037)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/117037

>From 6a12be2c5b60a95a06875b0b2c4f14228d1fa882 Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Wed, 12 Mar 2025 23:30:01 +
Subject: [PATCH] Fix EOF newlines.

Created using spr 1.3.6-beta.1
---
 clang/test/Driver/call-graph-section.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/test/Driver/call-graph-section.c 
b/clang/test/Driver/call-graph-section.c
index 108446729d857..5832aa6754137 100644
--- a/clang/test/Driver/call-graph-section.c
+++ b/clang/test/Driver/call-graph-section.c
@@ -2,4 +2,4 @@
 // RUN: %clang -### -S -fcall-graph-section -fno-call-graph-section %s 2>&1 | 
FileCheck --check-prefix=NO-CALL-GRAPH-SECTION %s
 
 // CALL-GRAPH-SECTION: "-fcall-graph-section"
-// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"
\ No newline at end of file
+// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm][AsmPrinter] Emit call graph section (PR #87576)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576

>From 6b67376bd5e1f21606017c83cc67f2186ba36a33 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Thu, 13 Mar 2025 01:41:04 +
Subject: [PATCH 1/4] Updated the test as reviewers suggested.

Created using spr 1.3.6-beta.1
---
 llvm/test/CodeGen/X86/call-graph-section.ll | 66 +++
 llvm/test/CodeGen/call-graph-section.ll | 73 -
 2 files changed, 66 insertions(+), 73 deletions(-)
 create mode 100644 llvm/test/CodeGen/X86/call-graph-section.ll
 delete mode 100644 llvm/test/CodeGen/call-graph-section.ll

diff --git a/llvm/test/CodeGen/X86/call-graph-section.ll 
b/llvm/test/CodeGen/X86/call-graph-section.ll
new file mode 100644
index 0..a77a2b8051ed3
--- /dev/null
+++ b/llvm/test/CodeGen/X86/call-graph-section.ll
@@ -0,0 +1,66 @@
+;; Tests that we store the type identifiers in .callgraph section of the 
binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @foo() #0 !type !4 {
+entry:
+  ret void
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, ptr %a.addr, align 1
+  ret i32 0
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local ptr @baz(ptr %a) #0 !type !6 {
+entry:
+  %a.addr = alloca ptr, align 8
+  store ptr %a, ptr %a.addr, align 8
+  ret ptr null
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @main() #0 !type !7 {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca ptr, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca ptr, align 8
+  %fp_baz = alloca ptr, align 8
+  store i32 0, ptr %retval, align 4
+  store ptr @foo, ptr %fp_foo, align 8
+  %0 = load ptr, ptr %fp_foo, align 8
+  call void (...) %0() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
+  store ptr @bar, ptr %fp_bar, align 8
+  %1 = load ptr, ptr %fp_bar, align 8
+  %2 = load i8, ptr %a, align 1
+  %call = call i32 %1(i8 signext %2) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
+  store ptr @baz, ptr %fp_baz, align 8
+  %3 = load ptr, ptr %fp_baz, align 8
+  %call1 = call ptr %3(ptr %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
+  call void @foo() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
+  %4 = load i8, ptr %a, align 1
+  %call2 = call i32 @bar(i8 signext %4) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
+  %call3 = call ptr @baz(ptr %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
+  ret void
+}
+
+;; Check that the numeric type id (md5 hash) for the below type ids are emitted
+;; to the callgraph section.
+
+; CHECK: Hex dump of section '.callgraph':
+
+; CHECK-DAG: 2444f731 f5eecb3e
+!4 = !{i64 0, !"_ZTSFvE.generalized"}
+; CHECK-DAG: 5486bc59 814b8e30
+!5 = !{i64 0, !"_ZTSFicE.generalized"}
+; CHECK-DAG: 7ade6814 f897fd77
+!6 = !{i64 0, !"_ZTSFPvS_E.generalized"}
+; CHECK-DAG: caaf769a 600968fa
+!7 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/call-graph-section.ll 
b/llvm/test/CodeGen/call-graph-section.ll
deleted file mode 100644
index bb158d11e82c9..0
--- a/llvm/test/CodeGen/call-graph-section.ll
+++ /dev/null
@@ -1,73 +0,0 @@
-; Tests that we store the type identifiers in .callgraph section of the binary.
-
-; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
-; RUN: llvm-readelf -x .callgraph - | FileCheck %s
-
-target triple = "x86_64-unknown-linux-gnu"
-
-define dso_local void @foo() #0 !type !4 {
-entry:
-  ret void
-}
-
-define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
-entry:
-  %a.addr = alloca i8, align 1
-  store i8 %a, i8* %a.addr, align 1
-  ret i32 0
-}
-
-define dso_local i32* @baz(i8* %a) #0 !type !6 {
-entry:
-  %a.addr = alloca i8*, align 8
-  store i8* %a, i8** %a.addr, align 8
-  ret i32* null
-}
-
-define dso_local i32 @main() #0 !type !7 {
-entry:
-  %retval = alloca i32, align 4
-  %fp_foo = alloca void (...)*, align 8
-  %a = alloca i8, align 1
-  %fp_bar = alloca i32 (i8)*, align 8
-  %fp_baz = alloca i32* (i8*)*, align 8
-  store i32 0, i32* %retval, align 4
-  store void (...)* bitcast (void ()* @foo to void (...)*), void (...)** 
%fp_foo, align 8
-  %0 = load void (...)*, void (...)** %fp_foo, align 8
-  call void (...) %0() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
-  store i32 (i8)* @bar, i32 (i8)** %fp_bar, align 8
-  %1 = load i32 (i8)*, i32 (i8)** %fp_bar, align 8
-  %2 = load i8, i8* %a, align 1
-  %call = call i32 %1(i8 signext %2) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
-  store i32* (i8*)* @baz, i32* (i8*)** %fp_baz, align 8
-  %3 = load i32* (i8*)*, i32* (i8*)** %fp_baz, align 8
-  %call1 = call i32* %3(i8* %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
-  call void @foo() [ "callee_type"(meta

[llvm-branch-commits] [llvm] [llvm][AsmPrinter] Emit call graph section (PR #87576)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576

>From 6b67376bd5e1f21606017c83cc67f2186ba36a33 Mon Sep 17 00:00:00 2001
From: Necip Fazil Yildiran 
Date: Thu, 13 Mar 2025 01:41:04 +
Subject: [PATCH 1/4] Updated the test as reviewers suggested.

Created using spr 1.3.6-beta.1
---
 llvm/test/CodeGen/X86/call-graph-section.ll | 66 +++
 llvm/test/CodeGen/call-graph-section.ll | 73 -
 2 files changed, 66 insertions(+), 73 deletions(-)
 create mode 100644 llvm/test/CodeGen/X86/call-graph-section.ll
 delete mode 100644 llvm/test/CodeGen/call-graph-section.ll

diff --git a/llvm/test/CodeGen/X86/call-graph-section.ll 
b/llvm/test/CodeGen/X86/call-graph-section.ll
new file mode 100644
index 0..a77a2b8051ed3
--- /dev/null
+++ b/llvm/test/CodeGen/X86/call-graph-section.ll
@@ -0,0 +1,66 @@
+;; Tests that we store the type identifiers in .callgraph section of the 
binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @foo() #0 !type !4 {
+entry:
+  ret void
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, ptr %a.addr, align 1
+  ret i32 0
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local ptr @baz(ptr %a) #0 !type !6 {
+entry:
+  %a.addr = alloca ptr, align 8
+  store ptr %a, ptr %a.addr, align 8
+  ret ptr null
+}
+
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local void @main() #0 !type !7 {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca ptr, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca ptr, align 8
+  %fp_baz = alloca ptr, align 8
+  store i32 0, ptr %retval, align 4
+  store ptr @foo, ptr %fp_foo, align 8
+  %0 = load ptr, ptr %fp_foo, align 8
+  call void (...) %0() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
+  store ptr @bar, ptr %fp_bar, align 8
+  %1 = load ptr, ptr %fp_bar, align 8
+  %2 = load i8, ptr %a, align 1
+  %call = call i32 %1(i8 signext %2) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
+  store ptr @baz, ptr %fp_baz, align 8
+  %3 = load ptr, ptr %fp_baz, align 8
+  %call1 = call ptr %3(ptr %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
+  call void @foo() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
+  %4 = load i8, ptr %a, align 1
+  %call2 = call i32 @bar(i8 signext %4) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
+  %call3 = call ptr @baz(ptr %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
+  ret void
+}
+
+;; Check that the numeric type id (md5 hash) for the below type ids are emitted
+;; to the callgraph section.
+
+; CHECK: Hex dump of section '.callgraph':
+
+; CHECK-DAG: 2444f731 f5eecb3e
+!4 = !{i64 0, !"_ZTSFvE.generalized"}
+; CHECK-DAG: 5486bc59 814b8e30
+!5 = !{i64 0, !"_ZTSFicE.generalized"}
+; CHECK-DAG: 7ade6814 f897fd77
+!6 = !{i64 0, !"_ZTSFPvS_E.generalized"}
+; CHECK-DAG: caaf769a 600968fa
+!7 = !{i64 0, !"_ZTSFiE.generalized"}
diff --git a/llvm/test/CodeGen/call-graph-section.ll 
b/llvm/test/CodeGen/call-graph-section.ll
deleted file mode 100644
index bb158d11e82c9..0
--- a/llvm/test/CodeGen/call-graph-section.ll
+++ /dev/null
@@ -1,73 +0,0 @@
-; Tests that we store the type identifiers in .callgraph section of the binary.
-
-; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
-; RUN: llvm-readelf -x .callgraph - | FileCheck %s
-
-target triple = "x86_64-unknown-linux-gnu"
-
-define dso_local void @foo() #0 !type !4 {
-entry:
-  ret void
-}
-
-define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
-entry:
-  %a.addr = alloca i8, align 1
-  store i8 %a, i8* %a.addr, align 1
-  ret i32 0
-}
-
-define dso_local i32* @baz(i8* %a) #0 !type !6 {
-entry:
-  %a.addr = alloca i8*, align 8
-  store i8* %a, i8** %a.addr, align 8
-  ret i32* null
-}
-
-define dso_local i32 @main() #0 !type !7 {
-entry:
-  %retval = alloca i32, align 4
-  %fp_foo = alloca void (...)*, align 8
-  %a = alloca i8, align 1
-  %fp_bar = alloca i32 (i8)*, align 8
-  %fp_baz = alloca i32* (i8*)*, align 8
-  store i32 0, i32* %retval, align 4
-  store void (...)* bitcast (void ()* @foo to void (...)*), void (...)** 
%fp_foo, align 8
-  %0 = load void (...)*, void (...)** %fp_foo, align 8
-  call void (...) %0() [ "callee_type"(metadata !"_ZTSFvE.generalized") ]
-  store i32 (i8)* @bar, i32 (i8)** %fp_bar, align 8
-  %1 = load i32 (i8)*, i32 (i8)** %fp_bar, align 8
-  %2 = load i8, i8* %a, align 1
-  %call = call i32 %1(i8 signext %2) [ "callee_type"(metadata 
!"_ZTSFicE.generalized") ]
-  store i32* (i8*)* @baz, i32* (i8*)** %fp_baz, align 8
-  %3 = load i32* (i8*)*, i32* (i8*)** %fp_baz, align 8
-  %call1 = call i32* %3(i8* %a) [ "callee_type"(metadata 
!"_ZTSFPvS_E.generalized") ]
-  call void @foo() [ "callee_type"(meta

[llvm-branch-commits] [clang] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/117036

>From b7fbe09b32ff02d4f7c52d82fbf8b5cd28138852 Mon Sep 17 00:00:00 2001
From: prabhukr 
Date: Wed, 23 Apr 2025 04:05:47 +
Subject: [PATCH] Address review comments.

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp|  8 
 clang/lib/CodeGen/CodeGenModule.cpp | 10 +-
 clang/lib/CodeGen/CodeGenModule.h   |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 185ee1a970aac..d8ab7140f7943 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5780,19 +5780,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   if (callOrInvoke) {
 *callOrInvoke = CI;
 if (CGM.getCodeGenOpts().CallGraphSection) {
-  assert((TargetDecl && TargetDecl->getFunctionType() ||
-  Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
- "cannot find callsite type");
   QualType CST;
   if (TargetDecl && TargetDecl->getFunctionType())
 CST = QualType(TargetDecl->getFunctionType(), 0);
   else if (const auto *FPT =
Callee.getAbstractInfo().getCalleeFunctionProtoType())
 CST = QualType(FPT, 0);
+  else
+llvm_unreachable(
+"Cannot find the callee type to generate callee_type metadata.");
 
   // Set type identifier metadata of indirect calls for call graph section.
   if (!CST.isNull())
-CGM.CreateCalleeTypeMetadataForIcall(CST, *callOrInvoke);
+CGM.createCalleeTypeMetadataForIcall(CST, *callOrInvoke);
 }
   }
 
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 43cd2405571cf..2fc99639a75cb 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2654,7 +2654,7 @@ void 
CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
   // Skip available_externally functions. They won't be codegen'ed in the
   // current module anyway.
   if (getContext().GetGVALinkageForFunction(FD) != GVA_AvailableExternally)
-CreateFunctionTypeMetadataForIcall(FD, F);
+createFunctionTypeMetadataForIcall(FD, F);
 }
   }
 
@@ -2868,7 +2868,7 @@ static bool hasExistingGeneralizedTypeMD(llvm::Function 
*F) {
   return MD->hasGeneralizedMDString();
 }
 
-void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
+void CodeGenModule::createFunctionTypeMetadataForIcall(const FunctionDecl *FD,
llvm::Function *F) {
   if (CodeGenOpts.CallGraphSection && !hasExistingGeneralizedTypeMD(F) &&
   (!F->hasLocalLinkage() ||
@@ -2898,7 +2898,7 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
-void CodeGenModule::CreateCalleeTypeMetadataForIcall(const QualType &QT,
+void CodeGenModule::createCalleeTypeMetadataForIcall(const QualType &QT,
  llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
   if (!CodeGenOpts.CallGraphSection || !CB->isIndirectCall())
@@ -2909,7 +2909,7 @@ void 
CodeGenModule::CreateCalleeTypeMetadataForIcall(const QualType &QT,
   getLLVMContext(), {llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(
  llvm::Type::getInt64Ty(getLLVMContext()), 0)),
  TypeIdMD});
-  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), { TypeTuple });
+  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), {TypeTuple});
   CB->setMetadata(llvm::LLVMContext::MD_callee_type, MDN);
 }
 
@@ -3041,7 +3041,7 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, 
llvm::Function *F,
   // jump table.
   if (!CodeGenOpts.SanitizeCfiCrossDso ||
   !CodeGenOpts.SanitizeCfiCanonicalJumpTables)
-CreateFunctionTypeMetadataForIcall(FD, F);
+createFunctionTypeMetadataForIcall(FD, F);
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
 setKCFIType(FD, F);
diff --git a/clang/lib/CodeGen/CodeGenModule.h 
b/clang/lib/CodeGen/CodeGenModule.h
index dfbe4388349dd..4b53f0f241b52 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1619,11 +1619,11 @@ class CodeGenModule : public CodeGenTypeCache {
   llvm::Metadata *CreateMetadataIdentifierGeneralized(QualType T);
 
   /// Create and attach type metadata to the given function.
-  void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
+  void createFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   llvm::Function *F);
 
   /// Create and attach type metadata to the given call.
-  void CreateCalleeTypeMetadataForIcall(const QualType &QT, llvm::CallBase 
*CB);
+  void createCa

[llvm-branch-commits] [clang] [clang] Introduce CallGraphSection option (PR #117037)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits

https://github.com/Prabhuk ready_for_review 
https://github.com/llvm/llvm-project/pull/117037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang] Introduce CallGraphSection option (PR #117037)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -0,0 +1,251 @@
+==
+Call Graph Section
+==
+
+Introduction
+
+
+With ``-fcall-graph-section``, the compiler will create a call graph section 
+in the object file. It will include type identifiers for indirect calls and 
+targets. This information can be used to map indirect calls to their receivers 
+with matching types. A complete and high-precision call graph can be 

Prabhuk wrote:

The documentation changes are moved out of this patch to #119468 

https://github.com/llvm/llvm-project/pull/117037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang] Introduce CallGraphSection option (PR #117037)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -0,0 +1,251 @@
+==
+Call Graph Section
+==
+
+Introduction
+
+
+With ``-fcall-graph-section``, the compiler will create a call graph section 
+in the object file. It will include type identifiers for indirect calls and 
+targets. This information can be used to map indirect calls to their receivers 
+with matching types. A complete and high-precision call graph can be 
+reconstructed by complementing this information with disassembly 
+(see ``llvm-objdump --call-graph-info``).
+
+Semantics
+=
+
+A coarse-grained, type-agnostic call graph may allow indirect calls to target
+any function in the program. This approach ensures completeness since no
+indirect call edge is missing. However, it is generally poor in precision
+due to having unneeded edges.
+
+A call graph section provides type identifiers for indirect calls and targets.
+This information can be used to restrict the receivers of an indirect target to
+indirect calls with matching type. Consequently, the precision for indirect
+call edges are improved while maintaining the completeness.
+
+The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
+full call graph information by parsing the content of the call graph section
+and disassembling the program for complementary information, e.g., direct
+calls.
+
+Section layout
+==
+
+A call graph section consists of zero or more call graph entries.
+Each entry contains information on a function and its indirect calls.
+
+An entry of a call graph section has the following layout in the binary:
+
++-+---+
+| Element | Content
   |
++=+===+
+| FormatVersionNumber | Format version number. 
   |
++-+---+
+| FunctionEntryPc | Function entry address.
   |
++-+---+---+
+| | A flag whether the function is an | - 0: not an 
indirect target   |
+| FunctionKind| indirect target, and if so,   | - 1: indirect 
target, unknown id  |
+| | whether its type id is known. | - 2: indirect 
target, known id|
++-+---+---+
+| FunctionTypeId  | Type id for the indirect target. Present only when 
FunctionKind is 2. |
++-+---+
+| CallSiteCount   | Number of type id to indirect call site mappings that 
follow. |
++-+---+
+| CallSiteList| List of type id and indirect call site pc pairs.   
   |
++-+---+
+
+Each element in an entry (including each element of the contained lists and
+pairs) occupies 64-bit space.
+
+The format version number is repeated per entry to support concatenation of
+call graph sections with different format versions by the linker.
+
+As of now, the only supported format version is described above and has version
+number 0.
+
+Type identifiers
+
+
+The type for an indirect call or target is the function signature.
+The mapping from a type to an identifier is an ABI detail.
+In the current experimental implementation, an identifier of type T is
+computed as follows:
+
+  -  Obtain the generalized mangled name for “typeinfo name for T”.
+  -  Compute MD5 hash of the name as a string.
+  -  Reinterpret the first 8 bytes of the hash as a little-endian 64-bit 
integer.
+
+To avoid mismatched pointer types, generalizations are applied.
+Pointers in return and argument types are treated as equivalent as long as the
+qualifiers for the type they point to match.
+For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
+types. However, ``char*`` and ``const char*`` are considered separate types.

Prabhuk wrote:

The documentation is in a new PR #119468

https://github.com/llvm/llvm-project/pull/117037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm] Extract and propagate indirect call type id (PR #87575)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -7922,6 +7923,10 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 *DAG.getContext());
   RetCCInfo.AnalyzeCallResult(Ins, RetCC);
 
+  // Set type id for call site info.
+  if (MF.getTarget().Options.EmitCallGraphSection && CB && 
CB->isIndirectCall())
+CSInfo = MachineFunction::CallSiteInfo(*CB);

Prabhuk wrote:

Can you please help me understand the concerns here? If the lib function is 
called indirectly, there will be no callsite_type metadata set for those calls. 
Will an additional check to see if there is a "callee_type" metadata before 
initializing CSInfo address the use case you thinking about?

https://github.com/llvm/llvm-project/pull/87575
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm][AsmPrinter] Emit call graph section (PR #87576)

2025-04-22 Thread Prabhu Rajasekaran via llvm-branch-commits


@@ -0,0 +1,50 @@
+;; Tests that we store the type identifiers in .callgraph section of the 
binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+declare void @foo()
+
+declare noundef i32 @bar(i8 signext)
+
+declare noundef ptr @baz(ptr)
+
+define dso_local void @main() {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca ptr, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca ptr, align 8
+  %fp_baz = alloca ptr, align 8
+  store i32 0, ptr %retval, align 4
+  store ptr @foo, ptr %fp_foo, align 8
+  %fp_foo_val = load ptr, ptr %fp_foo, align 8
+  call void (...) %fp_foo_val(), !callee_type !1
+  store ptr @bar, ptr %fp_bar, align 8
+  %fp_bar_val = load ptr, ptr %fp_bar, align 8
+  %a_val = load i8, ptr %a, align 1
+  %call_fp_bar = call i32 %fp_bar_val(i8 signext %a_val), !callee_type !3
+  store ptr @baz, ptr %fp_baz, align 8
+  %fp_baz_val = load ptr, ptr %fp_baz, align 8
+  %call_fp_baz = call ptr %fp_baz_val(ptr %a), !callee_type !5
+  call void @foo()
+  %a_val_2 = load i8, ptr %a, align 1
+  %call_bar = call i32 @bar(i8 signext %a_val_2)

Prabhuk wrote:

#119468 PR has the detailed documentation. It talks about the type metadata 
being added to indirect callsites alone. I can make an explicit callout to the 
fact that no metadata will be attached to the direct call sites.

https://github.com/llvm/llvm-project/pull/87576
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)

2025-04-22 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/136739

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, 
because this would require (un)packing of bits across byte boundaries.

Precommit tests: #134904

>From e88a6e177837b478b4dc20def1b59f193b950965 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Wed, 9 Apr 2025 13:32:02 +0100
Subject: [PATCH] [GlobalISel] Fix miscompile when narrowing vector load/stores
 to non-byte-sized types

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized
types, because this would require (un)packing of bits across byte
boundaries.
---
 .../CodeGen/GlobalISel/LegalizerHelper.cpp|   5 +
 .../GlobalISel/legalize-load-store-vector.mir | 104 +---
 .../test/CodeGen/AArch64/fptosi-sat-vector.ll | 490 +++---
 .../test/CodeGen/AArch64/fptoui-sat-vector.ll | 424 ++-
 4 files changed, 368 insertions(+), 655 deletions(-)

diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 0aa853389bf1a..4052060271331 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore 
&LdStMI, unsigned TypeIdx,
   if (TypeIdx != 0)
 return UnableToLegalize;
 
+  if (!NarrowTy.isByteSized()) {
+LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n");
+return UnableToLegalize;
+  }
+
   // This implementation doesn't work for atomics. Give up instead of doing
   // something invalid.
   if (LdStMI.isAtomic())
diff --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
index 221980ff2c42e..3a2c57ab50147 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
@@ -2,9 +2,11 @@
 # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer 
-global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | 
FileCheck %s
 # RUN: FileCheck -check-prefix=ERR %s < %t.err
 
-# ERR: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
+# ERR: remark: :0:0: unable to legalize instruction: G_STORE 
%{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in 
function: store-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 
16) (in function: load-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
 
-# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect 
partial values.
+# FIXME: Non-byte-sized vector elements cause fallback in 
LegalizerHelper::reduceLoadStoreWidth
 ---
 name:store-narrow-non-byte-sized
 tracksRegLiveness: true
@@ -15,60 +17,10 @@ body: |
 ; CHECK: liveins: $x8
 ; CHECK-NEXT: {{  $}}
 ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8
-; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256
-; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511
-; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]]
-; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 
16)
-; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
-; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
-; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257
-; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]]
-; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) 
into unknown-address + 1, align 1)
-; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
-; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
-; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]]
-; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) 
into unknown-address + 2)
-; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
-; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
- 

[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)

2025-04-22 Thread Brian Cain via llvm-branch-commits

androm3da wrote:

@iajbar, please review.

https://github.com/llvm/llvm-project/pull/135657
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)

2025-04-22 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: Tobias Stadler (tobias-stadler)


Changes

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, 
because this would require (un)packing of bits across byte boundaries.

Precommit tests: #134904

---

Patch is 49.42 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/136739.diff


4 Files Affected:

- (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+5) 
- (modified) 
llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir (+12-92) 
- (modified) llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll (+192-298) 
- (modified) llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll (+159-265) 


``diff
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 0aa853389bf1a..4052060271331 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore 
&LdStMI, unsigned TypeIdx,
   if (TypeIdx != 0)
 return UnableToLegalize;
 
+  if (!NarrowTy.isByteSized()) {
+LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n");
+return UnableToLegalize;
+  }
+
   // This implementation doesn't work for atomics. Give up instead of doing
   // something invalid.
   if (LdStMI.isAtomic())
diff --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
index 221980ff2c42e..3a2c57ab50147 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
@@ -2,9 +2,11 @@
 # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer 
-global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | 
FileCheck %s
 # RUN: FileCheck -check-prefix=ERR %s < %t.err
 
-# ERR: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
+# ERR: remark: :0:0: unable to legalize instruction: G_STORE 
%{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in 
function: store-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 
16) (in function: load-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
 
-# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect 
partial values.
+# FIXME: Non-byte-sized vector elements cause fallback in 
LegalizerHelper::reduceLoadStoreWidth
 ---
 name:store-narrow-non-byte-sized
 tracksRegLiveness: true
@@ -15,60 +17,10 @@ body: |
 ; CHECK: liveins: $x8
 ; CHECK-NEXT: {{  $}}
 ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8
-; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256
-; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511
-; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]]
-; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 
16)
-; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
-; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
-; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257
-; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]]
-; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) 
into unknown-address + 1, align 1)
-; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
-; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
-; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]]
-; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) 
into unknown-address + 2)
-; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
-; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
-; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[COPY7]]
-; CHECK-NEXT: [[T

[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)

2025-04-22 Thread Ikhlas Ajbar via llvm-branch-commits

https://github.com/iajbar approved this pull request.


https://github.com/llvm/llvm-project/pull/135657
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][NPM] Add isRequired to passes missing it (PR #134033)

2025-04-22 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/134033
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -2873,14 +2890,33 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!HasExistingGeneralizedTypeMD(F))
+F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
 if (auto CrossDsoTypeId = CreateCrossDsoCfiTypeId(MD))
   F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+void CodeGenModule::CreateCalleeTypeMetadataForIcall(const QualType &QT,
+ llvm::CallBase *CB) {
+  // Only if needed for call graph section and only for indirect calls.
+  if (!CodeGenOpts.CallGraphSection || !CB->isIndirectCall())
+return;
+
+  llvm::Metadata *TypeIdMD = CreateMetadataIdentifierGeneralized(QT);
+  llvm::MDTuple *TypeTuple = llvm::MDTuple::get(
+  getLLVMContext(), {llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(
+ llvm::Type::getInt64Ty(getLLVMContext()), 0)),
+ TypeIdMD});
+  SmallVector TypeIdMDList;
+  TypeIdMDList.push_back(TypeTuple);
+  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), TypeIdMDList);

ilovepi wrote:

```suggestion
  llvm::MDTuple *MDN = llvm::MDNode::get(getLLVMContext(), { TypeTuple });
```
I think you can get away w/o the SmallVec, right?

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -2860,9 +2861,25 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const 
NamedDecl *ND) {
 GV->setLinkage(llvm::GlobalValue::ExternalWeakLinkage);
 }
 
+static bool HasExistingGeneralizedTypeMD(llvm::Function *F) {
+  llvm::MDNode *MD = F->getMetadata(llvm::LLVMContext::MD_type);
+  if (!MD || !isa(MD->getOperand(1)))
+return false;
+
+  llvm::MDString *TypeIdStr = cast(MD->getOperand(1));
+  return TypeIdStr->getString().ends_with(".generalized");
+}
+
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  if (CodeGenOpts.CallGraphSection && !HasExistingGeneralizedTypeMD(F) &&

ilovepi wrote:

Does this *ever* get called on a function that already has metadata? Seems like 
this code assumes it will only be called on a function once... so I'd guess 
`hasExistingGeneralizedTypeMD() == false` for all uses of this function. If 
you're worried, maybe it could be an assert instead? 

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -2860,9 +2861,25 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const 
NamedDecl *ND) {
 GV->setLinkage(llvm::GlobalValue::ExternalWeakLinkage);
 }
 
+static bool HasExistingGeneralizedTypeMD(llvm::Function *F) {

ilovepi wrote:

```suggestion
static bool hasExistingGeneralizedTypeMD(llvm::Function *F) {
```
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -2873,14 +2890,33 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!HasExistingGeneralizedTypeMD(F))

ilovepi wrote:

You can probably forgo the call to `hasExistingGeneralizedTypeMD()` altogether. 
The only way that this code would execute is if we're doing CFI and we'd add 
metadata. As the code is written that will always happen if you reach this 
point. Therefor, the MD won't exist, unless you've added it. You can track that 
w/ a local variable `bool AddedTypeMD`, and set it to true if you modify 
anything. 

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -0,0 +1,83 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets.
+

ilovepi wrote:

```suggestion
REQUIRES: x86-registered-target
```
I think you need a requires line here, if you're gong to set the triple. Or 
this could be under `CodeGen/X86`. Maybe its OK because we stop at IR 
generation? I don't recall precisely, but there are at least some files that 
only emit LLVM IR and use REQUIRES in this folder, so it seems safe to do it 
that way.

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] callee_type metadata for indirect calls (PR #117036)

2025-04-22 Thread Paul Kirth via llvm-branch-commits


@@ -0,0 +1,83 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section \

ilovepi wrote:

Should this run w/ `-disable-llvm-passes`?

https://github.com/llvm/llvm-project/pull/117036
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko commented:

Partially addressed the comments.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [HLSL] Allow resource annotations to specify only register space (PR #135287)

2025-04-22 Thread Helena Kotas via llvm-branch-commits

https://github.com/hekota updated 
https://github.com/llvm/llvm-project/pull/135287

>From 6fb658a603f116f29e635e4802a8d77b896150ff Mon Sep 17 00:00:00 2001
From: Helena Kotas 
Date: Thu, 10 Apr 2025 17:31:57 -0700
Subject: [PATCH 1/5] [HLSL] Allow register annotations to specify only `space`

Specifying only `space` in a `register` annotation means the compiler should
implicitly assign a register slot to the resource from the provided virtual
register space.
---
 clang/include/clang/Basic/Attr.td | 11 ++-
 clang/lib/CodeGen/CGHLSLRuntime.cpp   |  2 +-
 clang/lib/Sema/SemaHLSL.cpp   | 74 +++
 .../test/AST/HLSL/resource_binding_attr.hlsl  |  4 +
 .../SemaHLSL/resource_binding_attr_error.hlsl |  4 +-
 test.ll   |  7 ++
 6 files changed, 64 insertions(+), 38 deletions(-)
 create mode 100644 test.ll

diff --git a/clang/include/clang/Basic/Attr.td 
b/clang/include/clang/Basic/Attr.td
index fd9e686485552..1fe37ad4e2d4d 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4751,20 +4751,25 @@ def HLSLResourceBinding: InheritableAttr {
 
   private:
   RegisterType RegType;
-  unsigned SlotNumber;
+  int SlotNumber; // -1 if the register slot was not specified
   unsigned SpaceNumber;
 
   public:
-  void setBinding(RegisterType RT, unsigned SlotNum, unsigned SpaceNum) {
+  void setBinding(RegisterType RT, int SlotNum, unsigned SpaceNum) {
 RegType = RT;
 SlotNumber = SlotNum;
 SpaceNumber = SpaceNum;
   }
+  bool isImplicit() const {
+return SlotNumber < 0;
+  }
   RegisterType getRegisterType() const {
+assert(!isImplicit() && "binding does not have register slot");
 return RegType;
   }
   unsigned getSlotNumber() const {
-return SlotNumber;
+assert(!isImplicit() && "binding does not have register slot");
+return (unsigned)SlotNumber;
   }
   unsigned getSpaceNumber() const {
 return SpaceNumber;
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp 
b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 450213fcec676..e42bb8e16e80b 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -261,7 +261,7 @@ void CGHLSLRuntime::addBuffer(const HLSLBufferDecl 
*BufDecl) {
   BufDecl->getAttr();
   // FIXME: handle implicit binding if no binding attribute is found
   // (llvm/llvm-project#110722)
-  if (RBA)
+  if (RBA && !RBA->isImplicit())
 initializeBufferFromBinding(CGM, BufGV, RBA->getSlotNumber(),
 RBA->getSpaceNumber());
 }
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 27959f61f1dc3..73b8c19840026 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -1529,72 +1529,82 @@ void SemaHLSL::handleResourceBindingAttr(Decl *TheDecl, 
const ParsedAttr &AL) {
 diag::err_incomplete_type))
   return;
   }
-  StringRef Space = "space0";
+
   StringRef Slot = "";
+  StringRef Space = "";
+  SourceLocation SlotLoc, SpaceLoc;
 
   if (!AL.isArgIdent(0)) {
 Diag(AL.getLoc(), diag::err_attribute_argument_type)
 << AL << AANT_ArgumentIdentifier;
 return;
   }
-
   IdentifierLoc *Loc = AL.getArgAsIdent(0);
-  StringRef Str = Loc->Ident->getName();
-  SourceLocation ArgLoc = Loc->Loc;
 
-  SourceLocation SpaceArgLoc;
-  bool SpecifiedSpace = false;
   if (AL.getNumArgs() == 2) {
-SpecifiedSpace = true;
-Slot = Str;
+Slot = Loc->Ident->getName();
+SlotLoc = Loc->Loc;
+
 if (!AL.isArgIdent(1)) {
   Diag(AL.getLoc(), diag::err_attribute_argument_type)
   << AL << AANT_ArgumentIdentifier;
   return;
 }
 
-IdentifierLoc *Loc = AL.getArgAsIdent(1);
+Loc = AL.getArgAsIdent(1);
 Space = Loc->Ident->getName();
-SpaceArgLoc = Loc->Loc;
+SpaceLoc = Loc->Loc;
   } else {
-Slot = Str;
+StringRef Str = Loc->Ident->getName();
+if (Str.starts_with("space")) {
+  Space = Str;
+  SpaceLoc = Loc->Loc;
+} else {
+  Slot = Str;
+  SlotLoc = Loc->Loc;
+  Space = "space0";
+}
   }
 
-  RegisterType RegType;
-  unsigned SlotNum = 0;
+  RegisterType RegType = RegisterType::SRV;
+  int SlotNum = -1;
   unsigned SpaceNum = 0;
 
-  // Validate.
+  // Validate slot
   if (!Slot.empty()) {
 if (!convertToRegisterType(Slot, &RegType)) {
-  Diag(ArgLoc, diag::err_hlsl_binding_type_invalid) << Slot.substr(0, 1);
+  Diag(SlotLoc, diag::err_hlsl_binding_type_invalid) << Slot.substr(0, 1);
   return;
 }
 if (RegType == RegisterType::I) {
-  Diag(ArgLoc, diag::warn_hlsl_deprecated_register_type_i);
+  Diag(SlotLoc, diag::warn_hlsl_deprecated_register_type_i);
   return;
 }
-
 StringRef SlotNumStr = Slot.substr(1);
 if (SlotNumStr.getAsInteger(10, SlotNum)) {
-  Diag(ArgLoc, diag::err_hlsl_u

  1   2   >