[llvm-branch-commits] [llvm] AMDGPU: Create a dummy call sequence when emitting call error (PR #170656)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/170656 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Avoid crashing on statepoint-like pseudoinstructions (PR #170657)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/170657
[llvm-branch-commits] [llvm] AMDGPU: Avoid crashing on statepoint-like pseudoinstructions (PR #170657)
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.
---
Full diff: https://github.com/llvm/llvm-project/pull/170657.diff
13 Files Affected:
- (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+12-1)
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+11)
- (modified) llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp (+2)
- (modified) llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp (+10)
- (modified) llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp (+4-1)
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+12)
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.h (+2)
- (modified) llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp (+1-1)
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+8)
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+2)
- (added) llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll (+16)
- (added) llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir (+40)
- (added) llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir (+64)
``diff
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 18142c2c0adf3..bdd9fee795e08 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2350,7 +2350,18 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
/// Returns the callee operand from the given \p MI.
virtual const MachineOperand &getCalleeOperand(const MachineInstr &MI) const
{
-return MI.getOperand(0);
+assert(MI.isCall());
+
+switch (MI.getOpcode()) {
+case TargetOpcode::STATEPOINT:
+case TargetOpcode::STACKMAP:
+case TargetOpcode::PATCHPOINT:
+ return MI.getOperand(3);
+default:
+ return MI.getOperand(0);
+}
+
+llvm_unreachable("impossible call instruction");
}
/// Return the uniformity behavior of the given instruction.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index dd8f18d3b8a6a..7998da0ea06eb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -331,6 +331,17 @@ namespace llvm {
MachineBasicBlock *
TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
MachineBasicBlock *MBB) const {
+ switch (MI.getOpcode()) {
+ case TargetOpcode::STATEPOINT:
+// As an implementation detail, STATEPOINT shares the STACKMAP format at
+// this point in the process. We diverge later.
+ case TargetOpcode::STACKMAP:
+ case TargetOpcode::PATCHPOINT:
+return emitPatchPoint(MI, MBB);
+ default:
+break;
+ }
+
#ifndef NDEBUG
dbgs() << "If a target marks an instruction with "
"'usesCustomInserter', it must implement "
diff --git a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
index 46a5e44374e1c..5b8cd343557fa 100644
--- a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
@@ -1145,6 +1145,8 @@ void
SelectionDAGBuilder::LowerCallSiteWithDeoptBundleImpl(
const CallBase *Call, SDValue Callee, const BasicBlock *EHPadBB,
bool VarArgDisallowed, bool ForceVoidReturnTy) {
StatepointLoweringInfo SI(DAG);
+ SI.CLI.CB = Call;
+
unsigned ArgBeginIndex = Call->arg_begin() - Call->op_begin();
populateCallLoweringInfo(
SI.CLI, Call, ArgBeginIndex, Call->arg_size(), Callee,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index bf9b4297bd435..99c1ab8d379d5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -406,6 +406,16 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr
*MI) {
return;
}
+unsigned Opc = MI->getOpcode();
+if (LLVM_UNLIKELY(Opc == TargetOpcode::STATEPOINT ||
+ Opc == TargetOpcode::STACKMAP ||
+ Opc == TargetOpcode::PATCHPOINT)) {
+ LLVMContext &Ctx = MI->getMF()->getFunction().getContext();
+ Ctx.emitError("unhandled statepoint-like instruction");
+ OutStreamer->emitRawComment("unsupported statepoint/stackmap/patchpoint");
+ return;
+}
+
if (isVerbose())
if (STI.getInstrInfo()->isBlockLoadStore(MI->getOpcode()))
emitVGPRBlockComment(MI, STI.getInstrInfo(), STI.getRegisterInfo(),
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
b/llvm/lib/Target/AMDGPU/AMD
[llvm-branch-commits] [llvm] AMDGPU/PromoteAlloca: Always use i32 for indexing (PR #170511)
@@ -461,22 +461,23 @@ static Value *GEPToVectorIndex(GetElementPtrInst *GEP, AllocaInst *Alloca,
return nullptr;
Value *Offset = VarOffset.first;
- auto *OffsetType = dyn_cast(Offset->getType());
- if (!OffsetType)
+ if (!isa(Offset->getType()))
return nullptr;
+ Offset = Builder.CreateSExtOrTrunc(Offset, Builder.getIntNTy(BW));
ritter-x2a wrote:
This patch changed it to signed: https://github.com/llvm/llvm-project/pull/157682
Before that, the unsigned treatment caused a bug: https://github.com/llvm/llvm-project/pull/155415#issuecomment-3244625707
GEPs with negative 32-bit indices were promoted into broken 64-bit extract-element indices.
https://github.com/llvm/llvm-project/pull/170511
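To illustrate the point about signedness, here is a minimal standalone sketch in plain C++ (not LLVM API code). The helper names `widenAsUnsigned` and `widenAsSigned` are hypothetical and only model what zero-extension and sign-extension do to a negative 32-bit index when it is widened to 64 bits.

```cpp
#include <cstdint>

// Hypothetical helper: widen a 32-bit index by zero-extension, i.e. treat
// the bit pattern as unsigned before widening to 64 bits.
std::int64_t widenAsUnsigned(std::int32_t Index) {
  return static_cast<std::int64_t>(static_cast<std::uint32_t>(Index));
}

// Hypothetical helper: widen a 32-bit index by sign-extension, which
// preserves negative values.
std::int64_t widenAsSigned(std::int32_t Index) {
  return static_cast<std::int64_t>(Index);
}
```

With an index of -1, `widenAsUnsigned` yields 4294967295 while `widenAsSigned` yields -1. Non-negative indices are unaffected, which is presumably why the problem only shows up for negative GEP indices.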
[llvm-branch-commits] [llvm] release/21.x: [rtsan] Handle attributed IR function declarations (#169577) (PR #170641)
github-actions[bot] wrote:
⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo. Please turn off the [Keep my email addresses private](https://github.com/settings/emails) setting in your account. See [LLVM Developer Policy](https://llvm.org/docs/DeveloperPolicy.html#email-addresses) and [LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.
https://github.com/llvm/llvm-project/pull/170641
[llvm-branch-commits] [llvm] [AMDGPU] Add KnownBits simplification combines to RegBankCombiner (PR #141591)
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/141591
>From 515092eec54b06b21c27a36d35b9f99448c436d8 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Tue, 27 May 2025 12:29:02 +0200
Subject: [PATCH 1/3] [AMDGPU] Add KnownBits simplification combines to
RegBankCombiner
---
llvm/lib/Target/AMDGPU/AMDGPUCombine.td | 3 +-
llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll | 30 ++---
.../test/CodeGen/AMDGPU/GlobalISel/saddsat.ll | 61 +++---
.../test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll | 63 +++
llvm/test/CodeGen/AMDGPU/div_i128.ll | 6 +-
llvm/test/CodeGen/AMDGPU/lround.ll| 18 +++---
llvm/test/CodeGen/AMDGPU/v_sat_pk_u8_i16.ll | 16 +
7 files changed, 81 insertions(+), 116 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
index 3639e2b960e0a..57a3ee6d0ce04 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
@@ -250,5 +250,6 @@ def AMDGPURegBankCombiner : GICombiner<
fp_minmax_to_clamp, fp_minmax_to_med3, fmed3_intrinsic_to_clamp,
identity_combines, redundant_and, constant_fold_cast_op,
cast_of_cast_combines, sext_trunc, zext_of_shift_amount_combines,
- d16_load, lower_uniform_sbfx, lower_uniform_ubfx, form_bitfield_extract]> {
+ d16_load, lower_uniform_sbfx, lower_uniform_ubfx, form_bitfield_extract,
+ known_bits_simplifications]> {
}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
b/llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
index 518af70cbbf9f..1d8413b82fc6a 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
@@ -1744,8 +1744,12 @@ define i65 @v_lshr_i65_33(i65 %value) {
; GFX6-LABEL: v_lshr_i65_33:
; GFX6: ; %bb.0:
; GFX6-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX6-NEXT:v_mov_b32_e32 v3, v1
-; GFX6-NEXT:v_mov_b32_e32 v0, 1
+; GFX6-NEXT:v_mov_b32_e32 v3, 1
+; GFX6-NEXT:v_mov_b32_e32 v4, 0
+; GFX6-NEXT:v_and_b32_e32 v3, 1, v2
+; GFX6-NEXT:v_lshl_b64 v[2:3], v[3:4], 31
+; GFX6-NEXT:v_lshrrev_b32_e32 v0, 1, v1
+; GFX6-NEXT:v_or_b32_e32 v0, v0, v2
; GFX6-NEXT:v_mov_b32_e32 v1, 0
; GFX6-NEXT:v_and_b32_e32 v0, 1, v2
; GFX6-NEXT:v_lshl_b64 v[0:1], v[0:1], 31
@@ -1757,8 +1761,12 @@ define i65 @v_lshr_i65_33(i65 %value) {
; GFX8-LABEL: v_lshr_i65_33:
; GFX8: ; %bb.0:
; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT:v_mov_b32_e32 v3, v1
-; GFX8-NEXT:v_mov_b32_e32 v0, 1
+; GFX8-NEXT:v_mov_b32_e32 v3, 1
+; GFX8-NEXT:v_mov_b32_e32 v4, 0
+; GFX8-NEXT:v_and_b32_e32 v3, 1, v2
+; GFX8-NEXT:v_lshlrev_b64 v[2:3], 31, v[3:4]
+; GFX8-NEXT:v_lshrrev_b32_e32 v0, 1, v1
+; GFX8-NEXT:v_or_b32_e32 v0, v0, v2
; GFX8-NEXT:v_mov_b32_e32 v1, 0
; GFX8-NEXT:v_and_b32_e32 v0, 1, v2
; GFX8-NEXT:v_lshlrev_b64 v[0:1], 31, v[0:1]
@@ -1770,8 +1778,12 @@ define i65 @v_lshr_i65_33(i65 %value) {
; GFX9-LABEL: v_lshr_i65_33:
; GFX9: ; %bb.0:
; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX9-NEXT:v_mov_b32_e32 v3, v1
-; GFX9-NEXT:v_mov_b32_e32 v0, 1
+; GFX9-NEXT:v_mov_b32_e32 v3, 1
+; GFX9-NEXT:v_mov_b32_e32 v4, 0
+; GFX9-NEXT:v_and_b32_e32 v3, 1, v2
+; GFX9-NEXT:v_lshlrev_b64 v[2:3], 31, v[3:4]
+; GFX9-NEXT:v_lshrrev_b32_e32 v0, 1, v1
+; GFX9-NEXT:v_or_b32_e32 v0, v0, v2
; GFX9-NEXT:v_mov_b32_e32 v1, 0
; GFX9-NEXT:v_and_b32_e32 v0, 1, v2
; GFX9-NEXT:v_lshlrev_b64 v[0:1], 31, v[0:1]
@@ -1783,8 +1795,10 @@ define i65 @v_lshr_i65_33(i65 %value) {
; GFX10-LABEL: v_lshr_i65_33:
; GFX10: ; %bb.0:
; GFX10-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT:v_mov_b32_e32 v3, v1
-; GFX10-NEXT:v_mov_b32_e32 v0, 1
+; GFX10-NEXT:v_mov_b32_e32 v3, 1
+; GFX10-NEXT:v_mov_b32_e32 v4, 0
+; GFX10-NEXT:v_and_b32_e32 v3, 1, v2
+; GFX10-NEXT:v_lshrrev_b32_e32 v0, 1, v1
; GFX10-NEXT:v_mov_b32_e32 v1, 0
; GFX10-NEXT:v_and_b32_e32 v0, 1, v2
; GFX10-NEXT:v_lshrrev_b32_e32 v2, 1, v3
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
b/llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
index 14332dfeaabd8..bf48ebb8242df 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
@@ -80,11 +80,10 @@ define amdgpu_ps i7 @s_saddsat_i7(i7 inreg %lhs, i7 inreg
%rhs) {
; GFX8-NEXT:s_min_i32 s2, s2, 0
; GFX8-NEXT:s_lshl_b32 s1, s1, 9
; GFX8-NEXT:s_sub_i32 s2, 0x8000, s2
+; GFX8-NEXT:s_sub_i32 s3, 0x7fff, s3
; GFX8-NEXT:s_sext_i32_i16 s2, s2
; GFX8-NEXT:s_sext_i32_i16 s1, s1
-; GFX8-NEXT:s_sub_i32 s3, 0x7fff, s3
; GFX8-NEXT:s_max_i32 s1, s2, s1
-; GFX8-NEXT:s_sext_i32_i16 s1, s1
; GFX8-NEXT:s_sext_i32_i16 s2, s3
; GFX8-NEXT:s_min_i32 s1, s1, s2
; GFX8-NEXT:s_add_i32 s0, s0, s1
@@ -189,11 +188,10 @@ define amdgpu_
[llvm-branch-commits] [llvm] release/21.x: [rtsan] Handle attributed IR function declarations (#169577) (PR #170641)
https://github.com/llvmbot created
https://github.com/llvm/llvm-project/pull/170641
Backport 5d4c441
Requested by: @davidtrevelyan
>From b845b4cd771efebf7e09f0016f8c1e75a924a6fb Mon Sep 17 00:00:00 2001
From: davidtrevelyan
Date: Mon, 1 Dec 2025 20:56:43 +
Subject: [PATCH] [rtsan] Handle attributed IR function declarations (#169577)
Addresses https://github.com/llvm/llvm-project/issues/169377.
Previously, the RealtimeSanitizer pass only handled attributed function
_definitions_ in IR, and we have recently found that attributed function
_declarations_ caused it to crash. To fix the issue, we must check
whether the IR function is empty before attempting to do any
manipulation of its instructions.
This PR:
- Adds checks for whether IR `Function`s are `empty()` ~~in each
relevant~~ at the top-level RTSan pass routine
- ~~Removes the utility function `rtsanPreservedCFGAnalyses` from the
pass, whose result was unused and which would otherwise have complicated
the fix~~
(cherry picked from commit 5d4c4411f13755d5f12a83a0d6705e8501f33d5f)
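The guard described above can be modeled with a small standalone sketch. This is not the RealtimeSanitizer pass itself: `ToyFunction`, `instrumentEntry`, `runToyPass`, and the inserted marker string are hypothetical stand-ins, assuming only that a declaration is a function with no body.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR function. A declaration has no body,
// so its instruction list is empty.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> Instructions;
  bool SanitizeRealtime;

  bool empty() const { return Instructions.empty(); }
};

// Hypothetical instrumentation step: prepend a marker to the body.
// It assumes the function has a body to insert into.
void instrumentEntry(ToyFunction &F) {
  F.Instructions.insert(F.Instructions.begin(), "call @toy_realtime_enter");
}

// Walk the module and instrument attributed definitions only.
// Returns how many functions were instrumented.
int runToyPass(std::vector<ToyFunction> &Module) {
  int Instrumented = 0;
  for (ToyFunction &F : Module) {
    // Declarations have nothing to instrument; skip them up front.
    if (F.empty())
      continue;
    if (F.SanitizeRealtime) {
      instrumentEntry(F);
      ++Instrumented;
    }
  }
  return Instrumented;
}
```

Placing the check once at the top of the loop, rather than inside each instrumentation routine, matches the "top-level" wording in the list above.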
---
.../Transforms/Instrumentation/RealtimeSanitizer.cpp | 3 +++
.../RealtimeSanitizer/rtsan_attrib_declare.ll | 11 +++
2 files changed, 14 insertions(+)
create mode 100644
llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
diff --git a/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
b/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
index 5ef6ffb58a7c1..667fdb746175f 100644
--- a/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
@@ -90,6 +90,9 @@ PreservedAnalyses RealtimeSanitizerPass::run(Module &M,
[&](Function *Ctor, FunctionCallee) { appendToGlobalCtors(M, Ctor, 0);
});
for (Function &F : M) {
+if (F.empty())
+ continue;
+
if (F.hasFnAttribute(Attribute::SanitizeRealtime))
runSanitizeRealtime(F);
diff --git
a/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
b/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
new file mode 100644
index 0..3526a010ce489
--- /dev/null
+++ b/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
@@ -0,0 +1,11 @@
+; RUN: opt < %s -passes='rtsan' -S | FileCheck %s
+
+declare void @declared_realtime_function() sanitize_realtime #0
+
+declare void @declared_blocking_function() sanitize_realtime_blocking #0
+
+; RealtimeSanitizer pass should ignore attributed functions that are just declarations
+; CHECK: declared_realtime_function
+; CHECK-EMPTY:
+; CHECK: declared_blocking_function
+; CHECK-EMPTY:
[llvm-branch-commits] [llvm] release/21.x: [rtsan] Handle attributed IR function declarations (#169577) (PR #170641)
llvmbot wrote: @cjappl What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/170641
[llvm-branch-commits] [llvm] release/21.x: [rtsan] Handle attributed IR function declarations (#169577) (PR #170641)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/170641
[llvm-branch-commits] [llvm] release/21.x: [rtsan] Handle attributed IR function declarations (#169577) (PR #170641)
llvmbot wrote:
@llvm/pr-subscribers-compiler-rt-sanitizer
Author: None (llvmbot)
Changes
Backport 5d4c441
Requested by: @davidtrevelyan
---
Full diff: https://github.com/llvm/llvm-project/pull/170641.diff
2 Files Affected:
- (modified) llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp (+3)
- (added) llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
(+11)
```diff
diff --git a/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
b/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
index 5ef6ffb58a7c1..667fdb746175f 100644
--- a/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/RealtimeSanitizer.cpp
@@ -90,6 +90,9 @@ PreservedAnalyses RealtimeSanitizerPass::run(Module &M,
[&](Function *Ctor, FunctionCallee) { appendToGlobalCtors(M, Ctor, 0);
});
for (Function &F : M) {
+if (F.empty())
+ continue;
+
if (F.hasFnAttribute(Attribute::SanitizeRealtime))
runSanitizeRealtime(F);
diff --git
a/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
b/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
new file mode 100644
index 0..3526a010ce489
--- /dev/null
+++ b/llvm/test/Instrumentation/RealtimeSanitizer/rtsan_attrib_declare.ll
@@ -0,0 +1,11 @@
+; RUN: opt < %s -passes='rtsan' -S | FileCheck %s
+
+declare void @declared_realtime_function() sanitize_realtime #0
+
+declare void @declared_blocking_function() sanitize_realtime_blocking #0
+
+; RealtimeSanitizer pass should ignore attributed functions that are just declarations
+; CHECK: declared_realtime_function
+; CHECK-EMPTY:
+; CHECK: declared_blocking_function
+; CHECK-EMPTY:
```
https://github.com/llvm/llvm-project/pull/170641
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [llvm] DAG: Add overload of getExternalSymbol using RTLIB::LibcallImpl (PR #170587)
https://github.com/RKSimon approved this pull request. LGTM cheers https://github.com/llvm/llvm-project/pull/170587
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Reorder struct fields to have less padding (PR #170222)
https://github.com/petrhosek approved this pull request. https://github.com/llvm/llvm-project/pull/170222
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Use static functions over the anonymous namespace (PR #170221)
https://github.com/petrhosek approved this pull request. https://github.com/llvm/llvm-project/pull/170221
[llvm-branch-commits] [llvm] [AMDGPU] Add BFX Formation Combines to RegBankCombiner (PR #141590)
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/141590
[llvm-branch-commits] [clang] [LifetimeSafety] Track moved declarations to prevent false positives (PR #170007)
hokein wrote:
> The lifetime safety analysis was previously generating false positives by
> warning about use-after-lifetime when the original variable was destroyed
> after being moved. This change prevents those false positives by tracking
> moved declarations and exempting them from loan expiration checks.
Just a note.
While this fixes false positives, it introduces false negatives. The pointer is
not always valid after the owner object is moved. Here is a use-after-free example
(when the string uses the short string optimization): https://godbolt.org/z/eP7PbaMEn
```
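#include <iostream>
#include <string>
#include <string_view>
#include <utility>

int main() {
  std::string_view s;
  std::string b;
  {
    // Small literal: with the short string optimization it is stored inside
    // the string object itself rather than on the heap.
    std::string a = "12345";
    s = a;
    b = std::move(a);
  }
  std::cout << s << "\n"; // oops, s refers to a dangling object.
}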
```
https://github.com/llvm/llvm-project/pull/170007
[llvm-branch-commits] [clang] [ExposeObjCDirect] Setup helper functions (PR #170617)
https://github.com/DataCorrupted edited https://github.com/llvm/llvm-project/pull/170617
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add Mustache case to assets test (PR #170198)
https://github.com/evelez7 updated
https://github.com/llvm/llvm-project/pull/170198
>From d56b271edbe76ccbdf2fad3533ea966bf77e527a Mon Sep 17 00:00:00 2001
From: Erick Velez
Date: Wed, 26 Nov 2025 22:28:10 -0800
Subject: [PATCH] [clang-doc] Add Mustache case to assets test
Mustache wasn't tested in the assets lit test, which tests if
user-supplied assets are copied correctly. The Mustache HTML backend
initially failed this test because it expected every asset, which
included Mustache templates, to be supplied. For now, we just expect
either CSS or JS to be supplied and use the default if one of them isn't
given.
We can allow custom templates in the future using the same checks.
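The fallback described above can be sketched in isolation as follows. This is a simplified model, not the clang-doc code: `ToyDocContext` and `addDefaultAssets` are hypothetical names, plain `std::string` concatenation stands in for the real path handling, and only the two default file names are taken from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for the clang-doc context: only the two
// asset lists that matter for the fallback are modeled.
struct ToyDocContext {
  std::vector<std::string> UserStylesheets;
  std::vector<std::string> JsScripts;
};

// Add the default stylesheet and script only when the user supplied none
// of that kind, so user-provided assets are never overridden.
void addDefaultAssets(ToyDocContext &Ctx, const std::string &AssetsPath) {
  if (Ctx.UserStylesheets.empty())
    Ctx.UserStylesheets.push_back(AssetsPath + "/clang-doc-mustache.css");
  if (Ctx.JsScripts.empty())
    Ctx.JsScripts.push_back(AssetsPath + "/mustache-index.js");
}
```

In this sketch a user who supplies only a stylesheet keeps it and still gets the default script, and a user who supplies only a script gets the default stylesheet.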
---
clang-tools-extra/clang-doc/support/Utils.cpp | 20 +--
.../clang-doc/tool/ClangDocMain.cpp | 5 +++--
clang-tools-extra/test/clang-doc/assets.cpp | 10 ++
3 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/clang-tools-extra/clang-doc/support/Utils.cpp
b/clang-tools-extra/clang-doc/support/Utils.cpp
index 6ed56033738b5..897a7ad0adb79 100644
--- a/clang-tools-extra/clang-doc/support/Utils.cpp
+++ b/clang-tools-extra/clang-doc/support/Utils.cpp
@@ -33,8 +33,20 @@ void getMustacheHtmlFiles(StringRef AssetsPath,
assert(!AssetsPath.empty());
assert(sys::fs::is_directory(AssetsPath));
- SmallString<128> DefaultStylesheet =
- appendPathPosix(AssetsPath, "clang-doc-mustache.css");
+ // TODO: Allow users to override default templates with their own. We would
+ // similarly have to check if a template file already exists in CDCtx.
+ if (CDCtx.UserStylesheets.empty()) {
+SmallString<128> DefaultStylesheet =
+appendPathPosix(AssetsPath, "clang-doc-mustache.css");
+CDCtx.UserStylesheets.insert(CDCtx.UserStylesheets.begin(),
+ DefaultStylesheet.c_str());
+ }
+
+ if (CDCtx.JsScripts.empty()) {
+SmallString<128> IndexJS = appendPathPosix(AssetsPath, "mustache-index.js");
+CDCtx.JsScripts.insert(CDCtx.JsScripts.begin(), IndexJS.c_str());
+ }
+
SmallString<128> NamespaceTemplate =
appendPathPosix(AssetsPath, "namespace-template.mustache");
SmallString<128> ClassTemplate =
@@ -45,11 +57,7 @@ void getMustacheHtmlFiles(StringRef AssetsPath,
appendPathPosix(AssetsPath, "function-template.mustache");
SmallString<128> CommentTemplate =
appendPathPosix(AssetsPath, "comment-template.mustache");
- SmallString<128> IndexJS = appendPathPosix(AssetsPath, "mustache-index.js");
- CDCtx.JsScripts.insert(CDCtx.JsScripts.begin(), IndexJS.c_str());
- CDCtx.UserStylesheets.insert(CDCtx.UserStylesheets.begin(),
- DefaultStylesheet.c_str());
CDCtx.MustacheTemplates.insert(
{"namespace-template", NamespaceTemplate.c_str()});
CDCtx.MustacheTemplates.insert({"class-template", ClassTemplate.c_str()});
diff --git a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
index 62fa6a17df2ee..8de7c8ad6f000 100644
--- a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+++ b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
@@ -151,6 +151,7 @@ static std::string getExecutablePath(const char *Argv0,
void *MainAddr) {
return llvm::sys::fs::getMainExecutable(Argv0, MainAddr);
}
+// TODO: Rename this, since it only gets custom CSS/JS
static llvm::Error getAssetFiles(clang::doc::ClangDocContext &CDCtx) {
using DirIt = llvm::sys::fs::directory_iterator;
std::error_code FileErr;
@@ -221,8 +222,8 @@ static llvm::Error getMustacheHtmlFiles(const char *Argv0,
llvm::outs() << "Asset path supply is not a directory: " << UserAssetPath
<< " falling back to default\n";
if (IsDir) {
-getMustacheHtmlFiles(UserAssetPath, CDCtx);
-return llvm::Error::success();
+if (auto Err = getAssetFiles(CDCtx))
+ return Err;
}
void *MainAddr = (void *)(intptr_t)getExecutablePath;
std::string ClangDocPath = getExecutablePath(Argv0, MainAddr);
diff --git a/clang-tools-extra/test/clang-doc/assets.cpp
b/clang-tools-extra/test/clang-doc/assets.cpp
index c5933e504f6b9..00d0d32213965 100644
--- a/clang-tools-extra/test/clang-doc/assets.cpp
+++ b/clang-tools-extra/test/clang-doc/assets.cpp
@@ -1,9 +1,13 @@
// RUN: rm -rf %t && mkdir %t
// RUN: clang-doc --format=html --output=%t --asset=%S/Inputs/test-assets --executor=standalone %s --base base_dir
+// RUN: clang-doc --format=mustache --output=%t --asset=%S/Inputs/test-assets --executor=standalone %s --base base_dir
// RUN: FileCheck %s -input-file=%t/index.html -check-prefix=INDEX
// RUN: FileCheck %s -input-file=%t/test.css -check-prefix=CSS
// RUN: FileCheck %s -input-file=%t/test.js -check-prefix=JS
+// RUN: FileCheck %s -input-file=%t/html/test.css -check-prefix=MUSTACHE-CSS
+// RUN: FileCheck %s -input-file=%t/html/test.js -check-prefix=MUSTACHE-JS
+
// INDEX:
// INDEX-NEXT:
// INDEX-NEXT: Index
@@ -19,4 +23,10 @@
// CSS-NE
+// RUN: clang-doc --format=mustache --output=%t --asset=%S/Inputs/test-assets
--executor=standalone %s --base base_dir
// RUN: FileCheck %s -input-file=%t/index.html -check-prefix=INDEX
// RUN: FileCheck %s -input-file=%t/test.css -check-prefix=CSS
// RUN: FileCheck %s -input-file=%t/test.js -check-prefix=JS
+// RUN: FileCheck %s -input-file=%t/html/test.css -check-prefix=MUSTACHE-CSS
+// RUN: FileCheck %s -input-file=%t/html/test.js -check-prefix=MUSTACHE-JS
+
// INDEX:
// INDEX-NEXT:
// INDEX-NEXT: Index
@@ -19,4 +23,10 @@
// CSS-NE
[llvm-branch-commits] [llvm] [AArch64][PAC] Factor out printing real AUT/PAC/BLRA encodings (NFC) (PR #160901)
atrosinenko wrote: Rebased the stack onto current `main` branch to resolve the conflict with recently merged #133536. https://github.com/llvm/llvm-project/pull/160901 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Replace HTML generation with Mustache backend (PR #170199)
@@ -32,461 +41,15 @@ getClangDocContext(std::vector
UserStylesheets = {},
ClangDocContext CDCtx{
{}, "test-project", {}, {}, {}, RepositoryUrl, RepositoryLinePrefix,
Base, UserStylesheets};
- CDCtx.UserStylesheets.insert(
- CDCtx.UserStylesheets.begin(),
- "../share/clang/clang-doc-default-stylesheet.css");
- CDCtx.JsScripts.emplace_back("index.js");
+ CDCtx.UserStylesheets.insert(CDCtx.UserStylesheets.begin(), "");
+ CDCtx.JsScripts.emplace_back("");
return CDCtx;
}
-TEST(HTMLGeneratorTest, emitNamespaceHTML) {
- NamespaceInfo I;
- I.Name = "Namespace";
- I.Namespace.emplace_back(EmptySID, "A", InfoType::IT_namespace);
-
- I.Children.Namespaces.emplace_back(EmptySID, "ChildNamespace",
- InfoType::IT_namespace,
- "Namespace::ChildNamespace", "Namespace");
- I.Children.Records.emplace_back(EmptySID, "ChildStruct", InfoType::IT_record,
- "Namespace::ChildStruct", "Namespace");
- I.Children.Functions.emplace_back();
- I.Children.Functions.back().Access = AccessSpecifier::AS_none;
- I.Children.Functions.back().Name = "OneFunction";
- I.Children.Enums.emplace_back();
- I.Children.Enums.back().Name = "OneEnum";
-
- auto G = getHTMLGenerator();
- assert(G);
- std::string Buffer;
- llvm::raw_string_ostream Actual(Buffer);
- ClangDocContext CDCtx = getClangDocContext({"user-provided-stylesheet.css"});
- auto Err = G->generateDocForInfo(&I, Actual, CDCtx);
- assert(!Err);
- std::string Expected = R"raw(
-
-namespace Namespace
-
-
-
-
-test-project
-
-
-
-namespace Namespace
-Namespaces
-
-
-ChildNamespace
-
-
-Records
-
-
-ChildStruct
-
-
-Functions
-
- OneFunction
- OneFunction()
-
-Enums
-
-
-
-
-enum OneEnum
-
-
-
-
-
-
-
-
-
- Namespaces
-
-
-
-
- Records
-
-
-
-
- Functions
-
-
-
-
- OneFunction
-
-
-
-
-
-
- Enums
-
-
-
-
- OneEnum
-
-
-
-
-
-
-
-
- )raw" +
- ClangDocVersion + R"raw(
-
-)raw";
-
- EXPECT_EQ(Expected, Actual.str());
-}
-
-TEST(HTMLGeneratorTest, emitRecordHTML) {
- RecordInfo I;
- I.Name = "r";
- I.Path = "X/Y/Z";
- I.Namespace.emplace_back(EmptySID, "A", InfoType::IT_namespace);
-
- I.DefLoc = Location(10, 10, "dir/test.cpp", true);
- I.Loc.emplace_back(12, 12, "test.cpp");
-
- SmallString<16> PathTo;
- llvm::sys::path::native("path/to", PathTo);
- I.Members.emplace_back(TypeInfo("int"), "X", AccessSpecifier::AS_private);
- I.TagType = TagTypeKind::Class;
- I.Parents.emplace_back(EmptySID, "F", InfoType::IT_record, "F", PathTo);
- I.VirtualParents.emplace_back(EmptySID, "G", InfoType::IT_record);
-
- I.Children.Records.emplace_back(EmptySID, "ChildStruct", InfoType::IT_record,
- "X::Y::Z::r::ChildStruct", "X/Y/Z/r");
- I.Children.Functions.emplace_back();
- I.Children.Functions.back().Name = "OneFunction";
- I.Children.Enums.emplace_back();
- I.Children.Enums.back().Name = "OneEnum";
-
- auto G = getHTMLGenerator();
- assert(G);
- std::string Buffer;
- llvm::raw_string_ostream Actual(Buffer);
- ClangDocContext CDCtx = getClangDocContext({}, "http://www.repository.com");
- auto Err = G->generateDocForInfo(&I, Actual, CDCtx);
- assert(!Err);
- std::string Expected = R"raw(
-
-class r
-
-
-
-test-project
-
-
-
-class r
-
- Defined at line
- http://www.repository.com/dir/test.cpp#10";>10
- of file
- http://www.repository.com/dir/test.cpp";>test.cpp
-
-
- Inherits from
- F
- , G
-
-Members
-
-
-private int X
-
-
-Records
-
-
-ChildStruct
-
-
-Functions
-
- OneFunction
- public OneFunction()
-
-Enums
-
-
-
-
-enum OneEnum
-
-
-
-
-
-
-
-
-
- Members
-
-
-
-
- Records
-
-
-
-
- Functions
-
-
-
-
- OneFunction
-
-
-
-
-
-
- Enums
-
-
-
-
- OneEnum
-
-
-
-
-
-
-
-
- )raw" +
- ClangDocVersion + R"raw(
-
-)raw";
-
- EXPECT_EQ(Expected, Actual.str());
-}
-
-TEST(HTMLGeneratorTest, emitFunctionHTML) {
- FunctionInfo I;
- I.Name = "f";
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
```bash
git-clang-format --diff origin/main HEAD --extensions c,cpp,h -- \
  clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CGCall.h \
  clang/test/CodeGen/lifetime-invoke-c.c \
  clang/test/CodeGen/stack-usage-lifetimes.c \
  clang/test/CodeGenCXX/aggregate-lifetime-invoke.cpp \
  clang/test/CodeGenCXX/stack-reuse-miscompile.cpp --diff_from_common_commit
```
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
```diff
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 4831d55ed..0c71874c5 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4976,7 +4976,8 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
EmitLifetimeStart(ArgSlotAlloca.getPointer())) {
if (E->getType().isDestructedType()) {
-pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker, ArgSlotAlloca);
+pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker,
+ ArgSlotAlloca);
} else {
args.addLifetimeCleanup({ArgSlotAlloca.getPointer()});
if (getInvokeDest())
```
https://github.com/llvm/llvm-project/pull/170518
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
https://github.com/ilovepi updated
https://github.com/llvm/llvm-project/pull/170518
>From acdccf174bad71ff21f820660528bcd460bb1f37 Mon Sep 17 00:00:00 2001
From: Paul Kirth
Date: Tue, 2 Dec 2025 15:14:32 -0800
Subject: [PATCH 1/5] [clang] Use tighter lifetime bounds for C temporary
arguments
In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanup. This leads to different behavior with
respect to where pushFullExprCleanup inserts the lifetime end markers
(e.g., at the end of scope).
For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
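For illustration, the kind of source this change targets looks roughly like the following. `Big`, `makeBig`, `sum`, and `twoCalls` are hypothetical names, not taken from the patch or its tests, and the snippet stays in the C-compatible subset to mirror the C case the message describes. Each call passes an aggregate by value, and since `Big` has no destructor, the temporary's storage is no longer needed once the call returns. Whether the two temporaries actually end up sharing a stack slot depends on the optimizer and target, so treat this as a sketch of the code shape only.

```cpp
// Hypothetical aggregate, large enough to be passed through a stack temporary.
struct Big {
  int data[64];
};

// Returns an aggregate by value; the caller materializes it in a temporary.
static struct Big makeBig(int seed) {
  struct Big b;
  for (int i = 0; i < 64; ++i)
    b.data[i] = seed + i;
  return b;
}

// Takes the aggregate by value.
static int sum(struct Big b) {
  int total = 0;
  for (int i = 0; i < 64; ++i)
    total += b.data[i];
  return total;
}

int twoCalls(void) {
  int a = sum(makeBig(1)); // first temporary is no longer needed after sum() returns
  int b = sum(makeBig(2)); // second temporary is a candidate for the same stack space
  return a + b;
}
```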
---
clang/lib/CodeGen/CGCall.cpp | 18 ++
clang/lib/CodeGen/CGCall.h| 19 +++
clang/test/CodeGen/stack-usage-lifetimes.c| 12 ++--
.../CodeGenCXX/stack-reuse-miscompile.cpp | 2 +-
4 files changed, 40 insertions(+), 11 deletions(-)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 80075dd8a4cca..75e3bea3f3237 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4973,11 +4973,16 @@ void CodeGenFunction::EmitCallArg(CallArgList &args,
const Expr *E,
RawAddress ArgSlotAlloca = Address::invalid();
ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
-// Emit a lifetime start/end for this temporary at the end of the full
-// expression.
+// Emit a lifetime start/end for this temporary. If the type has a
+// destructor, then we need to keep it alive for the full expression.
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
-EmitLifetimeStart(ArgSlotAlloca.getPointer()))
- pushFullExprCleanup(NormalAndEHCleanup, ArgSlotAlloca);
+EmitLifetimeStart(ArgSlotAlloca.getPointer())) {
+ if (E->getType().isDestructedType()) {
+pushFullExprCleanup<CallLifetimeEnd>(NormalAndEHCleanup, ArgSlotAlloca);
+ } else {
+args.addLifetimeCleanup({ArgSlotAlloca.getPointer()});
+ }
+}
}
args.add(EmitAnyExpr(E, ArgSlot), type);
@@ -6307,6 +6312,11 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo
&CallInfo,
for (CallLifetimeEnd &LifetimeEnd : CallLifetimeEndAfterCall)
LifetimeEnd.Emit(*this, /*Flags=*/{});
+ if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries)
+for (const CallArgList::EndLifetimeInfo &LT :
+ CallArgs.getLifetimeCleanups())
+ EmitLifetimeEnd(LT.Addr);
+
if (!ReturnValue.isExternallyDestructed() &&
RetTy.isDestructedType() == QualType::DK_nontrivial_c_struct)
pushDestroy(QualType::DK_nontrivial_c_struct, Ret.getAggregateAddress(),
diff --git a/clang/lib/CodeGen/CGCall.h b/clang/lib/CodeGen/CGCall.h
index 1ef8a3f114573..aab4b64d6a4a8 100644
--- a/clang/lib/CodeGen/CGCall.h
+++ b/clang/lib/CodeGen/CGCall.h
@@ -299,6 +299,10 @@ class CallArgList : public SmallVector {
llvm::Instruction *IsActiveIP;
};
+ struct EndLifetimeInfo {
+llvm::Value *Addr;
+ };
+
void add(RValue rvalue, QualType type) { push_back(CallArg(rvalue, type)); }
void addUncopiedAggregate(LValue LV, QualType type) {
@@ -312,6 +316,9 @@ class CallArgList : public SmallVector {
llvm::append_range(*this, other);
llvm::append_range(Writebacks, other.Writebacks);
llvm::append_range(CleanupsToDeactivate, other.CleanupsToDeactivate);
+LifetimeCleanups.insert(LifetimeCleanups.end(),
+other.LifetimeCleanups.begin(),
+other.LifetimeCleanups.end());
assert(!(StackBase && other.StackBase) && "can't merge stackbases");
if (!StackBase)
StackBase = other.StackBase;
@@ -352,6 +359,14 @@ class CallArgList : public SmallVector {
/// memory.
bool isUsingInAlloca() const { return StackBase; }
+ void addLifetimeCleanup(EndLifetimeInfo Info) {
+LifetimeCleanups.push_back(Info);
+ }
+
+ ArrayRef<EndLifetimeInfo> getLifetimeCleanups() const {
+return LifetimeCleanups;
+ }
+
// Support reversing writebacks for MSVC ABI.
void reverseWritebacks() {
std::reverse(Writebacks.begin(), Writebacks.end());
@@ -365,6 +380,10 @@ class CallArgList : public SmallVector {
/// occurs.
SmallVector CleanupsToDeactivate;
+ /// Lifetime information needed to call llvm.lifetime.end for any temporary
+ /// argument allocas.
+ SmallVector LifetimeCleanups;
+
/// The stacksave call. It dominates all of the argument evaluation.
llvm::CallInst *StackBase = nullptr;
};
diff --git a/clang/test/CodeGen/stack-usage-lifetimes.c
b/clang/test/CodeGen/stack-usage-lifetimes.c
index 3787a29e4ce7d..189bc9c229ca4 100644
--- a/clang/test/CodeGen/stack-usage-lifetimes.c
+++ b/clang/test/CodeGen/stack-usage-lifetimes.c
@@ -40,11 +40,11 @@ void t1(int c) {
}
void t2(void) {
- // x
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
@@ -6291,6 +6296,11 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo
&CallInfo,
for (CallLifetimeEnd &LifetimeEnd : CallLifetimeEndAfterCall)
LifetimeEnd.Emit(*this, /*Flags=*/{});
+ if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries)
ilovepi wrote:
I've updated the implementation (and tests). LMK if that's more in line with
what you were thinking for the error handling paths.
https://github.com/llvm/llvm-project/pull/170518
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
https://github.com/ilovepi updated
https://github.com/llvm/llvm-project/pull/170518
>From acdccf174bad71ff21f820660528bcd460bb1f37 Mon Sep 17 00:00:00 2001
From: Paul Kirth
Date: Tue, 2 Dec 2025 15:14:32 -0800
Subject: [PATCH 1/6] [clang] Use tighter lifetime bounds for C temporary
arguments
[llvm-branch-commits] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)
fmayer wrote:
Please fix tests
https://github.com/llvm/llvm-project/pull/151650
[llvm-branch-commits] SROA: Recognize llvm.protected.field.ptr intrinsics. (PR #151650)
pcc wrote:
> Please fix tests
I think those aren't real test failures (a buggy previously uploaded version of
a dependent PR triggered those failures). Let me try rebasing to silence the CI.
https://github.com/llvm/llvm-project/pull/151650
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/170752
>From 4752499d3a9b387ff7078baceb2522344ff875d3 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Thu, 4 Dec 2025 13:48:43 -0800
Subject: [PATCH] [LTT] Add `unknown` branch weights when lowering type tests
with conditional
---
llvm/lib/Transforms/IPO/LowerTypeTests.cpp | 3 +++
llvm/utils/profcheck-xfail.txt | 2 --
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index f7aeda95e41b3..ef0bc29b03c2a 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -54,6 +54,7 @@
#include "llvm/IR/ModuleSummaryIndexYAML.h"
#include "llvm/IR/Operator.h"
#include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/ReplaceConstant.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Use.h"
@@ -803,6 +804,8 @@ Value *LowerTypeTestsModule::lowerTypeTestCall(Metadata
*TypeId, CallInst *CI,
}
IRBuilder<> ThenB(SplitBlockAndInsertIfThen(OffsetInRange, CI, false));
+ setExplicitlyUnknownBranchWeightsIfProfiled(*InitialBB->getTerminator(),
+ DEBUG_TYPE);
// Now that we know that the offset is in range and aligned, load the
// appropriate bit from the bitset.
diff --git a/llvm/utils/profcheck-xfail.txt b/llvm/utils/profcheck-xfail.txt
index 3cde50de7d0c1..a36cec940b605 100644
--- a/llvm/utils/profcheck-xfail.txt
+++ b/llvm/utils/profcheck-xfail.txt
@@ -493,8 +493,6 @@ Transforms/LowerSwitch/do-not-handle-impossible-values.ll
Transforms/LowerSwitch/feature.ll
Transforms/LowerSwitch/fold-popular-case-to-unreachable-default.ll
Transforms/LowerSwitch/pr59316.ll
-Transforms/LowerTypeTests/import.ll
-Transforms/LowerTypeTests/simple.ll
Transforms/MergeFunc/2011-02-08-RemoveEqual.ll
Transforms/MergeFunc/apply_function_attributes.ll
Transforms/MergeFunc/call-and-invoke-with-ranges-attr.ll
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/mtrofin edited https://github.com/llvm/llvm-project/pull/170752 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/170752
>From 48751a30a88593b7df37d10e432de27ae8b010f6 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Thu, 4 Dec 2025 13:48:43 -0800
Subject: [PATCH] [LTT] Add `unknown` branch weights when lowering type tests
with conditional
---
llvm/lib/Transforms/IPO/LowerTypeTests.cpp| 3 +++
llvm/test/Transforms/LowerTypeTests/import.ll | 21 ---
llvm/utils/profcheck-xfail.txt| 2 --
3 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index f7aeda95e41b3..ef0bc29b03c2a 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -54,6 +54,7 @@
#include "llvm/IR/ModuleSummaryIndexYAML.h"
#include "llvm/IR/Operator.h"
#include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/ReplaceConstant.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Use.h"
@@ -803,6 +804,8 @@ Value *LowerTypeTestsModule::lowerTypeTestCall(Metadata
*TypeId, CallInst *CI,
}
IRBuilder<> ThenB(SplitBlockAndInsertIfThen(OffsetInRange, CI, false));
+ setExplicitlyUnknownBranchWeightsIfProfiled(*InitialBB->getTerminator(),
+ DEBUG_TYPE);
// Now that we know that the offset is in range and aligned, load the
// appropriate bit from the bitset.
diff --git a/llvm/test/Transforms/LowerTypeTests/import.ll
b/llvm/test/Transforms/LowerTypeTests/import.ll
index 2aa81362415ef..7a6f863753f3c 100644
--- a/llvm/test/Transforms/LowerTypeTests/import.ll
+++ b/llvm/test/Transforms/LowerTypeTests/import.ll
@@ -86,14 +86,14 @@ define i1 @allones32(ptr %p) {
ret i1 %x
}
-define i1 @bytearray7(ptr %p) {
+define i1 @bytearray7(ptr %p) !prof !0 {
; X86-LABEL: define i1 @bytearray7(
-; X86-SAME: ptr [[P:%.*]]) {
+; X86-SAME: ptr [[P:%.*]]) !prof [[PROF6:![0-9]+]] {
; X86-NEXT:[[TMP1:%.*]] = ptrtoint ptr [[P]] to i64
; X86-NEXT:[[TMP2:%.*]] = sub i64 ptrtoint (ptr
@__typeid_bytearray7_global_addr to i64), [[TMP1]]
; X86-NEXT:[[TMP7:%.*]] = call i64 @llvm.fshr.i64(i64 [[TMP2]], i64
[[TMP2]], i64 ptrtoint (ptr @__typeid_bytearray7_align to i64))
; X86-NEXT:[[TMP8:%.*]] = icmp ule i64 [[TMP7]], ptrtoint (ptr
@__typeid_bytearray7_size_m1 to i64)
-; X86-NEXT:br i1 [[TMP8]], label [[TMP5:%.*]], label [[TMP14:%.*]]
+; X86-NEXT:br i1 [[TMP8]], label [[TMP5:%.*]], label [[TMP14:%.*]], !prof
[[PROF7:![0-9]+]]
; X86: 5:
; X86-NEXT:[[TMP10:%.*]] = getelementptr i8, ptr
@__typeid_bytearray7_byte_array, i64 [[TMP7]]
; X86-NEXT:[[TMP11:%.*]] = load i8, ptr [[TMP10]], align 1
@@ -105,12 +105,12 @@ define i1 @bytearray7(ptr %p) {
; X86-NEXT:ret i1 [[TMP15]]
;
; ARM-LABEL: define i1 @bytearray7(
-; ARM-SAME: ptr [[P:%.*]]) {
+; ARM-SAME: ptr [[P:%.*]]) !prof [[PROF0:![0-9]+]] {
; ARM-NEXT:[[TMP1:%.*]] = ptrtoint ptr [[P]] to i64
; ARM-NEXT:[[TMP2:%.*]] = sub i64 ptrtoint (ptr
@__typeid_bytearray7_global_addr to i64), [[TMP1]]
; ARM-NEXT:[[TMP5:%.*]] = call i64 @llvm.fshr.i64(i64 [[TMP2]], i64
[[TMP2]], i64 3)
; ARM-NEXT:[[TMP6:%.*]] = icmp ule i64 [[TMP5]], 43
-; ARM-NEXT:br i1 [[TMP6]], label [[TMP7:%.*]], label [[TMP12:%.*]]
+; ARM-NEXT:br i1 [[TMP6]], label [[TMP7:%.*]], label [[TMP12:%.*]], !prof
[[PROF1:![0-9]+]]
; ARM: 5:
; ARM-NEXT:[[TMP8:%.*]] = getelementptr i8, ptr
@__typeid_bytearray7_byte_array, i64 [[TMP5]]
; ARM-NEXT:[[TMP9:%.*]] = load i8, ptr [[TMP8]], align 1
@@ -255,6 +255,8 @@ define i1 @single(ptr %p) {
ret i1 %x
}
+!0 = !{!"function_entry_count", i32 10}
+
; X86: !0 = !{i64 0, i64 256}
; X86: !1 = !{i64 0, i64 64}
; X86: !2 = !{i64 -1, i64 -1}
@@ -265,13 +267,18 @@ define i1 @single(ptr %p) {
; X86: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind
speculatable willreturn memory(none) }
; X86: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison
nofree nosync nounwind speculatable willreturn memory(none) }
;.
+; ARM: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind
speculatable willreturn memory(none) }
+; ARM: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison
nofree nosync nounwind speculatable willreturn memory(none) }
+;.
; X86: [[META0]] = !{i64 0, i64 256}
; X86: [[META1]] = !{i64 0, i64 64}
; X86: [[META2]] = !{i64 -1, i64 -1}
; X86: [[META3]] = !{i64 0, i64 32}
; X86: [[META4]] = !{i64 0, i64 4294967296}
; X86: [[META5]] = !{i64 0, i64 128}
+; X86: [[PROF6]] = !{!"function_entry_count", i32 10}
+; X86: [[PROF7]] = !{!"unknown", !"lowertypetests"}
;.
-; ARM: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind
speculatable willreturn memory(none) }
-; ARM: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison
nofree nosync nounwind speculatable willreturn memory(none) }
+; ARM:
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
@@ -4963,21 +4963,28 @@ void CodeGenFunction::EmitCallArg(CallArgList &args,
const Expr *E,
AggValueSlot ArgSlot = AggValueSlot::ignored();
// For arguments with aggregate type, create an alloca to store
- // the value. If the argument's type has a destructor, that destructor
+ // the value. If the argument's type has a destructor, that destructor
// will run at the end of the full-expression; emit matching lifetime
- // markers.
- //
- // FIXME: For types which don't have a destructor, consider using a
- // narrower lifetime bound.
+ // markers. For types which don't have a destructor, we use a narrower
+ // lifetime bound.
if (hasAggregateEvaluationKind(E->getType())) {
RawAddress ArgSlotAlloca = Address::invalid();
ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
-// Emit a lifetime start/end for this temporary at the end of the full
-// expression.
+// Emit a lifetime start/end for this temporary. If the type has a
+// destructor, then we need to keep it alive for the full expression.
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
-EmitLifetimeStart(ArgSlotAlloca.getPointer()))
- pushFullExprCleanup(NormalAndEHCleanup, ArgSlotAlloca);
+EmitLifetimeStart(ArgSlotAlloca.getPointer())) {
+ if (E->getType().isDestructedType()) {
+pushFullExprCleanup<CallLifetimeEnd>(NormalEHLifetimeMarker,
+ ArgSlotAlloca);
+ } else {
+args.addLifetimeCleanup({ArgSlotAlloca.getPointer()});
+if (getInvokeDest())
+ pushFullExprCleanup(CleanupKind::EHCleanup,
efriedma-quic wrote:
I think this still doesn't work quite right.
Consider the case where you have two calls which can throw in the expression.
Something like `f2(f1(X{}))`. If f1 throws an exception, you correctly end the
lifetime, but if f2 throws an exception, you end up calling lifetime.end on an
alloca where the lifetime already ended.
You can probably fudge this by deactivating the cleanup after the call. But it
would be cleaner to introduce a new scope for the call.
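The failure mode can be modeled outside of Clang. The sketch below is a toy simulation, not CodeGen code: `Slot`, `CleanupStack`, and `runNested` are hypothetical names, and the model assumes an EH cleanup is pushed for the temporary and, optionally, deactivated once the inner call has returned. It counts how many times the slot's lifetime is ended when the outer call throws.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-in for an alloca with lifetime markers.
struct Slot {
  int endCount = 0; // number of lifetime.end markers executed
  void end() { ++endCount; }
};

// Toy stand-in for an EH cleanup stack: cleanups run only while unwinding.
struct CleanupStack {
  struct Entry {
    std::function<void()> fn;
    bool active;
  };
  std::vector<Entry> entries;

  void push(std::function<void()> fn) {
    entries.push_back(Entry{std::move(fn), true});
  }
  void deactivateTop() { entries.back().active = false; }
  void unwind() {
    // Run active cleanups in reverse push order, then drop them all.
    for (auto it = entries.rbegin(); it != entries.rend(); ++it)
      if (it->active)
        it->fn();
    entries.clear();
  }
};

// Models f2(f1(X{})) where f1 returns normally and f2 throws.
// Returns how many times the temporary's lifetime was ended.
int runNested(bool deactivateAfterCall) {
  Slot tmp;
  CleanupStack ehStack;
  try {
    ehStack.push([&tmp] { tmp.end(); }); // EH cleanup for the argument temporary
    // ... f1 runs and returns normally ...
    tmp.end(); // lifetime end emitted right after the inner call
    if (deactivateAfterCall)
      ehStack.deactivateTop();
    throw std::runtime_error("f2 threw"); // the outer call unwinds
  } catch (const std::runtime_error &) {
    ehStack.unwind();
  }
  return tmp.endCount;
}
```

In this model, leaving the cleanup active ends the lifetime twice, which is the double `lifetime.end` described above, while deactivating it after the call ends it once. A dedicated scope around the call, as suggested, would pop the cleanup rather than rely on deactivation.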
https://github.com/llvm/llvm-project/pull/170518
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/mtrofin ready_for_review https://github.com/llvm/llvm-project/pull/170752 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
llvmbot wrote:
@llvm/pr-subscribers-llvm-transforms
Author: Mircea Trofin (mtrofin)
Changes
---
Full diff: https://github.com/llvm/llvm-project/pull/170752.diff
3 Files Affected:
- (modified) llvm/lib/Transforms/IPO/LowerTypeTests.cpp (+3)
- (modified) llvm/test/Transforms/LowerTypeTests/import.ll (+14-7)
- (modified) llvm/utils/profcheck-xfail.txt (-2)
``diff
diff --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index f7aeda95e41b3..ef0bc29b03c2a 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -54,6 +54,7 @@
#include "llvm/IR/ModuleSummaryIndexYAML.h"
#include "llvm/IR/Operator.h"
#include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/ReplaceConstant.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Use.h"
@@ -803,6 +804,8 @@ Value *LowerTypeTestsModule::lowerTypeTestCall(Metadata
*TypeId, CallInst *CI,
}
IRBuilder<> ThenB(SplitBlockAndInsertIfThen(OffsetInRange, CI, false));
+ setExplicitlyUnknownBranchWeightsIfProfiled(*InitialBB->getTerminator(),
+ DEBUG_TYPE);
// Now that we know that the offset is in range and aligned, load the
// appropriate bit from the bitset.
diff --git a/llvm/test/Transforms/LowerTypeTests/import.ll
b/llvm/test/Transforms/LowerTypeTests/import.ll
index 2aa81362415ef..7a6f863753f3c 100644
--- a/llvm/test/Transforms/LowerTypeTests/import.ll
+++ b/llvm/test/Transforms/LowerTypeTests/import.ll
@@ -86,14 +86,14 @@ define i1 @allones32(ptr %p) {
ret i1 %x
}
-define i1 @bytearray7(ptr %p) {
+define i1 @bytearray7(ptr %p) !prof !0 {
; X86-LABEL: define i1 @bytearray7(
-; X86-SAME: ptr [[P:%.*]]) {
+; X86-SAME: ptr [[P:%.*]]) !prof [[PROF6:![0-9]+]] {
; X86-NEXT:[[TMP1:%.*]] = ptrtoint ptr [[P]] to i64
; X86-NEXT:[[TMP2:%.*]] = sub i64 ptrtoint (ptr
@__typeid_bytearray7_global_addr to i64), [[TMP1]]
; X86-NEXT:[[TMP7:%.*]] = call i64 @llvm.fshr.i64(i64 [[TMP2]], i64
[[TMP2]], i64 ptrtoint (ptr @__typeid_bytearray7_align to i64))
; X86-NEXT:[[TMP8:%.*]] = icmp ule i64 [[TMP7]], ptrtoint (ptr
@__typeid_bytearray7_size_m1 to i64)
-; X86-NEXT:br i1 [[TMP8]], label [[TMP5:%.*]], label [[TMP14:%.*]]
+; X86-NEXT:br i1 [[TMP8]], label [[TMP5:%.*]], label [[TMP14:%.*]], !prof
[[PROF7:![0-9]+]]
; X86: 5:
; X86-NEXT:[[TMP10:%.*]] = getelementptr i8, ptr
@__typeid_bytearray7_byte_array, i64 [[TMP7]]
; X86-NEXT:[[TMP11:%.*]] = load i8, ptr [[TMP10]], align 1
@@ -105,12 +105,12 @@ define i1 @bytearray7(ptr %p) {
; X86-NEXT:ret i1 [[TMP15]]
;
; ARM-LABEL: define i1 @bytearray7(
-; ARM-SAME: ptr [[P:%.*]]) {
+; ARM-SAME: ptr [[P:%.*]]) !prof [[PROF0:![0-9]+]] {
; ARM-NEXT:[[TMP1:%.*]] = ptrtoint ptr [[P]] to i64
; ARM-NEXT:[[TMP2:%.*]] = sub i64 ptrtoint (ptr @__typeid_bytearray7_global_addr to i64), [[TMP1]]
; ARM-NEXT:[[TMP5:%.*]] = call i64 @llvm.fshr.i64(i64 [[TMP2]], i64 [[TMP2]], i64 3)
; ARM-NEXT:[[TMP6:%.*]] = icmp ule i64 [[TMP5]], 43
-; ARM-NEXT:br i1 [[TMP6]], label [[TMP7:%.*]], label [[TMP12:%.*]]
+; ARM-NEXT:br i1 [[TMP6]], label [[TMP7:%.*]], label [[TMP12:%.*]], !prof [[PROF1:![0-9]+]]
; ARM: 5:
; ARM-NEXT:[[TMP8:%.*]] = getelementptr i8, ptr @__typeid_bytearray7_byte_array, i64 [[TMP5]]
; ARM-NEXT:[[TMP9:%.*]] = load i8, ptr [[TMP8]], align 1
@@ -255,6 +255,8 @@ define i1 @single(ptr %p) {
ret i1 %x
}
+!0 = !{!"function_entry_count", i32 10}
+
; X86: !0 = !{i64 0, i64 256}
; X86: !1 = !{i64 0, i64 64}
; X86: !2 = !{i64 -1, i64 -1}
@@ -265,13 +267,18 @@ define i1 @single(ptr %p) {
; X86: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
; X86: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison nofree nosync nounwind speculatable willreturn memory(none) }
;.
+; ARM: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
+; ARM: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison nofree nosync nounwind speculatable willreturn memory(none) }
+;.
; X86: [[META0]] = !{i64 0, i64 256}
; X86: [[META1]] = !{i64 0, i64 64}
; X86: [[META2]] = !{i64 -1, i64 -1}
; X86: [[META3]] = !{i64 0, i64 32}
; X86: [[META4]] = !{i64 0, i64 4294967296}
; X86: [[META5]] = !{i64 0, i64 128}
+; X86: [[PROF6]] = !{!"function_entry_count", i32 10}
+; X86: [[PROF7]] = !{!"unknown", !"lowertypetests"}
;.
-; ARM: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
-; ARM: attributes #[[ATTR1:[0-9]+]] = { nocallback nocreateundeforpoison nofree nosync nounwind speculatable willreturn memory(none) }
+; ARM: [[PROF0]] = !{!"function_entry_count", i32 10}
+; ARM: [[PROF1]] = !{!"unknown", !"lowertypetests"}
;.
diff --git a/llvm/utils/profcheck-xfail.txt b/llvm/utils/
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
pcc wrote:
What does this function do? The frontend is expected to add unlikely annotations and we don't want this function to override them.
https://github.com/llvm/llvm-project/pull/170752
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
boomanaiden154 wrote:
> The frontend is expected to add unlikely annotations and we don't want this
> function to override them.

To the function itself? I don't see how the frontend could add unlikely annotations to a synthetic branch constructed by this pass.
https://github.com/llvm/llvm-project/pull/170752
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/boomanaiden154 approved this pull request.
https://github.com/llvm/llvm-project/pull/170752
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
pcc wrote:
> > The frontend is expected to add unlikely annotations and we don't want this
> > function to override them.
>
> To the function itself? I don't see how the frontend could add unlikely
> annotations to a synthetic branch constructed by this pass.

Ah, this is the case where the intrinsic isn't used by a conditional branch. Since that case is not expected to occur frequently in practice, it's fine to put anything we want on this branch.
https://github.com/llvm/llvm-project/pull/170752
[llvm-branch-commits] [llvm] [LTT] Add `unknown` branch weights when lowering type tests with conditional (PR #170752)
https://github.com/pcc approved this pull request.
https://github.com/llvm/llvm-project/pull/170752
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
+ }
+ for (auto &Base : Decl->bases()) {
+if (Base.isVirtual())
+ continue;
+CharUnits BaseOffset =
+Offset + RL.getBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+ }
+ if (IncludeVBases) {
+for (auto &Base : Decl->vbases()) {
+ CharUnits BaseOffset =
+ Offset + RL.getVBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+ findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+}
+ }
+}
+
+bool ASTContext::hasPFPFields(QualType Ty) const {
+ std::vector PFPFields;
+ findPFPFields(Ty, CharUnits::Zero(), PFPFields, true);
+ return !PFPFields.empty();
+}
+
+bool ASTContext::isPFPField(const FieldDecl *FD) const {
+ auto *RD = dyn_cast(FD->getParent());
+ if (!RD || !RD->isPFPType())
+return false;
+ return FD->getType()->isPointerType() &&
+ !FD->hasAttr();
+}
+
+void ASTContext::recordMemberDataPointerEvaluation(const ValueDecl *VD) {
+ if (!getLangOpts().PointerFieldProtection)
+return;
+ auto *FD = dyn_cast(VD);
+ if (!FD)
+FD = cast(cast(VD)->chain().back());
+ if (!isPFPField(FD))
+return;
+ PFPFieldsWithEvaluatedOffset.insert(FD);
+}
+
+void ASTContext::recordOffsetOfEvaluation(const OffsetOfExpr *E) {
+ if (!getLangOpts().PointerFieldProtection || E->getNumComponents() == 0)
+return;
+ OffsetOfNode Comp = E->getComponent(E->getNumComponents() - 1);
+ if (Comp.getKind() != OffsetOfNode::Field)
+return;
+ FieldDecl *FD = Comp.getField();
fmayer wrote:
nit: for your consideration
```
if (FieldDecl *FD = Comp.getField(); isPFPField(FD))
PFPFieldsWithEvaluatedOffset.insert(FD);
```
https://github.com/llvm/llvm-project/pull/133538
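The suggestion above relies on the C++17 `if` statement with an initializer, which limits the scope of `FD` to the `if` itself. A minimal standalone sketch of the same shape follows. `Field`, `isProtected`, `lastField`, and `recordIfProtected` are hypothetical stand-ins invented for illustration, not Clang types or functions.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a field declaration; not a Clang type.
struct Field {
  std::string Name;
  bool IsPointer;
};

// Hypothetical predicate playing the role of isPFPField().
inline bool isProtected(const Field *F) { return F && F->IsPointer; }

// Returns the last field in the list (the role of Comp.getField()).
inline const Field *lastField(const std::vector<Field> &Fields) {
  assert(!Fields.empty() && "caller must pass at least one field");
  return &Fields.back();
}

// C++17 if-with-initializer: F is visible only inside the if statement.
inline void recordIfProtected(const std::vector<Field> &Fields,
                              std::set<const Field *> &Seen) {
  if (const Field *F = lastField(Fields); isProtected(F))
    Seen.insert(F);
}
```

The initializer form needs `-std=c++17` or later. Compared with a separate declaration followed by an early return, it keeps the temporary out of the enclosing scope.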
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -4522,18 +4522,48 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
Address Dest = EmitPointerWithAlignment(E->getArg(0));
Address Src = EmitPointerWithAlignment(E->getArg(1));
Value *SizeVal = EmitScalarExpr(E->getArg(2));
+Value *TypeSize = ConstantInt::get(
+SizeVal->getType(),
+getContext()
+.getTypeSizeInChars(E->getArg(0)->getType()->getPointeeType())
+.getQuantity());
if (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_trivially_relocate)
- SizeVal = Builder.CreateMul(
- SizeVal,
- ConstantInt::get(
- SizeVal->getType(),
- getContext()
- .getTypeSizeInChars(E->getArg(0)->getType()->getPointeeType())
- .getQuantity()));
+ SizeVal = Builder.CreateMul(SizeVal, TypeSize);
EmitArgCheck(TCK_Store, Dest, E->getArg(0), 0);
EmitArgCheck(TCK_Load, Src, E->getArg(1), 1);
auto *I = Builder.CreateMemMove(Dest, Src, SizeVal, false);
addInstToNewSourceAtom(I, nullptr);
+if (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_trivially_relocate) {
+ if (getContext().hasPFPFields(
+ E->getArg(0)->getType()->getPointeeType())) {
+BasicBlock *Entry = Builder.GetInsertBlock();
+BasicBlock *Loop = createBasicBlock("loop");
fmayer wrote:
should we call this in a way that leaves some breadcrumbs that this is from
pfp? `pfp.relocate.loop` or something?
maybe also add a comment like
```
// call emitPFPTrivialRelocation for every object in the array we are relocating?
```
(if my understanding of this loop is correct)
https://github.com/llvm/llvm-project/pull/133538
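If the reading in the comment above is right, the loop visits each relocated array element and applies a per-object fixup after the bulk `memmove`. The toy sketch below shows that shape only. It assumes, purely for illustration, that a protected field stores its pointer XORed with the field's own address. `Node`, `protect`, `unprotect`, and `relocateArray` are hypothetical names, and none of this is the code the patch emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy object with one "protected" field: the stored bits depend on where the
// field lives, so a raw byte copy leaves them stale. Illustrative only.
struct Node {
  std::uintptr_t Encoded;
};

inline void protect(Node &N, int *P) {
  N.Encoded = reinterpret_cast<std::uintptr_t>(P) ^
              reinterpret_cast<std::uintptr_t>(&N.Encoded);
}

inline int *unprotect(const Node &N) {
  return reinterpret_cast<int *>(
      N.Encoded ^ reinterpret_cast<std::uintptr_t>(&N.Encoded));
}

// Bulk-move Count objects, then re-encode each one for its new address.
inline void relocateArray(Node *Dest, Node *Src, std::size_t Count) {
  assert(Dest && Src);
  std::memmove(Dest, Src, Count * sizeof(Node));
  for (std::size_t I = 0; I != Count; ++I) {
    // Decode using the old field address, re-encode using the new one.
    std::uintptr_t Raw =
        Dest[I].Encoded ^ reinterpret_cast<std::uintptr_t>(&Src[I].Encoded);
    Dest[I].Encoded = Raw ^ reinterpret_cast<std::uintptr_t>(&Dest[I].Encoded);
  }
}
```

The fixup loop only uses the old and new field addresses, never the old contents, so it stays valid even when the source and destination ranges overlap.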
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -2643,6 +2643,19 @@ def CountedByOrNull : DeclOrTypeAttr {
let LangOpts = [COnly];
}
+def NoFieldProtection : DeclOrTypeAttr {
+ let Spellings = [Clang<"no_field_protection">];
fmayer wrote:
why not `pointer_field_protection`
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
+ }
+ for (auto &Base : Decl->bases()) {
+if (Base.isVirtual())
+ continue;
+CharUnits BaseOffset =
+Offset + RL.getBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+ }
+ if (IncludeVBases) {
+for (auto &Base : Decl->vbases()) {
+ CharUnits BaseOffset =
+ Offset + RL.getVBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+ findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+}
+ }
+}
+
+bool ASTContext::hasPFPFields(QualType Ty) const {
+ std::vector PFPFields;
+ findPFPFields(Ty, CharUnits::Zero(), PFPFields, true);
fmayer wrote:
`/*IncludeVBases=*/true`
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
https://github.com/fmayer commented:
leaving first batch of comments, mostly nits, but not done yet
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
fmayer wrote:
nit: single statement, drop `{}`
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
+ }
+ for (auto &Base : Decl->bases()) {
+if (Base.isVirtual())
+ continue;
+CharUnits BaseOffset =
+Offset + RL.getBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+ }
+ if (IncludeVBases) {
+for (auto &Base : Decl->vbases()) {
+ CharUnits BaseOffset =
+ Offset + RL.getVBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+ findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
fmayer wrote:
`/*IncludeVBases=*/false`
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -3141,6 +3141,17 @@ defm experimental_omit_vtable_rtti : BoolFOption<"experimental-omit-vtable-rtti"
NegFlag,
BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>;
+defm experimental_pointer_field_protection : BoolFOption<"experimental-pointer-field-protection",
+ LangOpts<"PointerFieldProtection">, DefaultFalse,
+ PosFlag,
+ NegFlag,
+ BothFlags<[], [ClangOption], " pointer field protection on all non-standard layout struct types">>;
+defm experimental_pointer_field_protection_tagged : BoolFOption<"experimental-pointer-field-protection-tagged",
fmayer wrote:
as discussed offline, i would prefer if there was a flag to enable the `[[clang:pointer_field_protection]]` while this feature is experimental to avoid accidental enablement
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
fmayer wrote:
`/*IncludeVBases=*/true`. maybe also leave some notes why sometimes we do and
sometimes we don't
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -1310,21 +1310,91 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty,
return Val;
}
+static std::vector findPFPCoercedFields(CodeGenFunction &CGF,
+ QualType SrcFETy) {
+ // Coercion directly through memory does not work if the structure has pointer
+ // field protection because the struct in registers has a different bit
+ // pattern to the struct in memory, so we must read the elements one by one
+ // and use them to form the coerced structure.
+ std::vector PFPFields;
+ CGF.getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true);
fmayer wrote:
`/*IsWithinUnion=*/true`
And I am confused, or `isWithinUnion` is true for all of the return objects?
AFAIK in `findPFPFields` `IsWithinUnion` can only be changed to true (through
the `|=`), and never to false. And all the recursive calls either use
`isWithinUnion` or `true`.
https://github.com/llvm/llvm-project/pull/133538
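The question above turns on which parameter the bare `true` binds to. In the quoted signature the fourth parameter is `IncludeVBases` and `IsWithinUnion` is the fifth. The toy model below mirrors the recursion with hypothetical `Type`, `Found`, and `findFields`. It assumes `IsWithinUnion` defaults to `false`, which the quoted hunk does not show. Under that assumption the flag is sticky below a union but stays `false` elsewhere.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy type tree; not Clang's AST.
struct Type {
  std::string Name;
  bool IsUnion = false;
  std::vector<const Type *> Members; // nested record members
  bool HasPointerField = false;
};

struct Found {
  std::string Owner;
  bool IsWithinUnion;
};

// Mirrors the shape of the reviewed recursion: the flag can only be turned on
// (via |=) and is passed down, so it is true for everything below a union.
// The default value for IsWithinUnion is an assumption of this sketch.
inline void findFields(const Type *Ty, std::vector<Found> &Out,
                       bool IncludeVBases, bool IsWithinUnion = false) {
  assert(Ty && "type must not be null");
  IsWithinUnion |= Ty->IsUnion;
  if (Ty->HasPointerField)
    Out.push_back({Ty->Name, IsWithinUnion});
  for (const Type *M : Ty->Members)
    findFields(M, Out, /*IncludeVBases=*/true, IsWithinUnion);
  (void)IncludeVBases; // virtual bases are not modeled here
}
```

Naming the literal at the call site, as in `findFields(&Outer, Out, /*IncludeVBases=*/true)`, makes it clear that the union flag is left at its default.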
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
+ }
+ for (auto &Base : Decl->bases()) {
+if (Base.isVirtual())
+ continue;
+CharUnits BaseOffset =
+Offset + RL.getBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
fmayer wrote:
`/*IncludeVBases=*/false`
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -0,0 +1,70 @@
+
+Structure Protection
+
+
+.. contents::
+ :local:
+
+
+Introduction
+
+
+Structure protection is an *experimental* mitigation
fmayer wrote:
optional nit: maybe reflow the text in this document. `pandoc pfp.rst -o pfp2.rst` does that (plus some other stray changes)
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -15131,3 +15131,93 @@ bool ASTContext::useAbbreviatedThunkName(GlobalDecl VirtualMethodDecl,
ThunksToBeAbbreviated[VirtualMethodDecl] = std::move(SimplifiedThunkNames);
return Result;
}
+
+bool ASTContext::arePFPFieldsTriviallyCopyable(const RecordDecl *RD) const {
+ bool IsPAuthSupported =
+ getTargetInfo().getTriple().getArch() == llvm::Triple::aarch64;
+ if (!IsPAuthSupported)
+return true;
+ if (getLangOpts().PointerFieldProtectionTagged)
+return !isa(RD) ||
+ cast(RD)->hasTrivialDestructor();
+ return true;
+}
+
+void ASTContext::findPFPFields(QualType Ty, CharUnits Offset,
+ std::vector &Fields,
+ bool IncludeVBases, bool IsWithinUnion) const {
+ if (auto *AT = getAsConstantArrayType(Ty)) {
+if (auto *ElemDecl = AT->getElementType()->getAsCXXRecordDecl()) {
+ const ASTRecordLayout &ElemRL = getASTRecordLayout(ElemDecl);
+ for (unsigned i = 0; i != AT->getSize(); ++i) {
+findPFPFields(AT->getElementType(), Offset + i * ElemRL.getSize(),
+ Fields, true);
+ }
+}
+ }
+ auto *Decl = Ty->getAsCXXRecordDecl();
+ // isPFPType() is inherited from bases and members (including via arrays), so
+ // we can early exit if it is false.
+ if (!Decl || !Decl->isPFPType())
+return;
+ IsWithinUnion |= Decl->isUnion();
+ const ASTRecordLayout &RL = getASTRecordLayout(Decl);
+ for (FieldDecl *field : Decl->fields()) {
+CharUnits fieldOffset =
+Offset + toCharUnitsFromBits(RL.getFieldOffset(field->getFieldIndex()));
+if (isPFPField(field))
+ Fields.push_back({fieldOffset, field, IsWithinUnion});
+findPFPFields(field->getType(), fieldOffset, Fields, true, IsWithinUnion);
+ }
+ for (auto &Base : Decl->bases()) {
+if (Base.isVirtual())
+ continue;
+CharUnits BaseOffset =
+Offset + RL.getBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+ }
+ if (IncludeVBases) {
+for (auto &Base : Decl->vbases()) {
+ CharUnits BaseOffset =
+ Offset + RL.getVBaseClassOffset(Base.getType()->getAsCXXRecordDecl());
+ findPFPFields(Base.getType(), BaseOffset, Fields, false, IsWithinUnion);
+}
+ }
+}
+
+bool ASTContext::hasPFPFields(QualType Ty) const {
+ std::vector PFPFields;
+ findPFPFields(Ty, CharUnits::Zero(), PFPFields, true);
+ return !PFPFields.empty();
+}
+
+bool ASTContext::isPFPField(const FieldDecl *FD) const {
+ auto *RD = dyn_cast(FD->getParent());
fmayer wrote:
nit: for your consideration
```
if (auto *RD = dyn_cast(FD->getParent()))
return RD->isPFPType() && FD->getType()->isPointerType() &&
!FD->hasAttr();
return false;
```
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
https://github.com/fmayer edited
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [Clang] Add pointer field protection feature. (PR #133538)
@@ -0,0 +1,70 @@
+
+Structure Protection
+
+
+.. contents::
+ :local:
+
+
+Introduction
+
+
+Structure protection is an *experimental* mitigation
fmayer wrote:
actually, nevermind. i wanted to delete this comment and forgot. it doesn't really make it consistently better
https://github.com/llvm/llvm-project/pull/133538
[llvm-branch-commits] [clang] [ExposeObjCDirect] Optimizations (PR #170619)
https://github.com/DataCorrupted updated
https://github.com/llvm/llvm-project/pull/170619
>From bbf2e85a9bc07a52c83d13af5db0d35878484b9a Mon Sep 17 00:00:00 2001
From: Peter Rong
Date: Wed, 3 Dec 2025 22:45:04 -0800
Subject: [PATCH 1/8] [ExposeObjCDirect] Optimizations
In many cases we can infer that class object has been realized
---
clang/lib/CodeGen/CGObjCRuntime.cpp | 65 -
clang/lib/CodeGen/CGObjCRuntime.h | 23 +++---
2 files changed, 82 insertions(+), 6 deletions(-)
diff --git a/clang/lib/CodeGen/CGObjCRuntime.cpp b/clang/lib/CodeGen/CGObjCRuntime.cpp
index a4b4460fdc49c..fd227d9645ac1 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.cpp
+++ b/clang/lib/CodeGen/CGObjCRuntime.cpp
@@ -415,7 +415,70 @@ bool CGObjCRuntime::canMessageReceiverBeNull(
bool CGObjCRuntime::canClassObjectBeUnrealized(
const ObjCInterfaceDecl *CalleeClassDecl, CodeGenFunction &CGF) const {
- // TODO
+ if (!CalleeClassDecl)
+return true;
+
+ // Heuristic 1: +load method on this class
+ // If the class has a +load method, it's realized when the binary is loaded.
+ ASTContext &Ctx = CGM.getContext();
+ const IdentifierInfo *LoadII = &Ctx.Idents.get("load");
+ Selector LoadSel = Ctx.Selectors.getSelector(0, &LoadII);
+
+ // TODO: if one of the children has +load, this class is guaranteed to be
+ // realized as well. We should have a translation unit specific map that
+ // precomputes all classes that are realized, and just do a lookup here.
+ // But we need to measure how expensive it is to create a map like that.
+ if (CalleeClassDecl->lookupClassMethod(LoadSel))
+return false; // This class has +load, so it's already realized
+
+ // Heuristic 2: using Self / Super
+ // If we're currently executing a method of ClassDecl (or a subclass),
+ // then ClassDecl must already be realized.
+ if (const auto *CurMethod =
+ dyn_cast_or_null(CGF.CurCodeDecl)) {
+const ObjCInterfaceDecl *CallerCalssDecl = CurMethod->getClassInterface();
+if (CallerCalssDecl && CalleeClassDecl->isSuperClassOf(CallerCalssDecl))
+ return false;
+ }
+
+ // Heuristic 3: previously realized
+ // Heuristic 3.1: Walk through the current BasicBlock looking for calls that
+ // realize the class. All heuristics in this cluster share the same
+ // implementation pattern.
+ auto *BB = CGF.Builder.GetInsertBlock();
+ if (!BB)
+return true; // No current block, assume unrealized
+
+ llvm::StringRef CalleeClassName = CalleeClassDecl->getName();
+
+ // Heuristic 3.2 / TODO: If realization happened in a dominating block, the
+ // class is realized. This requires dominator tree analysis. There should be
+ // an outer loop `for (BB: DominatingBasicBlocks)`.
+ for (const auto &Inst : *BB) {
+// Check if this is a call instruction
+const auto *Call = llvm::dyn_cast(&Inst);
+if (!Call)
+ continue;
+llvm::Function *CalledFunc = Call->getCalledFunction();
+if (!CalledFunc)
+ continue;
+
+llvm::StringRef FuncNamePtr = CalledFunc->getName();
+// Skip the \01 prefix if present
+if (FuncNamePtr.starts_with("\01"))
+ FuncNamePtr = FuncNamePtr.drop_front(1);
+// Check for instance method calls: "-[ClassName methodName]"
+// or class method calls: "+[ClassName methodName]"
+// Also check for thunks: "-[ClassName methodName]_thunk"
+if ((FuncNamePtr.starts_with("-[") || FuncNamePtr.starts_with("+["))) {
+ FuncNamePtr = FuncNamePtr.drop_front(2);
+ // TODO: if the current class is the super class of the function that's
+ // used, it should've been realized as well
+ if (FuncNamePtr.starts_with(CalleeClassName))
+return false;
+}
+ }
+
// Otherwise, assume it can be unrealized.
return true;
}
diff --git a/clang/lib/CodeGen/CGObjCRuntime.h b/clang/lib/CodeGen/CGObjCRuntime.h
index d3d4745cb77a7..b0cf04fc8553b 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.h
+++ b/clang/lib/CodeGen/CGObjCRuntime.h
@@ -226,7 +226,7 @@ class CGObjCRuntime {
virtual llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) = 0;
-/// Generates precondition checks for direct Objective-C Methods.
+ /// Generates precondition checks for direct Objective-C Methods.
/// This includes [self self] for class methods and nil checks.
virtual void GenerateDirectMethodsPreconditionCheck(
CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
@@ -330,10 +330,23 @@ class CGObjCRuntime {
QualType resultType,
CallArgList &callArgs);
- bool canMessageReceiverBeNull(CodeGenFunction &CGF,
-const ObjCMethodDecl *method, bool isSuper,
-const ObjCInterfaceDecl *classReceiver,
-llvm::Value *receiver);
+ /// Check if the receiver of an ObjC message send can be null.
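
Heuristic 3 in the patch above matches symbol names of the form `-[ClassName methodName]` with a plain prefix test on the class name. A standalone sketch of that check using `std::string_view` follows. `namesMethodOfClass` is a hypothetical helper. It additionally requires a delimiter after the class name so that `Foo` does not match `FooBar`; that tightening is an addition of this sketch and not something the patch does.

```cpp
#include <cassert>
#include <string_view>

// Returns true if Symbol looks like "-[Class sel]" or "+[Class sel]" (with an
// optional leading '\01' and an optional "(Category)") for the given class.
inline bool namesMethodOfClass(std::string_view Symbol,
                               std::string_view ClassName) {
  assert(!ClassName.empty() && "class name must be non-empty");
  if (!Symbol.empty() && Symbol.front() == '\01')
    Symbol.remove_prefix(1);
  if (Symbol.size() < 2 || (Symbol[0] != '-' && Symbol[0] != '+') ||
      Symbol[1] != '[')
    return false;
  Symbol.remove_prefix(2);
  if (Symbol.substr(0, ClassName.size()) != ClassName)
    return false;
  Symbol.remove_prefix(ClassName.size());
  // Require a delimiter so a longer class name with the same prefix is not
  // accepted by accident.
  return !Symbol.empty() && (Symbol.front() == ' ' || Symbol.front() == '(');
}
```

With a bare prefix test, a call to `-[FooBar bar]` would be taken as evidence that `Foo` is realized. Whether that is sound depends on the class hierarchy, which a name comparison alone cannot establish.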
+
[llvm-branch-commits] [clang] [ExposeObjCDirect] Optimizations (PR #170619)
https://github.com/DataCorrupted updated
https://github.com/llvm/llvm-project/pull/170619
>From bbf2e85a9bc07a52c83d13af5db0d35878484b9a Mon Sep 17 00:00:00 2001
From: Peter Rong
Date: Wed, 3 Dec 2025 22:45:04 -0800
Subject: [PATCH 1/3] [ExposeObjCDirect] Optimizations
In many cases we can infer that the class object has been realized.
---
clang/lib/CodeGen/CGObjCRuntime.cpp | 65 -
clang/lib/CodeGen/CGObjCRuntime.h | 23 +++---
2 files changed, 82 insertions(+), 6 deletions(-)
diff --git a/clang/lib/CodeGen/CGObjCRuntime.cpp
b/clang/lib/CodeGen/CGObjCRuntime.cpp
index a4b4460fdc49c..fd227d9645ac1 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.cpp
+++ b/clang/lib/CodeGen/CGObjCRuntime.cpp
@@ -415,7 +415,70 @@ bool CGObjCRuntime::canMessageReceiverBeNull(
bool CGObjCRuntime::canClassObjectBeUnrealized(
const ObjCInterfaceDecl *CalleeClassDecl, CodeGenFunction &CGF) const {
- // TODO
+ if (!CalleeClassDecl)
+return true;
+
+ // Heuristic 1: +load method on this class
+ // If the class has a +load method, it's realized when the binary is loaded.
+ ASTContext &Ctx = CGM.getContext();
+ const IdentifierInfo *LoadII = &Ctx.Idents.get("load");
+ Selector LoadSel = Ctx.Selectors.getSelector(0, &LoadII);
+
+ // TODO: if one of the child classes has +load, this class is guaranteed to be
+ // realized as well. We should have a translation unit specific map that
+ // precomputes all classes that are realized, and just do a lookup here.
+ // But we need to measure how expensive it is to create a map like that.
+ if (CalleeClassDecl->lookupClassMethod(LoadSel))
+return false; // This class has +load, so it's already realized
+
+ // Heuristic 2: using Self / Super
+ // If we're currently executing a method of ClassDecl (or a subclass),
+ // then ClassDecl must already be realized.
+ if (const auto *CurMethod =
+ dyn_cast_or_null(CGF.CurCodeDecl)) {
+const ObjCInterfaceDecl *CallerClassDecl = CurMethod->getClassInterface();
+if (CallerClassDecl && CalleeClassDecl->isSuperClassOf(CallerClassDecl))
+ return false;
+ }
+
+ // Heuristic 3: previously realized
+ // Heuristic 3.1: Walk through the current BasicBlock looking for calls that
+ // realize the class. All heuristics in this cluster share the same
+ // implementation pattern.
+ auto *BB = CGF.Builder.GetInsertBlock();
+ if (!BB)
+return true; // No current block, assume unrealized
+
+ llvm::StringRef CalleeClassName = CalleeClassDecl->getName();
+
+ // Heuristic 3.2 / TODO: If realization happened in a dominating block, the
+ // class is realized. Requires dominator tree analysis. There should be an
+ // outer loop `for (BB: DominatingBasicBlocks)`
+ for (const auto &Inst : *BB) {
+// Check if this is a call instruction
+const auto *Call = llvm::dyn_cast(&Inst);
+if (!Call)
+ continue;
+llvm::Function *CalledFunc = Call->getCalledFunction();
+if (!CalledFunc)
+ continue;
+
+llvm::StringRef FuncNamePtr = CalledFunc->getName();
+// Skip the \01 prefix if present
+if (FuncNamePtr.starts_with("\01"))
+ FuncNamePtr = FuncNamePtr.drop_front(1);
+// Check for instance method calls: "-[ClassName methodName]"
+// or class method calls: "+[ClassName methodName]"
+// Also check for thunks: "-[ClassName methodName]_thunk"
+if ((FuncNamePtr.starts_with("-[") || FuncNamePtr.starts_with("+["))) {
+ FuncNamePtr = FuncNamePtr.drop_front(2);
+ // TODO: if the current class is the super class of the function that's
+ // used, it should've been realized as well
+ if (FuncNamePtr.starts_with(CalleeClassName))
+return false;
+}
+ }
+
// Otherwise, assume it can be unrealized.
return true;
}
diff --git a/clang/lib/CodeGen/CGObjCRuntime.h
b/clang/lib/CodeGen/CGObjCRuntime.h
index d3d4745cb77a7..b0cf04fc8553b 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.h
+++ b/clang/lib/CodeGen/CGObjCRuntime.h
@@ -226,7 +226,7 @@ class CGObjCRuntime {
virtual llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) = 0;
-/// Generates precondition checks for direct Objective-C Methods.
+ /// Generates precondition checks for direct Objective-C Methods.
/// This includes [self self] for class methods and nil checks.
virtual void GenerateDirectMethodsPreconditionCheck(
CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
@@ -330,10 +330,23 @@ class CGObjCRuntime {
QualType resultType,
CallArgList &callArgs);
- bool canMessageReceiverBeNull(CodeGenFunction &CGF,
-const ObjCMethodDecl *method, bool isSuper,
-const ObjCInterfaceDecl *classReceiver,
-llvm::Value *receiver);
+ /// Check if the receiver of an ObjC message send can be null.
+
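Note on Heuristic 3.1 in the patch above: the check compares the callee's symbol name against "-[ClassName" or "+[ClassName" by prefix. The sketch below restates that idea as a standalone function using std::string_view rather than llvm::StringRef. The helper name `symbolNamesMethodOfClass` is hypothetical and not part of the patch. The delimiter check after the class name is an assumption added for illustration (the patch only does a plain prefix match), and it assumes category methods are spelled "-[Class(Category) sel]".

```cpp
#include <string_view>

// Hypothetical helper, not part of the patch. Returns true when Symbol looks
// like an Objective-C method symbol ("-[Class sel]" or "+[Class sel]",
// optionally prefixed with '\01') whose class name is exactly ClassName.
bool symbolNamesMethodOfClass(std::string_view Symbol,
                              std::string_view ClassName) {
  if (ClassName.empty())
    return false;
  // Skip the \01 prefix if present.
  if (!Symbol.empty() && Symbol.front() == '\01')
    Symbol.remove_prefix(1);
  // Expect "-[" (instance method) or "+[" (class method).
  if (Symbol.size() < 2 || (Symbol[0] != '-' && Symbol[0] != '+') ||
      Symbol[1] != '[')
    return false;
  Symbol.remove_prefix(2);
  // The class name must match as a prefix...
  if (Symbol.substr(0, ClassName.size()) != ClassName)
    return false;
  Symbol.remove_prefix(ClassName.size());
  // ...and be followed by a delimiter (assumption: ' ' for a plain method,
  // '(' for a category method), so that "Foo" does not match "-[FooBar baz]".
  return !Symbol.empty() && (Symbol.front() == ' ' || Symbol.front() == '(');
}
```

With a plain prefix match, a call to a method of class FooBar would also count as evidence that class Foo is realized; the delimiter check is one possible way to avoid that.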
[llvm-branch-commits] [clang] [ExposeObjCDirect] Setup helper functions (PR #170617)
https://github.com/DataCorrupted updated
https://github.com/llvm/llvm-project/pull/170617
>From fb969c3e8f50f80f497ab6b1aca23537e04d172b Mon Sep 17 00:00:00 2001
From: Peter Rong
Date: Wed, 3 Dec 2025 22:35:15 -0800
Subject: [PATCH 1/3] [ExposeObjCDirect] Setup helper functions
1. GenerateDirectMethodsPreconditionCheck: Move some functionality into
separate functions.
Those functions will be reused if we move precondition checks into a thunk.
2. Create `DirectMethodInfo`, which will be used to manage the true
implementation and its thunk.
---
clang/lib/CodeGen/CGObjCGNU.cpp | 9 +++
clang/lib/CodeGen/CGObjCMac.cpp | 95 ---
clang/lib/CodeGen/CGObjCRuntime.h | 6 ++
3 files changed, 88 insertions(+), 22 deletions(-)
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index 06643d4bdc211..9c814487860ac 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -600,6 +600,9 @@ class CGObjCGNU : public CGObjCRuntime {
// Map to unify direct method definitions.
llvm::DenseMap
DirectMethodDefinitions;
+ void GenerateDirectMethodsPreconditionCheck(
+ CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) override;
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) override;
@@ -4196,6 +4199,12 @@ llvm::Function *CGObjCGNU::GenerateMethod(const ObjCMethodDecl *OMD,
return Fn;
}
+void CGObjCGNU::GenerateDirectMethodsPreconditionCheck(
+CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+const ObjCContainerDecl *CD) {
+ // GNU runtime doesn't support direct calls at this time
+}
+
void CGObjCGNU::GenerateDirectMethodPrologue(CodeGenFunction &CGF,
llvm::Function *Fn,
const ObjCMethodDecl *OMD,
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index cb5bb403bb53b..3f4b11c634ce4 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -847,9 +847,19 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
/// this translation unit.
llvm::DenseMap MethodDefinitions;
+ /// Information about a direct method definition
+ struct DirectMethodInfo {
+llvm::Function
+*Implementation; // The true implementation (where body is emitted)
+llvm::Function *Thunk; // The nil-check thunk (nullptr if not generated)
+
+DirectMethodInfo(llvm::Function *Impl, llvm::Function *Thunk = nullptr)
+: Implementation(Impl), Thunk(Thunk) {}
+ };
+
/// DirectMethodDefinitions - map of direct methods which have been defined in
/// this translation unit.
- llvm::DenseMap
+ llvm::DenseMap
DirectMethodDefinitions;
/// PropertyNames - uniqued method variable names.
@@ -1053,9 +1063,20 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
GenerateMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD = nullptr) override;
- llvm::Function *GenerateDirectMethod(const ObjCMethodDecl *OMD,
+ DirectMethodInfo &GenerateDirectMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD);
+ /// Generate class realization code: [self self]
+ /// This is used for class methods to ensure the class is initialized.
+ /// Returns the realized class object.
+ llvm::Value *GenerateClassRealization(CodeGenFunction &CGF,
+llvm::Value *classObject,
+const ObjCInterfaceDecl *OID);
+
+ void GenerateDirectMethodsPreconditionCheck(
+ CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) override;
+
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) override;
@@ -3847,7 +3868,9 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
llvm::Function *Method;
if (OMD->isDirectMethod()) {
-Method = GenerateDirectMethod(OMD, CD);
+// Returns DirectMethodInfo& containing both Implementation and Thunk
+DirectMethodInfo &Info = GenerateDirectMethod(OMD, CD);
+Method = Info.Implementation; // Extract implementation for body generation
} else {
auto Name = getSymbolNameForMethod(OMD);
@@ -3863,7 +3886,7 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
return Method;
}
-llvm::Function *
+CGObjCCommonMac::DirectMethodInfo &
CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) {
auto *C
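Note on `DirectMethodInfo` in the patch above: the struct pairs the true implementation with an optional nil-check thunk, and `GenerateDirectMethod` now returns a reference to that record. The sketch below shows the same get-or-create pattern in isolation. Everything except the `DirectMethodInfo` field names is a stand-in chosen for illustration: `Function` replaces llvm::Function, `MethodKey` replaces the method declaration pointer, `DirectMethodTable` and `getOrCreate` are hypothetical names, and std::unordered_map is used in place of the container in the patch.

```cpp
#include <string>
#include <unordered_map>

// Stand-in for llvm::Function; only identity matters in this sketch.
struct Function {
  std::string Name;
};

// Mirrors the struct added by the patch: implementation plus optional thunk.
struct DirectMethodInfo {
  Function *Implementation; // The true implementation (where body is emitted)
  Function *Thunk;          // The nil-check thunk (nullptr if not generated)

  explicit DirectMethodInfo(Function *Impl, Function *Thunk = nullptr)
      : Implementation(Impl), Thunk(Thunk) {}
};

// Stand-in for the method declaration pointer used as the map key.
using MethodKey = const void *;

// Hypothetical wrapper: one record per direct method in a translation unit.
class DirectMethodTable {
  std::unordered_map<MethodKey, DirectMethodInfo> Defs;

public:
  // Returns the existing record for Key, or creates one that points at Impl.
  // Impl is ignored when a record already exists.
  DirectMethodInfo &getOrCreate(MethodKey Key, Function *Impl) {
    auto [It, Inserted] = Defs.try_emplace(Key, Impl);
    (void)Inserted;
    return It->second;
  }
};
```

A caller that only needs the body-emission target reads `Implementation` from the returned record, as `GenerateMethod` does in the patch; a later step can fill in `Thunk` on the same record.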
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Use static functions over the anonymous namespace (PR #170221)
https://github.com/evelez7 approved this pull request.
https://github.com/llvm/llvm-project/pull/170221
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Use static functions over the anonymous namespace (PR #170221)
https://github.com/evelez7 edited
https://github.com/llvm/llvm-project/pull/170221
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add Mustache case to test for DR 131697 (PR #170197)
https://github.com/evelez7 updated
https://github.com/llvm/llvm-project/pull/170197
>From 3589489cbbcfb68fc730e5fac65c91b9dbdee6f6 Mon Sep 17 00:00:00 2001
From: Erick Velez
Date: Fri, 28 Nov 2025 14:04:56 -0800
Subject: [PATCH] [clang-doc] Add Mustache case to test for DR 131697
The test for DR 131697 only requires that clang-doc doesn't crash. There
is no documentation created. However, when using Mustache, clang-doc still
expects certain paths to exist, like the directory where assets are placed. In legacy
HTML, the `docs` directory is still created and assets are placed there
regardless of there being any Infos to document. Mustache didn't do
this, so now we create `docs/json` and `docs/html` even if there is
nothing to document.
---
clang-tools-extra/clang-doc/Generators.cpp| 21 +++
.../test/clang-doc/DR-131697.cpp | 1 +
2 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/clang-tools-extra/clang-doc/Generators.cpp
b/clang-tools-extra/clang-doc/Generators.cpp
index 667e5d5a318f0..5d76901b95833 100644
--- a/clang-tools-extra/clang-doc/Generators.cpp
+++ b/clang-tools-extra/clang-doc/Generators.cpp
@@ -84,27 +84,30 @@ Error MustacheGenerator::generateDocumentation(
return JSONGenerator.takeError();
}
- SmallString<128> JSONPath;
- sys::path::native(RootDir.str() + "/json", JSONPath);
+ SmallString<128> JSONDirPath(RootDir);
+ sys::path::append(JSONDirPath, "json");
+ if (auto EC = sys::fs::create_directories(JSONDirPath))
+return createFileError(JSONDirPath, EC);
+ SmallString<128> DocsDirPath(RootDir);
+ sys::path::append(DocsDirPath, DirName);
+ if (auto EC = sys::fs::create_directories(DocsDirPath))
+return createFileError(DocsDirPath, EC);
{
llvm::TimeTraceScope TS("Iterate JSON files");
std::error_code EC;
-sys::fs::recursive_directory_iterator JSONIter(JSONPath, EC);
+sys::fs::recursive_directory_iterator JSONIter(JSONDirPath, EC);
std::vector JSONFiles;
JSONFiles.reserve(Infos.size());
if (EC)
return createStringError("Failed to create directory iterator.");
-SmallString<128> DocsDirPath(RootDir.str() + '/' + DirName);
-sys::path::native(DocsDirPath);
-if (auto EC = sys::fs::create_directories(DocsDirPath))
- return createFileError(DocsDirPath, EC);
while (JSONIter != sys::fs::recursive_directory_iterator()) {
// create the same directory structure in the docs format dir
if (JSONIter->type() == sys::fs::file_type::directory_file) {
SmallString<128> DocsClonedPath(JSONIter->path());
-sys::path::replace_path_prefix(DocsClonedPath, JSONPath, DocsDirPath);
+sys::path::replace_path_prefix(DocsClonedPath, JSONDirPath,
+ DocsDirPath);
if (auto EC = sys::fs::create_directories(DocsClonedPath)) {
return createFileError(DocsClonedPath, EC);
}
@@ -134,7 +137,7 @@ Error MustacheGenerator::generateDocumentation(
std::error_code FileErr;
SmallString<128> DocsFilePath(JSONIter->path());
- sys::path::replace_path_prefix(DocsFilePath, JSONPath, DocsDirPath);
+ sys::path::replace_path_prefix(DocsFilePath, JSONDirPath, DocsDirPath);
sys::path::replace_extension(DocsFilePath, DirName);
raw_fd_ostream InfoOS(DocsFilePath, FileErr, sys::fs::OF_None);
if (FileErr)
diff --git a/clang-tools-extra/test/clang-doc/DR-131697.cpp
b/clang-tools-extra/test/clang-doc/DR-131697.cpp
index 9025bbf910813..06168e6642f62 100644
--- a/clang-tools-extra/test/clang-doc/DR-131697.cpp
+++ b/clang-tools-extra/test/clang-doc/DR-131697.cpp
@@ -1,6 +1,7 @@
// RUN: rm -rf %t && mkdir -p %t
// RUN: split-file %s %t
// RUN: clang-doc -format=html %t/compile_commands.json %t/main.cpp
+// RUN: clang-doc -format=mustache %t/compile_commands.json %t/main.cpp
//--- main.cpp
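Note on the Generators.cpp change above: the point of the patch is that the `json` and docs output directories are created up front, before iterating, so later steps can rely on them even when there is nothing to document. The sketch below shows that ordering with std::filesystem instead of the llvm::sys::fs calls used in the patch. The function name `createOutputDirs` and its parameters are hypothetical and only serve the illustration.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: create <Root>/json and <Root>/<DirName> before any
// iteration happens, and report the first failure as an error code.
std::error_code createOutputDirs(const fs::path &Root,
                                 const std::string &DirName) {
  std::error_code EC;
  // Succeeds (with EC cleared) if the directory already exists.
  fs::create_directories(Root / "json", EC);
  if (EC)
    return EC;
  fs::create_directories(Root / DirName, EC);
  return EC;
}
```

Building the paths with `/` on `fs::path` plays the same role here as `sys::path::append` in the patch: no manual separator handling is needed.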
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Add Mustache case to assets test (PR #170198)
https://github.com/evelez7 updated
https://github.com/llvm/llvm-project/pull/170198
>From 11036c35f252319203c57abaad9263d767e83afd Mon Sep 17 00:00:00 2001
From: Erick Velez
Date: Wed, 26 Nov 2025 22:28:10 -0800
Subject: [PATCH] [clang-doc] Add Mustache case to assets test
Mustache wasn't tested in the assets lit test, which tests if
user-supplied assets are copied correctly. The Mustache HTML backend
initially failed this test because it expected every asset, which
included Mustache templates, to be supplied. For now, we just expect
either CSS or JS to be supplied and use the default if one of them isn't
given.
We can allow custom templates in the future using the same checks.
---
clang-tools-extra/clang-doc/support/Utils.cpp | 20 +--
.../clang-doc/tool/ClangDocMain.cpp | 5 +++--
clang-tools-extra/test/clang-doc/assets.cpp | 10 ++
3 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/clang-tools-extra/clang-doc/support/Utils.cpp
b/clang-tools-extra/clang-doc/support/Utils.cpp
index 6ed56033738b5..897a7ad0adb79 100644
--- a/clang-tools-extra/clang-doc/support/Utils.cpp
+++ b/clang-tools-extra/clang-doc/support/Utils.cpp
@@ -33,8 +33,20 @@ void getMustacheHtmlFiles(StringRef AssetsPath,
assert(!AssetsPath.empty());
assert(sys::fs::is_directory(AssetsPath));
- SmallString<128> DefaultStylesheet =
- appendPathPosix(AssetsPath, "clang-doc-mustache.css");
+ // TODO: Allow users to override default templates with their own. We would
+ // similarly have to check if a template file already exists in CDCtx.
+ if (CDCtx.UserStylesheets.empty()) {
+SmallString<128> DefaultStylesheet =
+appendPathPosix(AssetsPath, "clang-doc-mustache.css");
+CDCtx.UserStylesheets.insert(CDCtx.UserStylesheets.begin(),
+ DefaultStylesheet.c_str());
+ }
+
+ if (CDCtx.JsScripts.empty()) {
+SmallString<128> IndexJS = appendPathPosix(AssetsPath, "mustache-index.js");
+CDCtx.JsScripts.insert(CDCtx.JsScripts.begin(), IndexJS.c_str());
+ }
+
SmallString<128> NamespaceTemplate =
appendPathPosix(AssetsPath, "namespace-template.mustache");
SmallString<128> ClassTemplate =
@@ -45,11 +57,7 @@ void getMustacheHtmlFiles(StringRef AssetsPath,
appendPathPosix(AssetsPath, "function-template.mustache");
SmallString<128> CommentTemplate =
appendPathPosix(AssetsPath, "comment-template.mustache");
- SmallString<128> IndexJS = appendPathPosix(AssetsPath, "mustache-index.js");
- CDCtx.JsScripts.insert(CDCtx.JsScripts.begin(), IndexJS.c_str());
- CDCtx.UserStylesheets.insert(CDCtx.UserStylesheets.begin(),
- DefaultStylesheet.c_str());
CDCtx.MustacheTemplates.insert(
{"namespace-template", NamespaceTemplate.c_str()});
CDCtx.MustacheTemplates.insert({"class-template", ClassTemplate.c_str()});
diff --git a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
index 62fa6a17df2ee..8de7c8ad6f000 100644
--- a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+++ b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
@@ -151,6 +151,7 @@ static std::string getExecutablePath(const char *Argv0,
void *MainAddr) {
return llvm::sys::fs::getMainExecutable(Argv0, MainAddr);
}
+// TODO: Rename this, since it only gets custom CSS/JS
static llvm::Error getAssetFiles(clang::doc::ClangDocContext &CDCtx) {
using DirIt = llvm::sys::fs::directory_iterator;
std::error_code FileErr;
@@ -221,8 +222,8 @@ static llvm::Error getMustacheHtmlFiles(const char *Argv0,
llvm::outs() << "Asset path supply is not a directory: " << UserAssetPath
<< " falling back to default\n";
if (IsDir) {
-getMustacheHtmlFiles(UserAssetPath, CDCtx);
-return llvm::Error::success();
+if (auto Err = getAssetFiles(CDCtx))
+ return Err;
}
void *MainAddr = (void *)(intptr_t)getExecutablePath;
std::string ClangDocPath = getExecutablePath(Argv0, MainAddr);
diff --git a/clang-tools-extra/test/clang-doc/assets.cpp
b/clang-tools-extra/test/clang-doc/assets.cpp
index c5933e504f6b9..00d0d32213965 100644
--- a/clang-tools-extra/test/clang-doc/assets.cpp
+++ b/clang-tools-extra/test/clang-doc/assets.cpp
@@ -1,9 +1,13 @@
// RUN: rm -rf %t && mkdir %t
// RUN: clang-doc --format=html --output=%t --asset=%S/Inputs/test-assets --executor=standalone %s --base base_dir
+// RUN: clang-doc --format=mustache --output=%t --asset=%S/Inputs/test-assets --executor=standalone %s --base base_dir
// RUN: FileCheck %s -input-file=%t/index.html -check-prefix=INDEX
// RUN: FileCheck %s -input-file=%t/test.css -check-prefix=CSS
// RUN: FileCheck %s -input-file=%t/test.js -check-prefix=JS
+// RUN: FileCheck %s -input-file=%t/html/test.css -check-prefix=MUSTACHE-CSS
+// RUN: FileCheck %s -input-file=%t/html/test.js -check-prefix=MUSTACHE-JS
+
// INDEX:
// INDEX-NEXT:
// INDEX-NEXT: Index
@@ -19,4 +23,10 @@
// CSS-NE
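Note on the Utils.cpp change above: the default stylesheet and script are now added only when the user supplied none of that kind. The sketch below isolates that fallback rule. `AssetContext` is a stand-in for the relevant part of ClangDocContext, `addDefaultAssetsIfMissing` is a hypothetical name, and plain string concatenation replaces `appendPathPosix`; only the two default file names are taken from the patch.

```cpp
#include <string>
#include <vector>

// Stand-in for the two asset lists kept in the clang-doc context.
struct AssetContext {
  std::vector<std::string> UserStylesheets;
  std::vector<std::string> JsScripts;
};

// Hypothetical helper: fall back to the bundled CSS/JS only for the asset
// kinds the user did not supply.
void addDefaultAssetsIfMissing(AssetContext &Ctx,
                               const std::string &AssetsPath) {
  if (Ctx.UserStylesheets.empty())
    Ctx.UserStylesheets.push_back(AssetsPath + "/clang-doc-mustache.css");
  if (Ctx.JsScripts.empty())
    Ctx.JsScripts.push_back(AssetsPath + "/mustache-index.js");
}
```

A user who passes only a custom CSS file keeps it and still gets the default script, which matches the behavior described in the commit message.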
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Fix enum partial indentation (PR #170196)
https://github.com/evelez7 updated
https://github.com/llvm/llvm-project/pull/170196
>From fdf94797126631c52cbd71d8ae3d32884d09b38d Mon Sep 17 00:00:00 2001
From: Erick Velez
Date: Fri, 21 Nov 2025 19:39:37 -0800
Subject: [PATCH] [clang-doc] Fix enum partial indentation
---
.../clang-doc/assets/enum-template.mustache | 20 ++-
clang-tools-extra/test/clang-doc/enum.cpp | 128 --
.../test/clang-doc/mustache-index.cpp | 20 ++-
3 files changed, 72 insertions(+), 96 deletions(-)
diff --git a/clang-tools-extra/clang-doc/assets/enum-template.mustache
b/clang-tools-extra/clang-doc/assets/enum-template.mustache
index 53da4669d824b..ec42df99a7f4b 100644
--- a/clang-tools-extra/clang-doc/assets/enum-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/enum-template.mustache
@@ -7,22 +7,18 @@
}}
-
-
-enum {{Name}}
-
-
+enum
{{Name}}
{{! Enum Values }}
-
-Name
-Value
-{{#HasComment}}
+
+Name
+Value
+{{#HasComment}}
Comment
-{{/HasComment}}
-
+{{/HasComment}}
+
{{#Members}}
{{Name}}
@@ -34,7 +30,7 @@ enum {{Name}}
{{ValueExpr}}
{{/Value}}
{{#EnumValueComments}}
-{{>Comments}}
+{{>Comments}}
{{/EnumValueComments}}
{{/Members}}
diff --git a/clang-tools-extra/test/clang-doc/enum.cpp
b/clang-tools-extra/test/clang-doc/enum.cpp
index 159d61ab5a3b7..3ba834e0b2e70 100644
--- a/clang-tools-extra/test/clang-doc/enum.cpp
+++ b/clang-tools-extra/test/clang-doc/enum.cpp
@@ -55,11 +55,7 @@ enum Color {
// HTML-INDEX: Comment 3
// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: enum Color
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: enum
Color
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
@@ -67,18 +63,18 @@ enum Color {
// MUSTACHE-INDEX: Name
// MUSTACHE-INDEX: Value
// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Red
-// MUSTACHE-INDEX: 0
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Green
-// MUSTACHE-INDEX: 1
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Blue
-// MUSTACHE-INDEX: 2
-// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Red
+// MUSTACHE-INDEX: 0
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Green
+// MUSTACHE-INDEX: 1
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Blue
+// MUSTACHE-INDEX: 2
+// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
@@ -117,11 +113,7 @@ enum class Shapes {
// COM: FIXME: Serialize "enum class" in template
// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: enum Shapes
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: enum
Shapes
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
@@ -129,18 +121,18 @@ enum class Shapes {
// MUSTACHE-INDEX: Name
// MUSTACHE-INDEX: Value
// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Circle
-// MUSTACHE-INDEX: 0
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Rectangle
-// MUSTACHE-INDEX: 1
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX:
-// MUSTACHE-INDEX: Triangle
-// MUSTACHE-INDEX: 2
-// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Circle
+// MUSTACHE-INDEX: 0
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Rectangle
+// MUSTACHE-INDEX: 1
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX:
+// MUSTACHE-INDEX: Triangle
+// MUSTACHE-INDEX: 2
+// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
// MUSTACHE-INDEX:
@@ -240,11 +232,7 @@ enum Car {
// HTML-VEHICLES: Comment 4
// MUSTACHE-VEHICLES:
-// MUSTACHE-VEHICLES:
[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)
https://github.com/kparzysz updated
https://github.com/llvm/llvm-project/pull/170735
>From 9a2d3dca08ab237e7e949fd5642c96cf0fba89b8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 2 Dec 2025 14:59:34 -0600
Subject: [PATCH 1/2] [flang][OpenMP] Generalize checks of loop construct
structure
For an OpenMP loop construct, count how many loops will effectively be
contained in its associated block. For constructs that are loop-nest
associated this number should be 1. Report cases where this number is
different.
Take into account that the block associated with a loop construct can
contain compiler directives.
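
The counting rule described above can be sketched in isolation. This is a minimal model, not the flang implementation: `Kind`, `Construct`, and `countGeneratedLoops` are hypothetical stand-ins for the parse-tree types, and summing the per-construct counts over a block is an assumption, since the block-level overload is cut off in the diff below.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for parse-tree nodes (not flang types).
enum class Kind { DoLoop, FullUnroll, Fuse, OtherLoopConstruct };

struct Construct {
  Kind kind;
  std::optional<int> loopRangeCount; // LOOPRANGE count on FUSE, if present
  std::vector<Construct> body;       // associated block of the construct
};

std::optional<int> countGeneratedLoops(const std::vector<Construct> &block);

// Number of loops a single construct contributes; nullopt means "unknown".
std::optional<int> countGeneratedLoops(const Construct &c) {
  switch (c.kind) {
  case Kind::DoLoop:
    return 1;
  case Kind::FullUnroll:
    return std::nullopt; // a fully unrolled loop leaves no countable loop
  case Kind::Fuse: {
    if (!c.loopRangeCount || *c.loopRangeCount <= 0)
      return std::nullopt;
    std::optional<int> nested = countGeneratedLoops(c.body);
    if (!nested)
      return std::nullopt;
    // `count` loops collapse into one; the remaining loops are kept.
    return 1 + *nested - *c.loopRangeCount;
  }
  case Kind::OtherLoopConstruct:
    return 1;
  }
  return std::nullopt;
}

// Assumption: a block contributes the sum of its constructs, and any
// unknown count makes the whole block unknown.
std::optional<int> countGeneratedLoops(const std::vector<Construct> &block) {
  int total = 0;
  for (const Construct &c : block) {
    std::optional<int> n = countGeneratedLoops(c);
    if (!n)
      return std::nullopt;
    total += *n;
  }
  return total;
}
```

Under this model a loop-nest-associated construct would be accepted when its block counts to exactly 1 and diagnosed otherwise.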
---
flang/lib/Semantics/check-omp-loop.cpp| 201 +++---
flang/lib/Semantics/check-omp-structure.h | 3 +-
flang/test/Parser/OpenMP/tile-fail.f90| 8 +-
flang/test/Semantics/OpenMP/do21.f90 | 10 +-
.../Semantics/OpenMP/loop-association.f90 | 6 +-
.../OpenMP/loop-transformation-clauses01.f90 | 16 +-
.../loop-transformation-construct01.f90 | 4 +-
.../loop-transformation-construct02.f90 | 8 +-
.../loop-transformation-construct04.f90 | 4 +-
9 files changed, 156 insertions(+), 104 deletions(-)
diff --git a/flang/lib/Semantics/check-omp-loop.cpp b/flang/lib/Semantics/check-omp-loop.cpp
index fc4b9222d91b3..6414f0028e008 100644
--- a/flang/lib/Semantics/check-omp-loop.cpp
+++ b/flang/lib/Semantics/check-omp-loop.cpp
@@ -37,6 +37,14 @@
#include
#include
+namespace Fortran::semantics {
+static bool IsLoopTransforming(llvm::omp::Directive dir);
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x);
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc);
+static std::optional CountGeneratedLoops(const parser::Block &block);
+} // namespace Fortran::semantics
+
namespace {
using namespace Fortran;
@@ -263,22 +271,19 @@ static bool IsLoopTransforming(llvm::omp::Directive dir) {
}
void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct &x,
-const parser::Block &body, size_t &nestedCount) {
+const parser::Block &body) {
for (auto &stmt : body) {
if (auto *dir{parser::Unwrap(stmt)}) {
context_.Say(dir->source,
"Compiler directives are not allowed inside OpenMP loop constructs"_warn_en_US);
-} else if (parser::Unwrap(stmt)) {
- ++nestedCount;
} else if (auto *omp{parser::Unwrap(stmt)}) {
if (!IsLoopTransforming(omp->BeginDir().DirName().v)) {
context_.Say(omp->source,
"Only loop-transforming OpenMP constructs are allowed inside OpenMP loop constructs"_err_en_US);
}
- ++nestedCount;
} else if (auto *block{parser::Unwrap(stmt)}) {
- CheckNestedBlock(x, std::get(block->t), nestedCount);
-} else {
+ CheckNestedBlock(x, std::get(block->t));
+} else if (!parser::Unwrap(stmt)) {
parser::CharBlock source{parser::GetSource(stmt).value_or(x.source)};
context_.Say(source,
"OpenMP loop construct can only contain DO loops or loop-nest-generating OpenMP constructs"_err_en_US);
@@ -286,16 +291,96 @@ void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct &x,
}
}
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x) {
+ const parser::OmpDirectiveSpecification &beginSpec{x.BeginDir()};
+
+ if (beginSpec.DirName().v == llvm::omp::Directive::OMPD_unroll) {
+return llvm::none_of(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+ return c.Id() == llvm::omp::Clause::OMPC_partial;
+});
+ }
+ return false;
+}
+
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc) {
+ if (parser::Unwrap(epc)) {
+return 1;
+ }
+
+ auto &omp{DEREF(parser::Unwrap(epc))};
+ const parser::OmpDirectiveSpecification &beginSpec{omp.BeginDir()};
+ llvm::omp::Directive dir{beginSpec.DirName().v};
+
+ // TODO: Handle split, apply.
+ if (IsFullUnroll(omp)) {
+return std::nullopt;
+ }
+ if (dir == llvm::omp::Directive::OMPD_fuse) {
+auto rangeAt{
+llvm::find_if(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+ return c.Id() == llvm::omp::Clause::OMPC_looprange;
+})};
+if (rangeAt == beginSpec.Clauses().v.end()) {
+ return std::nullopt;
+}
+
+auto *loopRange{parser::Unwrap(*rangeAt)};
+std::optional count{GetIntValue(std::get<1>(loopRange->t))};
+if (!count || *count <= 0) {
+ return std::nullopt;
+}
+if (auto nestedCount{CountGeneratedLoops(std::get(omp.t))}) {
+ return 1 + *nestedCount - static_cast(*count);
+} else {
+ return std::nullopt;
+}
+ }
+
+ // For every other loop construct return 1.
+ return 1;
+}
+
+static std::optional CountGeneratedLoops(const parser::Block &block) {
+ // Count the number of loops in the associated block. If there are any
+ // malformed constructs in there, getting the number may be meaningless.
+ // These issue
[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)
kparzysz wrote:
PR stack:
1. https://github.com/llvm/llvm-project/pull/170734
2. https://github.com/llvm/llvm-project/pull/170735 (this PR)
https://github.com/llvm/llvm-project/pull/170735
[llvm-branch-commits] [llvm] AMDGPU: Avoid crashing on statepoint-like pseudoinstructions (PR #170657)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/170657
>From 5c91afe00979438edfb664dfa69f22fff71c655d Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Thu, 4 Dec 2025 12:42:14 +0100
Subject: [PATCH] AMDGPU: Avoid crashing on statepoint-like pseudoinstructions
At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.
---
llvm/include/llvm/CodeGen/TargetInstrInfo.h | 13 +++-
.../CodeGen/SelectionDAG/SelectionDAGISel.cpp | 11
.../SelectionDAG/StatepointLowering.cpp | 2 +
llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp | 10 +++
.../AMDGPU/AMDGPUResourceUsageAnalysis.cpp| 5 +-
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 12
llvm/lib/Target/AMDGPU/SIISelLowering.h | 2 +
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp | 2 +-
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp| 8 +++
llvm/lib/Target/AMDGPU/SIInstrInfo.h | 2 +
llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll | 16 +
.../CodeGen/AMDGPU/statepoint-asm-printer.mir | 40
.../AMDGPU/statepoint-insert-waitcnts.mir | 64 +++
13 files changed, 184 insertions(+), 3 deletions(-)
create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll
create mode 100644 llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir
create mode 100644 llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 18142c2c0adf3..bdd9fee795e08 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2350,7 +2350,18 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
/// Returns the callee operand from the given \p MI.
virtual const MachineOperand &getCalleeOperand(const MachineInstr &MI) const {
-return MI.getOperand(0);
+assert(MI.isCall());
+
+switch (MI.getOpcode()) {
+case TargetOpcode::STATEPOINT:
+case TargetOpcode::STACKMAP:
+case TargetOpcode::PATCHPOINT:
+ return MI.getOperand(3);
+default:
+ return MI.getOperand(0);
+}
+
+llvm_unreachable("impossible call instruction");
}
/// Return the uniformity behavior of the given instruction.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index dd8f18d3b8a6a..7998da0ea06eb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -331,6 +331,17 @@ namespace llvm {
MachineBasicBlock *
TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
MachineBasicBlock *MBB) const {
+ switch (MI.getOpcode()) {
+ case TargetOpcode::STATEPOINT:
+// As an implementation detail, STATEPOINT shares the STACKMAP format at
+// this point in the process. We diverge later.
+ case TargetOpcode::STACKMAP:
+ case TargetOpcode::PATCHPOINT:
+return emitPatchPoint(MI, MBB);
+ default:
+break;
+ }
+
#ifndef NDEBUG
dbgs() << "If a target marks an instruction with "
"'usesCustomInserter', it must implement "
diff --git a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
index 46a5e44374e1c..5b8cd343557fa 100644
--- a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
@@ -1145,6 +1145,8 @@ void SelectionDAGBuilder::LowerCallSiteWithDeoptBundleImpl(
const CallBase *Call, SDValue Callee, const BasicBlock *EHPadBB,
bool VarArgDisallowed, bool ForceVoidReturnTy) {
StatepointLoweringInfo SI(DAG);
+ SI.CLI.CB = Call;
+
unsigned ArgBeginIndex = Call->arg_begin() - Call->op_begin();
populateCallLoweringInfo(
SI.CLI, Call, ArgBeginIndex, Call->arg_size(), Callee,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index bf9b4297bd435..99c1ab8d379d5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -406,6 +406,16 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
return;
}
+unsigned Opc = MI->getOpcode();
+if (LLVM_UNLIKELY(Opc == TargetOpcode::STATEPOINT ||
+ Opc == TargetOpcode::STACKMAP ||
+ Opc == TargetOpcode::PATCHPOINT)) {
+ LLVMContext &Ctx = MI->getMF()->getFunction().getContext();
+ Ctx.emitError("unhandled statepoint-like instruction");
+ OutStreamer->emitRawComment("unsupported statepoint/stackmap/patchpoint");
+ return;
+}
+
if
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Reorder struct fields to have less padding (PR #170222)
https://github.com/evelez7 edited
https://github.com/llvm/llvm-project/pull/170222
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Reorder struct fields to have less padding (PR #170222)
https://github.com/evelez7 approved this pull request.
https://github.com/llvm/llvm-project/pull/170222
[llvm-branch-commits] [clang] [ExposeObjCDirect] Setup helper functions (PR #170617)
https://github.com/DataCorrupted updated
https://github.com/llvm/llvm-project/pull/170617
>From fb969c3e8f50f80f497ab6b1aca23537e04d172b Mon Sep 17 00:00:00 2001
From: Peter Rong
Date: Wed, 3 Dec 2025 22:35:15 -0800
Subject: [PATCH 1/2] [ExposeObjCDirect] Setup helper functions
1. GenerateDirectMethodsPreconditionCheck: Move some functionality into
separate functions.
Those functions will be reused if we move precondition checks into a thunk.
2. Create `DirectMethodInfo`, which will be used to manage the true
implementation and its thunk.
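
The bookkeeping described in item 2 can be sketched as follows. This is an illustrative model only: `Function`, `MethodKey`, and `getOrCreate` are hypothetical stand-ins (for `llvm::Function`, the `ObjCMethodDecl` key, and the lookup inside `GenerateDirectMethod`), and the get-or-create behaviour is an assumption because the function body is truncated in the diff below.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for llvm::Function.
struct Function {
  std::string name;
};

// Mirrors the shape of the struct added by the patch: the true
// implementation plus an optional nil-check thunk.
struct DirectMethodInfo {
  Function *implementation;
  Function *thunk; // nullptr until a thunk is generated

  explicit DirectMethodInfo(Function *impl, Function *thunk = nullptr)
      : implementation(impl), thunk(thunk) {}
};

using MethodKey = std::string; // stand-in for const ObjCMethodDecl *
using DirectMethodMap = std::unordered_map<MethodKey, DirectMethodInfo>;

// Return the entry for `key`, creating it on first use. Returning a
// reference lets a later step attach the thunk to the same entry.
DirectMethodInfo &getOrCreate(DirectMethodMap &defs, const MethodKey &key,
                              Function *impl) {
  assert(impl != nullptr && "a direct method needs an implementation");
  auto it = defs.find(key);
  if (it == defs.end())
    it = defs.emplace(key, DirectMethodInfo(impl)).first;
  return it->second;
}
```

Callers that only need the function body would read `implementation` from the returned entry, as `GenerateMethod` does with `Info.Implementation` in the patch.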
---
clang/lib/CodeGen/CGObjCGNU.cpp | 9 +++
clang/lib/CodeGen/CGObjCMac.cpp | 95 ---
clang/lib/CodeGen/CGObjCRuntime.h | 6 ++
3 files changed, 88 insertions(+), 22 deletions(-)
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index 06643d4bdc211..9c814487860ac 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -600,6 +600,9 @@ class CGObjCGNU : public CGObjCRuntime {
// Map to unify direct method definitions.
llvm::DenseMap
DirectMethodDefinitions;
+ void GenerateDirectMethodsPreconditionCheck(
+ CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) override;
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) override;
@@ -4196,6 +4199,12 @@ llvm::Function *CGObjCGNU::GenerateMethod(const ObjCMethodDecl *OMD,
return Fn;
}
+void CGObjCGNU::GenerateDirectMethodsPreconditionCheck(
+CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+const ObjCContainerDecl *CD) {
+ // GNU runtime doesn't support direct calls at this time
+}
+
void CGObjCGNU::GenerateDirectMethodPrologue(CodeGenFunction &CGF,
llvm::Function *Fn,
const ObjCMethodDecl *OMD,
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index cb5bb403bb53b..3f4b11c634ce4 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -847,9 +847,19 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
/// this translation unit.
llvm::DenseMap MethodDefinitions;
+ /// Information about a direct method definition
+ struct DirectMethodInfo {
+llvm::Function
+*Implementation; // The true implementation (where body is emitted)
+llvm::Function *Thunk; // The nil-check thunk (nullptr if not generated)
+
+DirectMethodInfo(llvm::Function *Impl, llvm::Function *Thunk = nullptr)
+: Implementation(Impl), Thunk(Thunk) {}
+ };
+
/// DirectMethodDefinitions - map of direct methods which have been defined in
/// this translation unit.
- llvm::DenseMap
+ llvm::DenseMap
DirectMethodDefinitions;
/// PropertyNames - uniqued method variable names.
@@ -1053,9 +1063,20 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
GenerateMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD = nullptr) override;
- llvm::Function *GenerateDirectMethod(const ObjCMethodDecl *OMD,
+ DirectMethodInfo &GenerateDirectMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD);
+ /// Generate class realization code: [self self]
+ /// This is used for class methods to ensure the class is initialized.
+ /// Returns the realized class object.
+ llvm::Value *GenerateClassRealization(CodeGenFunction &CGF,
+llvm::Value *classObject,
+const ObjCInterfaceDecl *OID);
+
+ void GenerateDirectMethodsPreconditionCheck(
+ CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) override;
+
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) override;
@@ -3847,7 +3868,9 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
llvm::Function *Method;
if (OMD->isDirectMethod()) {
-Method = GenerateDirectMethod(OMD, CD);
+// Returns DirectMethodInfo& containing both Implementation and Thunk
+DirectMethodInfo &Info = GenerateDirectMethod(OMD, CD);
+Method = Info.Implementation; // Extract implementation for body generation
} else {
auto Name = getSymbolNameForMethod(OMD);
@@ -3863,7 +3886,7 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
return Method;
}
-llvm::Function *
+CGObjCCommonMac::DirectMethodInfo &
CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) {
auto *C
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Replace HTML generation with Mustache backend (PR #170199)
https://github.com/evelez7 edited
https://github.com/llvm/llvm-project/pull/170199
[llvm-branch-commits] [llvm] release/21.x: [HEXAGON] [MachinePipeliner] Fix the DAG in case of dependent phis. (#135925) (PR #170749)
llvmbot wrote:
@llvm/pr-subscribers-backend-hexagon
Author: None (llvmbot)
Changes
Backport 78ee4a5
Requested by: @alexrp
---
Full diff: https://github.com/llvm/llvm-project/pull/170749.diff
3 Files Affected:
- (modified) llvm/lib/CodeGen/MachinePipeliner.cpp (+13-2)
- (added) llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll (+96)
- (modified) llvm/test/CodeGen/Hexagon/swp-epilog-phi9.ll (+4-6)
``diff
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 0e7cb0c980d40..1035188344c33 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -1217,8 +1217,19 @@ void SwingSchedulerDAG::updatePhiDependences() {
HasPhiDef = Reg;
// Add a chain edge to a dependent Phi that isn't an existing
// predecessor.
- if (SU->NodeNum < I.NodeNum && !I.isPred(SU))
-I.addPred(SDep(SU, SDep::Barrier));
+
+ // %3:intregs = PHI %21:intregs, %bb.6, %7:intregs, %bb.1 - SU0
+ // %7:intregs = PHI %21:intregs, %bb.6, %13:intregs, %bb.1 - SU1
+ // %27:intregs = A2_zxtb %3:intregs - SU2
+ // %13:intregs = C2_muxri %45:predregs, 0, %46:intreg
+ // If we have dependent phis, SU0 should be the successor of SU1
+ // not the other way around. (it used to be SU1 is the successor
+ // of SU0). In some cases, SU0 is scheduled earlier than SU1
+ // resulting in bad IR as we do not have a value that can be used
+ // by SU2.
+
+ if (SU->NodeNum < I.NodeNum && !SU->isPred(&I))
+SU->addPred(SDep(&I, SDep::Barrier));
}
}
}
diff --git a/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll b/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll
new file mode 100644
index 0..6d324029966d7
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll
@@ -0,0 +1,96 @@
+;RUN: llc -march=hexagon -mv71t -O2 < %s -o - 2>&1 > /dev/null
+
+; Validate that we do not crash while running this test.
+;%3:intregs = PHI %21:intregs, %bb.6, %7:intregs, %bb.1 - SU0
+;%7:intregs = PHI %21:intregs, %bb.6, %13:intregs, %bb.1 - SU1
+;%27:intregs = A2_zxtb %3:intregs - SU2
+;%13:intregs = C2_muxri %45:predregs, 0, %46:intreg
+;If we have dependent phis, SU0 should be the successor of SU1 not
+;the other way around. (it used to be SU1 is the successor of SU0).
+;In some cases, SU0 is scheduled earlier than SU1 resulting in bad
+;IR as we do not have a value that can be used by SU2.
+
+@global = common dso_local local_unnamed_addr global ptr null, align 4
[email protected] = common dso_local local_unnamed_addr global i32 0, align 4
[email protected] = common dso_local local_unnamed_addr global i16 0, align 2
[email protected] = common dso_local local_unnamed_addr global i16 0, align 2
[email protected] = common dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree norecurse nosync nounwind
+define dso_local i16 @wombat(i8 zeroext %arg, i16 %dummy) local_unnamed_addr #0 {
+bb:
+ %load = load ptr, ptr @global, align 4
+ %load1 = load i32, ptr @global.1, align 4
+ %add2 = add nsw i32 %load1, -1
+ store i32 %add2, ptr @global.1, align 4
+ %icmp = icmp eq i32 %load1, 0
+ br i1 %icmp, label %bb36, label %bb3
+
+bb3: ; preds = %bb3, %bb
+ %phi = phi i32 [ %add30, %bb3 ], [ %add2, %bb ]
+ %phi4 = phi i8 [ %phi8, %bb3 ], [ %arg, %bb ]
+ %phi5 = phi i16 [ %select23, %bb3 ], [ %dummy, %bb ]
+ %phi6 = phi i16 [ %select26, %bb3 ], [ %dummy, %bb ]
+ %phi7 = phi i16 [ %select, %bb3 ], [ %dummy, %bb ]
+ %phi8 = phi i8 [ %select29, %bb3 ], [ %arg, %bb ]
+ %zext = zext i8 %phi4 to i32
+ %getelementptr = getelementptr inbounds i32, ptr %load, i32 %zext
+ %getelementptr9 = getelementptr inbounds i32, ptr %getelementptr, i32 2
+ %ptrtoint = ptrtoint ptr %getelementptr9 to i32
+ %trunc = trunc i32 %ptrtoint to i16
+ %sext10 = sext i16 %phi7 to i32
+ %shl11 = shl i32 %ptrtoint, 16
+ %ashr = ashr exact i32 %shl11, 16
+ %icmp12 = icmp slt i32 %ashr, %sext10
+ %select = select i1 %icmp12, i16 %trunc, i16 %phi7
+ %getelementptr13 = getelementptr inbounds i32, ptr %getelementptr, i32 3
+ %load14 = load i32, ptr %getelementptr13, align 4
+ %shl = shl i32 %load14, 8
+ %getelementptr15 = getelementptr inbounds i32, ptr %getelementptr, i32 1
+ %load16 = load i32, ptr %getelementptr15, align 4
+ %shl17 = shl i32 %load16, 16
+ %ashr18 = ashr exact i32 %shl17, 16
+ %add = add nsw i32 %ashr18, %load14
+ %lshr = lshr i32 %add, 8
+ %or = or i32 %lshr, %shl
+ %sub = sub i32 %or, %load16
+ %trunc19 = trunc i32 %sub to i16
+ %sext = sext i16 %phi5 to i32
+ %shl20 = shl i32 %sub, 16
+ %ashr21 = ashr exact i32 %shl20, 16
+ %icmp22 = icmp sgt i32 %ashr21, %sext
+ %select23 = select i1 %icmp22, i16 %trunc19, i16 %phi5
+ %sext24 = sext i16 %phi6 to i32
+ %icmp25 = icmp sl
[llvm-branch-commits] [llvm] release/21.x: [HEXAGON] [MachinePipeliner] Fix the DAG in case of dependent phis. (#135925) (PR #170749)
https://github.com/llvmbot created
https://github.com/llvm/llvm-project/pull/170749
Backport 78ee4a5
Requested by: @alexrp
>From fcfa910e046db3bffada6515e0042f9c56dd3bdd Mon Sep 17 00:00:00 2001
From: Abinaya Saravanan
Date: Fri, 5 Dec 2025 00:57:54 +0530
Subject: [PATCH] [HEXAGON] [MachinePipeliner] Fix the DAG in case of dependent
phis. (#135925)
This change corrects the scheduling relationship between dependent PHI
nodes. Previously, the implementation treated SU1 as the successor of
SU0. In reality, SU0 should depend on SU1, not the other way around.
The incorrect ordering could cause SU0 to be scheduled before SU1, which
leads to invalid IR: subsequent instructions may reference values that
have not yet been defined.
%3:intregs = PHI %21:intregs, %bb.6, %7:intregs, %bb.1 - SU0
%7:intregs = PHI %21:intregs, %bb.6, %13:intregs, %bb.1 - SU1
%27:intregs = A2_zxtb %3:intregs - SU2
%13:intregs = C2_muxri %45:predregs, 0, %46:intreg
Co-Authored by: Sumanth Gundapaneni
(cherry picked from commit 78ee4a59764720875f51ccfb8086c656e745bec6)
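
The effect of reversing the edge can be illustrated with a toy dependence graph. This is a sketch under stated assumptions, not the MachinePipeliner code: `PredLists`, `addPred`, and `scheduleOrder` are hypothetical names, nodes are plain indices (0 for SU0, 1 for SU1, 2 for SU2), and a simple topological order with lowest-index tie-breaking stands in for the real scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// preds[n] lists the nodes that must be scheduled before node n.
using PredLists = std::vector<std::vector<std::size_t>>;

// Record that `node` depends on `pred`, unless the edge already exists.
void addPred(PredLists &preds, std::size_t node, std::size_t pred) {
  assert(node < preds.size() && pred < preds.size());
  std::vector<std::size_t> &list = preds[node];
  if (std::find(list.begin(), list.end(), pred) == list.end())
    list.push_back(pred);
}

// Topological order; among ready nodes the lowest index is picked first.
// If the graph has a cycle, the nodes on it are left out of the result.
std::vector<std::size_t> scheduleOrder(const PredLists &preds) {
  std::vector<std::size_t> order;
  std::vector<bool> done(preds.size(), false);
  bool progressed = true;
  while (progressed && order.size() < preds.size()) {
    progressed = false;
    for (std::size_t i = 0; i < preds.size(); ++i) {
      if (done[i])
        continue;
      bool ready = std::all_of(preds[i].begin(), preds[i].end(),
                               [&](std::size_t p) -> bool { return done[p]; });
      if (ready) {
        done[i] = true;
        order.push_back(i);
        progressed = true;
        break; // restart so the lowest ready index always wins
      }
    }
  }
  return order;
}
```

With the old edge (`addPred(dag, 1, 0)`) this model orders the nodes 0, 1, 2; with the corrected edge (`addPred(dag, 0, 1)`) it orders them 1, 0, 2, so SU1 comes before SU0 as the commit message requires.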
---
llvm/lib/CodeGen/MachinePipeliner.cpp | 15 ++-
.../CodeGen/Hexagon/swp-dependent-phis.ll | 96 +++
llvm/test/CodeGen/Hexagon/swp-epilog-phi9.ll | 10 +-
3 files changed, 113 insertions(+), 8 deletions(-)
create mode 100644 llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 0e7cb0c980d40..1035188344c33 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -1217,8 +1217,19 @@ void SwingSchedulerDAG::updatePhiDependences() {
HasPhiDef = Reg;
// Add a chain edge to a dependent Phi that isn't an existing
// predecessor.
- if (SU->NodeNum < I.NodeNum && !I.isPred(SU))
-I.addPred(SDep(SU, SDep::Barrier));
+
+ // %3:intregs = PHI %21:intregs, %bb.6, %7:intregs, %bb.1 - SU0
+ // %7:intregs = PHI %21:intregs, %bb.6, %13:intregs, %bb.1 - SU1
+ // %27:intregs = A2_zxtb %3:intregs - SU2
+ // %13:intregs = C2_muxri %45:predregs, 0, %46:intreg
+ // If we have dependent phis, SU0 should be the successor of SU1
+ // not the other way around. (it used to be SU1 is the successor
+ // of SU0). In some cases, SU0 is scheduled earlier than SU1
+ // resulting in bad IR as we do not have a value that can be used
+ // by SU2.
+
+ if (SU->NodeNum < I.NodeNum && !SU->isPred(&I))
+SU->addPred(SDep(&I, SDep::Barrier));
}
}
}
diff --git a/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll b/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll
new file mode 100644
index 0..6d324029966d7
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/swp-dependent-phis.ll
@@ -0,0 +1,96 @@
+;RUN: llc -march=hexagon -mv71t -O2 < %s -o - 2>&1 > /dev/null
+
+; Validate that we do not crash while running this test.
+;%3:intregs = PHI %21:intregs, %bb.6, %7:intregs, %bb.1 - SU0
+;%7:intregs = PHI %21:intregs, %bb.6, %13:intregs, %bb.1 - SU1
+;%27:intregs = A2_zxtb %3:intregs - SU2
+;%13:intregs = C2_muxri %45:predregs, 0, %46:intreg
+;If we have dependent phis, SU0 should be the successor of SU1 not
+;the other way around. (it used to be SU1 is the successor of SU0).
+;In some cases, SU0 is scheduled earlier than SU1 resulting in bad
+;IR as we do not have a value that can be used by SU2.
+
+@global = common dso_local local_unnamed_addr global ptr null, align 4
[email protected] = common dso_local local_unnamed_addr global i32 0, align 4
[email protected] = common dso_local local_unnamed_addr global i16 0, align 2
[email protected] = common dso_local local_unnamed_addr global i16 0, align 2
[email protected] = common dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree norecurse nosync nounwind
+define dso_local i16 @wombat(i8 zeroext %arg, i16 %dummy) local_unnamed_addr #0 {
+bb:
+ %load = load ptr, ptr @global, align 4
+ %load1 = load i32, ptr @global.1, align 4
+ %add2 = add nsw i32 %load1, -1
+ store i32 %add2, ptr @global.1, align 4
+ %icmp = icmp eq i32 %load1, 0
+ br i1 %icmp, label %bb36, label %bb3
+
+bb3: ; preds = %bb3, %bb
+ %phi = phi i32 [ %add30, %bb3 ], [ %add2, %bb ]
+ %phi4 = phi i8 [ %phi8, %bb3 ], [ %arg, %bb ]
+ %phi5 = phi i16 [ %select23, %bb3 ], [ %dummy, %bb ]
+ %phi6 = phi i16 [ %select26, %bb3 ], [ %dummy, %bb ]
+ %phi7 = phi i16 [ %select, %bb3 ], [ %dummy, %bb ]
+ %phi8 = phi i8 [ %select29, %bb3 ], [ %arg, %bb ]
+ %zext = zext i8 %phi4 to i32
+ %getelementptr = getelementptr inbounds i32, ptr %load, i32 %zext
+ %getelementptr9 = getelementptr inbounds i32, ptr %getelementptr, i32 2
+ %ptrtoint = ptrtoint ptr %getelementptr9 to i32
+ %trunc = trunc i32 %ptrtoint to i16
+ %sext10 = sext i16 %phi7 to i32
+
[llvm-branch-commits] [llvm] release/21.x: [HEXAGON] [MachinePipeliner] Fix the DAG in case of dependent phis. (#135925) (PR #170749)
https://github.com/llvmbot milestoned
https://github.com/llvm/llvm-project/pull/170749
[llvm-branch-commits] [llvm] release/21.x: [HEXAGON] [MachinePipeliner] Fix the DAG in case of dependent phis. (#135925) (PR #170749)
llvmbot wrote:
@iajbar What do you think about merging this PR to the release branch?
https://github.com/llvm/llvm-project/pull/170749
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
https://github.com/ilovepi updated
https://github.com/llvm/llvm-project/pull/170518
>From acdccf174bad71ff21f820660528bcd460bb1f37 Mon Sep 17 00:00:00 2001
From: Paul Kirth
Date: Tue, 2 Dec 2025 15:14:32 -0800
Subject: [PATCH 1/4] [clang] Use tighter lifetime bounds for C temporary
arguments
In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanups. This leads to different behavior with
respect to where pushFullExprCleanup inserts the lifetime end markers
(e.g., at the end of scope).
For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
---
clang/lib/CodeGen/CGCall.cpp | 18 ++
clang/lib/CodeGen/CGCall.h| 19 +++
clang/test/CodeGen/stack-usage-lifetimes.c| 12 ++--
.../CodeGenCXX/stack-reuse-miscompile.cpp | 2 +-
4 files changed, 40 insertions(+), 11 deletions(-)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 80075dd8a4cca..75e3bea3f3237 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4973,11 +4973,16 @@ void CodeGenFunction::EmitCallArg(CallArgList &args,
const Expr *E,
RawAddress ArgSlotAlloca = Address::invalid();
ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
-// Emit a lifetime start/end for this temporary at the end of the full
-// expression.
+// Emit a lifetime start/end for this temporary. If the type has a
+// destructor, then we need to keep it alive for the full expression.
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
-EmitLifetimeStart(ArgSlotAlloca.getPointer()))
- pushFullExprCleanup(NormalAndEHCleanup, ArgSlotAlloca);
+EmitLifetimeStart(ArgSlotAlloca.getPointer())) {
+ if (E->getType().isDestructedType()) {
+pushFullExprCleanup(NormalAndEHCleanup,
ArgSlotAlloca);
+ } else {
+args.addLifetimeCleanup({ArgSlotAlloca.getPointer()});
+ }
+}
}
args.add(EmitAnyExpr(E, ArgSlot), type);
@@ -6307,6 +6312,11 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo
&CallInfo,
for (CallLifetimeEnd &LifetimeEnd : CallLifetimeEndAfterCall)
LifetimeEnd.Emit(*this, /*Flags=*/{});
+ if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries)
+for (const CallArgList::EndLifetimeInfo &LT :
+ CallArgs.getLifetimeCleanups())
+ EmitLifetimeEnd(LT.Addr);
+
if (!ReturnValue.isExternallyDestructed() &&
RetTy.isDestructedType() == QualType::DK_nontrivial_c_struct)
pushDestroy(QualType::DK_nontrivial_c_struct, Ret.getAggregateAddress(),
diff --git a/clang/lib/CodeGen/CGCall.h b/clang/lib/CodeGen/CGCall.h
index 1ef8a3f114573..aab4b64d6a4a8 100644
--- a/clang/lib/CodeGen/CGCall.h
+++ b/clang/lib/CodeGen/CGCall.h
@@ -299,6 +299,10 @@ class CallArgList : public SmallVector {
llvm::Instruction *IsActiveIP;
};
+ struct EndLifetimeInfo {
+llvm::Value *Addr;
+ };
+
void add(RValue rvalue, QualType type) { push_back(CallArg(rvalue, type)); }
void addUncopiedAggregate(LValue LV, QualType type) {
@@ -312,6 +316,9 @@ class CallArgList : public SmallVector {
llvm::append_range(*this, other);
llvm::append_range(Writebacks, other.Writebacks);
llvm::append_range(CleanupsToDeactivate, other.CleanupsToDeactivate);
+LifetimeCleanups.insert(LifetimeCleanups.end(),
+other.LifetimeCleanups.begin(),
+other.LifetimeCleanups.end());
assert(!(StackBase && other.StackBase) && "can't merge stackbases");
if (!StackBase)
StackBase = other.StackBase;
@@ -352,6 +359,14 @@ class CallArgList : public SmallVector {
/// memory.
bool isUsingInAlloca() const { return StackBase; }
+ void addLifetimeCleanup(EndLifetimeInfo Info) {
+LifetimeCleanups.push_back(Info);
+ }
+
+ ArrayRef getLifetimeCleanups() const {
+return LifetimeCleanups;
+ }
+
// Support reversing writebacks for MSVC ABI.
void reverseWritebacks() {
std::reverse(Writebacks.begin(), Writebacks.end());
@@ -365,6 +380,10 @@ class CallArgList : public SmallVector {
/// occurs.
SmallVector CleanupsToDeactivate;
+ /// Lifetime information needed to call llvm.lifetime.end for any temporary
+ /// argument allocas.
+ SmallVector LifetimeCleanups;
+
/// The stacksave call. It dominates all of the argument evaluation.
llvm::CallInst *StackBase = nullptr;
};
diff --git a/clang/test/CodeGen/stack-usage-lifetimes.c
b/clang/test/CodeGen/stack-usage-lifetimes.c
index 3787a29e4ce7d..189bc9c229ca4 100644
--- a/clang/test/CodeGen/stack-usage-lifetimes.c
+++ b/clang/test/CodeGen/stack-usage-lifetimes.c
@@ -40,11 +40,11 @@ void t1(int c) {
}
void t2(void) {
- // x
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
@@ -13,30 +13,27 @@ void test() {
// CHECK: call void @llvm.lifetime.start.p0(ptr nonnull %[[AGG1]])
// CHECK: invoke void @_Z16func_that_throws7Trivial(ptr noundef nonnull
byval(%struct.Trivial) align 8 %[[AGG1]])
- // CHECK-NEXT: to label %[[CONT1:.*]] unwind label %[[LPAD1:.*]]
-
+ // CHECK-NEXT: to label %[[CONT1:.*]] unwind label %[[LPAD:.*]]
+
// CHECK: [[CONT1]]:
+ // CHECK-NEXT: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG1]])
// CHECK-NEXT: call void @llvm.lifetime.start.p0(ptr nonnull %[[AGG2]])
// CHECK: invoke void @_Z16func_that_throws7Trivial(ptr noundef nonnull
byval(%struct.Trivial) align 8 %[[AGG2]])
- // CHECK-NEXT: to label %[[CONT2:.*]] unwind label %[[LPAD2:.*]]
+ // CHECK-NEXT: to label %[[CONT2:.*]] unwind label %[[LPAD]]
// CHECK: [[CONT2]]:
- // CHECK-DAG: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG2]])
- // CHECK-DAG: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG1]])
+ // CHECK-NEXT: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG2]])
// CHECK: br label %[[TRY_CONT:.*]]
- // CHECK: [[LPAD1]]:
- // CHECK: landingpad
- // CHECK: br label %[[EHCLEANUP:.*]]
-
- // CHECK: [[LPAD2]]:
+ // CHECK: [[LPAD]]:
// CHECK: landingpad
- // CHECK: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG2]])
- // CHECK: br label %[[EHCLEANUP]]
-
- // CHECK: [[EHCLEANUP]]:
- // CHECK: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG1]])
+ // CHECK-NOT: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG1]])
+ // CHECK-NOT: call void @llvm.lifetime.end.p0(ptr nonnull %[[AGG2]])
ilovepi wrote:
I think the new version covers this correctly now.
https://github.com/llvm/llvm-project/pull/170518
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
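The bookkeeping described in the PR #170518 commit message can be sketched outside of Clang. The following is a toy model only: `ToyArg`, `ToyEmitter`, and the event strings are hypothetical names invented for illustration, and `HasDestructor` merely stands in for `E->getType().isDestructedType()`. It shows temporaries without a destructor getting their lifetime end right after the call, while destructed temporaries wait for the end of the full expression.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the patch's bookkeeping; none of these types are Clang's.
struct ToyArg {
  std::string Slot;   // name of the temporary's stack slot
  bool HasDestructor; // stand-in for E->getType().isDestructedType()
};

struct ToyEmitter {
  std::vector<std::string> Events;           // emitted markers and calls, in order
  std::vector<std::string> FullExprCleanups; // slots ended at end of full expression

  void emitCall(const std::string &Callee, const std::vector<ToyArg> &Args) {
    // Plays the role of CallArgList::LifetimeCleanups for this one call.
    std::vector<std::string> EndAfterCall;
    for (const ToyArg &A : Args) {
      Events.push_back("lifetime.start " + A.Slot);
      if (A.HasDestructor)
        FullExprCleanups.push_back(A.Slot); // must outlive the call
      else
        EndAfterCall.push_back(A.Slot);     // can die right after the call
    }
    Events.push_back("call " + Callee);
    for (const std::string &Slot : EndAfterCall)
      Events.push_back("lifetime.end " + Slot);
  }

  void endFullExpression() {
    // Deferred cleanups run in reverse order of registration.
    for (auto It = FullExprCleanups.rbegin(); It != FullExprCleanups.rend(); ++It)
      Events.push_back("lifetime.end " + *It);
    FullExprCleanups.clear();
  }
};
```

In this model, two consecutive calls with trivially destructible temporaries produce a start/call/end triple each, which is the ordering the updated CHECK-NEXT lines in the test expect.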
[llvm-branch-commits] [clang] [ExposeObjCDirect] Optimizations (PR #170619)
https://github.com/DataCorrupted edited https://github.com/llvm/llvm-project/pull/170619 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [ExposeObjCDirect] Optimizations (PR #170619)
https://github.com/DataCorrupted updated
https://github.com/llvm/llvm-project/pull/170619
>From bbf2e85a9bc07a52c83d13af5db0d35878484b9a Mon Sep 17 00:00:00 2001
From: Peter Rong
Date: Wed, 3 Dec 2025 22:45:04 -0800
Subject: [PATCH 1/4] [ExposeObjCDirect] Optimizations
In many cases we can infer that the class object has already been realized.
---
clang/lib/CodeGen/CGObjCRuntime.cpp | 65 -
clang/lib/CodeGen/CGObjCRuntime.h | 23 +++---
2 files changed, 82 insertions(+), 6 deletions(-)
diff --git a/clang/lib/CodeGen/CGObjCRuntime.cpp
b/clang/lib/CodeGen/CGObjCRuntime.cpp
index a4b4460fdc49c..fd227d9645ac1 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.cpp
+++ b/clang/lib/CodeGen/CGObjCRuntime.cpp
@@ -415,7 +415,70 @@ bool CGObjCRuntime::canMessageReceiverBeNull(
bool CGObjCRuntime::canClassObjectBeUnrealized(
const ObjCInterfaceDecl *CalleeClassDecl, CodeGenFunction &CGF) const {
- // TODO
+ if (!CalleeClassDecl)
+return true;
+
+ // Heuristic 1: +load method on this class
+ // If the class has a +load method, it's realized when the binary is loaded.
+ ASTContext &Ctx = CGM.getContext();
+ const IdentifierInfo *LoadII = &Ctx.Idents.get("load");
+ Selector LoadSel = Ctx.Selectors.getSelector(0, &LoadII);
+
+ // TODO: if one of the children has +load, this class is guaranteed to be
+ // realized as well. We should have a translation unit specific map that
+ // precomputes all classes that are realized, and just do a lookup here.
+ // But we need to measure how expensive it is to create a map like that.
+ if (CalleeClassDecl->lookupClassMethod(LoadSel))
+return false; // This class has +load, so it's already realized
+
+ // Heuristic 2: using Self / Super
+ // If we're currently executing a method of ClassDecl (or a subclass),
+ // then ClassDecl must already be realized.
+ if (const auto *CurMethod =
+ dyn_cast_or_null<ObjCMethodDecl>(CGF.CurCodeDecl)) {
+const ObjCInterfaceDecl *CallerClassDecl = CurMethod->getClassInterface();
+if (CallerClassDecl && CalleeClassDecl->isSuperClassOf(CallerClassDecl))
+ return false;
+ }
+
+ // Heuristic 3: previously realized
+ // Heuristic 3.1: Walk through the current BasicBlock looking for calls that
+ // realize the class. All heuristics in this cluster share the same
+ // implementation pattern.
+ auto *BB = CGF.Builder.GetInsertBlock();
+ if (!BB)
+return true; // No current block, assume unrealized
+
+ llvm::StringRef CalleeClassName = CalleeClassDecl->getName();
+
+ // Heuristic 3.2 / TODO: If realization happened in a dominating block, the
+ // class is realized. This requires dominator tree analysis. There should be
+ // an outer loop `for (BB : DominatingBasicBlocks)`.
+ for (const auto &Inst : *BB) {
+// Check if this is a call instruction
+const auto *Call = llvm::dyn_cast(&Inst);
+if (!Call)
+ continue;
+llvm::Function *CalledFunc = Call->getCalledFunction();
+if (!CalledFunc)
+ continue;
+
+llvm::StringRef FuncNamePtr = CalledFunc->getName();
+// Skip the \01 prefix if present
+if (FuncNamePtr.starts_with("\01"))
+ FuncNamePtr = FuncNamePtr.drop_front(1);
+// Check for instance method calls: "-[ClassName methodName]"
+// or class method calls: "+[ClassName methodName]"
+// Also check for thunks: "-[ClassName methodName]_thunk"
+if ((FuncNamePtr.starts_with("-[") || FuncNamePtr.starts_with("+["))) {
+ FuncNamePtr = FuncNamePtr.drop_front(2);
+ // TODO: if the current class is the super class of the function that's
+ // used, it should've been realized as well
+ if (FuncNamePtr.starts_with(CalleeClassName))
+return false;
+}
+ }
+
// Otherwise, assume it can be unrealized.
return true;
}
diff --git a/clang/lib/CodeGen/CGObjCRuntime.h
b/clang/lib/CodeGen/CGObjCRuntime.h
index d3d4745cb77a7..b0cf04fc8553b 100644
--- a/clang/lib/CodeGen/CGObjCRuntime.h
+++ b/clang/lib/CodeGen/CGObjCRuntime.h
@@ -226,7 +226,7 @@ class CGObjCRuntime {
virtual llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
const ObjCContainerDecl *CD) = 0;
-/// Generates precondition checks for direct Objective-C Methods.
+ /// Generates precondition checks for direct Objective-C Methods.
/// This includes [self self] for class methods and nil checks.
virtual void GenerateDirectMethodsPreconditionCheck(
CodeGenFunction &CGF, llvm::Function *Fn, const ObjCMethodDecl *OMD,
@@ -330,10 +330,23 @@ class CGObjCRuntime {
QualType resultType,
CallArgList &callArgs);
- bool canMessageReceiverBeNull(CodeGenFunction &CGF,
-const ObjCMethodDecl *method, bool isSuper,
-const ObjCInterfaceDecl *classReceiver,
-llvm::Value *receiver);
+ /// Check if the receiver of an ObjC message send can be null.
+
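Heuristic 3.1 in the PR #170619 patch matches method symbols of the form `-[ClassName methodName]` by prefix. The sketch below is a standalone illustration, not the patch's code: `symbolNamesClass` is a hypothetical helper, and the rule that the class name must be followed by a space or by `(` (for a category) is an assumption about the symbol format. It shows one way such a check could avoid treating `FooBar` as a match for `Foo`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: does Symbol look like "-[Class sel]" or "+[Class sel]"
// for exactly ClassName? A leading '\01' byte is skipped, as in the patch.
bool symbolNamesClass(std::string_view Symbol, std::string_view ClassName) {
  if (ClassName.empty())
    return false;
  if (!Symbol.empty() && Symbol.front() == '\001')
    Symbol.remove_prefix(1);
  if (Symbol.size() < 2 || (Symbol[0] != '-' && Symbol[0] != '+') ||
      Symbol[1] != '[')
    return false;
  Symbol.remove_prefix(2);
  if (Symbol.substr(0, ClassName.size()) != ClassName)
    return false;
  Symbol.remove_prefix(ClassName.size());
  // Assumption: the class name ends at a space, or at '(' for a category.
  return !Symbol.empty() && (Symbol.front() == ' ' || Symbol.front() == '(');
}
```

With a plain prefix comparison, a call to `-[FooBar baz]` would also count as evidence that `Foo` is realized; the delimiter check is one possible way to rule that out.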
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
@@ -214,7 +214,11 @@ set(LIBCXX_LIBRARY_VERSION "${LIBCXX_ABI_VERSION}.0" CACHE
STRING
For example, -DLIBCXX_LIBRARY_VERSION=x.y will result in the library being
named
libc++.x.y.dylib, along with the usual symlinks pointing to that. On Apple
platforms,
this also controls the linker's 'current_version' property.")
-set(LIBCXX_ABI_NAMESPACE "__${LIBCXX_ABI_VERSION}" CACHE STRING "The inline
ABI namespace used by libc++. It defaults to __n where `n` is the current ABI
version.")
+set(default_abi_namespace "__${LIBCXX_ABI_VERSION}")
+if(NOT LIBCXX_PFP STREQUAL "none")
ldionne wrote:
Why do you change the namespace when PFP is enabled? Because the ABI is
different and you want to minimize the likelihood of ABI mismatches?
The right way to achieve this would be to set `LIBCXX_ABI_NAMESPACE` from the
cache file that you use to build libc++ on the platform where PFP needs to
coexist with non-PFP code. The libc++ build process itself shouldn't have to
know about PFP.
https://github.com/llvm/llvm-project/pull/151651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
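The ABI-namespace question above comes down to how an inline namespace tags every symbol of a library. A minimal sketch, assuming a hypothetical library `toylib` and a hypothetical build-time macro `TOYLIB_PFP` (neither is part of libc++): users spell `toylib::widget` either way, but the type actually lives in `toylib::v1` or `toylib::v1_pfp`, so objects built with different settings refer to differently mangled names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical knob, analogous in spirit to choosing LIBCXX_ABI_NAMESPACE
// at configuration time.
#ifdef TOYLIB_PFP
#  define TOYLIB_ABI_NAMESPACE v1_pfp
#else
#  define TOYLIB_ABI_NAMESPACE v1
#endif

namespace toylib {
inline namespace TOYLIB_ABI_NAMESPACE {
// Users refer to this as toylib::widget; the mangled name contains the
// inline namespace, so the two configurations do not collide.
struct widget {
  void *ptr = nullptr;
};
} // namespace TOYLIB_ABI_NAMESPACE
} // namespace toylib
```

Because the namespace name is an ordinary configuration value, it can be chosen by whoever configures the build, which is the approach the review comment suggests.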
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
@@ -1067,6 +1067,12 @@ typedef __char32_t char32_t;
#define _LIBCPP_DIAGNOSE_NULLPTR
# endif
+# if __has_cpp_attribute(_Clang::__no_field_protection__)
+#define _LIBCPP_NO_PFP [[_Clang::__no_field_protection__]]
+# else
+#define _LIBCPP_NO_PFP
+# endif
ldionne wrote:
I'd suggest `_LIBCPP_DISABLE_PFP` or even
`_LIBCPP_DISABLE_POINTER_FIELD_PROTECTION`. `_LIBCPP_NO_PFP` feels like a
yes-no macro, not an attribute macro.
https://github.com/llvm/llvm-project/pull/151651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
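The attribute-macro pattern in the hunk above can be tried in isolation. This is a sketch under assumptions: `EXAMPLE_DISABLE_PFP` and `ToyTypeInfo` are hypothetical names, and the attribute spelling is copied from the patch. On a compiler that does not know the attribute, the macro expands to nothing and the struct is unchanged.

```cpp
#include <cassert>

// Hypothetical macro name; it expands to the attribute only where the
// compiler reports support for it, and to nothing otherwise.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(_Clang::__no_field_protection__)
#    define EXAMPLE_DISABLE_PFP [[_Clang::__no_field_protection__]]
#  endif
#endif
#ifndef EXAMPLE_DISABLE_PFP
#  define EXAMPLE_DISABLE_PFP
#endif

// Toy stand-in for a class with a pointer field that opts out.
struct ToyTypeInfo {
  EXAMPLE_DISABLE_PFP const char *Name;
};

inline const char *nameOf(const ToyTypeInfo &TI) { return TI.Name; }
```

Naming the macro after the action it performs, as the review suggests, makes a use like the one on `Name` read as an annotation rather than as a feature test.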
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
@@ -300,7 +300,7 @@ class _LIBCPP_EXPORTED_FROM_ABI _LIBCPP_TYPE_INFO_VTABLE_POINTER_AUTH type_info
protected:
typedef __type_info_implementations::__impl __impl;
- __impl::__type_name_t __type_name;
+ _LIBCPP_NO_PFP __impl::__type_name_t __type_name;
ldionne wrote:
To reiterate my comment on the RFC, I would much rather we teach Clang how to
generate the right thing here. I understand this might reduce the amount of
work needed in Clang, but at the end of the day the PFP implementation in Clang
will be more complete and more robust if it can handle this out of the box --
even if the actual security benefits are marginal.
The same comment holds for the changes in `private_typeinfo.h` -- that would
actually have the nice effect of not requiring any changes to libc++abi.
https://github.com/llvm/llvm-project/pull/151651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
https://github.com/ldionne requested changes to this pull request. https://github.com/llvm/llvm-project/pull/151651 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
@@ -34,10 +34,13 @@ template
struct __libcpp_is_trivially_relocatable : is_trivially_copyable<_Tp> {};
#endif
+// __trivially_relocatable on libc++'s builtin types does not currently return
the right answer with PFP.
ldionne wrote:
```suggestion
// __trivially_relocatable on libc++'s builtin types does not currently return
the right answer with PFP.
```
What do you mean by "libc++ builtin types"? Do you mean libc++ types like
`std::string` & friends? We are currently not using Clang's notion of
trivially-relocatable for anything, so I'm not certain we need to disable this
when PFP is enabled. IMO we need to take a look at each individual type that
advertises `__trivially_relocatable` and check whether that's fine with PFP.
https://github.com/llvm/llvm-project/pull/151651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
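The per-type audit suggested above is easier to picture with a small trait. The sketch below is an illustration, not libc++'s implementation: `toy_is_trivially_relocatable` and the member alias `toy_trivially_relocatable` are hypothetical names. It falls back to `std::is_trivially_copyable`, as in the hunk's context lines, and lets a type opt in by advertising an alias to itself.

```cpp
#include <cassert>
#include <type_traits>

// Fallback: a type is treated as relocatable if it is trivially copyable.
template <class T, class = void>
struct toy_is_trivially_relocatable : std::is_trivially_copyable<T> {};

// Opt-in: a type advertises relocatability with a member alias naming itself.
template <class T>
struct toy_is_trivially_relocatable<
    T, std::enable_if_t<
           std::is_same<T, typename T::toy_trivially_relocatable>::value>>
    : std::true_type {};

struct OwningHandle {
  using toy_trivially_relocatable = OwningHandle; // explicit opt-in
  OwningHandle() = default;
  OwningHandle(const OwningHandle &) {} // user-provided, so not trivially copyable
  void *Ptr = nullptr;
};

struct PlainPod {
  int X;
};

struct NoOptIn {
  NoOptIn(const NoOptIn &) {} // not trivially copyable and no opt-in
};
```

Each type that opts in this way is one place where someone would need to confirm that a bytewise move is still valid once pointer fields are protected.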
[llvm-branch-commits] [llvm] [AArch64][PAC] Rework the expansion of AUT/AUTPAC pseudos (PR #169699)
https://github.com/atrosinenko updated
https://github.com/llvm/llvm-project/pull/169699
>From 37dc9f869840fde61224fe8d0d687bc76984216d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko
Date: Thu, 25 Sep 2025 22:28:14 +0300
Subject: [PATCH] [AArch64][PAC] Rework the expansion of AUT/AUTPAC pseudos
Refactor `AArch64AsmPrinter::emitPtrauthAuthResign` to improve
readability and fix the conditions of `emitPtrauthDiscriminator` being
allowed to clobber AddrDisc:
* do not clobber `AUTAddrDisc` when computing `AUTDiscReg` on resigning
if `AUTAddrDisc == PACAddrDisc`, as it would prevent passing a raw
64-bit value as the new discriminator
* mark the `$Scratch` operand of `AUTxMxN` as early-clobber (fixes
assertions when emitting code at `-O0`)
* move the code computing `ShouldCheck` and `ShouldTrap` conditions to a
separate function
* define helper `struct PtrAuthSchema` to pass arguments to
`emitPtrauthAuthResign` in a better structured way
---
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp | 182 +++---
llvm/lib/Target/AArch64/AArch64InstrInfo.td | 13 +-
...trauth-intrinsic-auth-resign-with-blend.ll | 77 +++-
3 files changed, 195 insertions(+), 77 deletions(-)
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index eee35ba29dd86..9e89936c9e8c0 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -182,13 +182,21 @@ class AArch64AsmPrinter : public AsmPrinter {
// Check authenticated LR before tail calling.
void emitPtrauthTailCallHardening(const MachineInstr *TC);
+ struct PtrAuthSchema {
+PtrAuthSchema(AArch64PACKey::ID Key, uint64_t Disc,
+ const MachineOperand &AddrDiscOp);
+
+AArch64PACKey::ID Key;
+uint64_t Disc;
+Register AddrDisc;
+bool AddrDiscIsKilled;
+ };
+
// Emit the sequence for AUT or AUTPAC.
- void emitPtrauthAuthResign(Register AUTVal, AArch64PACKey::ID AUTKey,
- uint64_t AUTDisc,
- const MachineOperand *AUTAddrDisc,
- Register Scratch,
- std::optional PACKey,
- uint64_t PACDisc, Register PACAddrDisc, Value
*DS);
+ void emitPtrauthAuthResign(Register Pointer, Register Scratch,
+ PtrAuthSchema AuthSchema,
+ std::optional SignSchema,
+ Value *DS);
// Emit R_AARCH64_PATCHINST, the deactivation symbol relocation. Returns true
// if no instruction should be emitted because the deactivation symbol is
@@ -2207,23 +2215,9 @@ bool
AArch64AsmPrinter::emitDeactivationSymbolRelocation(Value *DS) {
return false;
}
-void AArch64AsmPrinter::emitPtrauthAuthResign(
-Register AUTVal, AArch64PACKey::ID AUTKey, uint64_t AUTDisc,
-const MachineOperand *AUTAddrDisc, Register Scratch,
-std::optional PACKey, uint64_t PACDisc,
-Register PACAddrDisc, Value *DS) {
- const bool IsAUTPAC = PACKey.has_value();
-
- // We expand AUT/AUTPAC into a sequence of the form
- //
- // ; authenticate x16
- // ; check pointer in x16
- //Lsuccess:
- // ; sign x16 (if AUTPAC)
- //Lend: ; if not trapping on failure
- //
- // with the checking sequence chosen depending on whether/how we should check
- // the pointer and whether we should trap on failure.
+static std::pair<bool, bool> getCheckAndTrapMode(const MachineFunction *MF,
+ bool IsResign) {
+ const AArch64Subtarget &STI = MF->getSubtarget<AArch64Subtarget>();
// By default, auth/resign sequences check for auth failures.
bool ShouldCheck = true;
@@ -2232,7 +2226,7 @@ void AArch64AsmPrinter::emitPtrauthAuthResign(
// On an FPAC CPU, you get traps whether you want them or not: there's
// no point in emitting checks or traps.
- if (STI->hasFPAC())
+ if (STI.hasFPAC())
ShouldCheck = ShouldTrap = false;
// However, command-line flags can override this, for experimentation.
@@ -2251,40 +2245,81 @@ void AArch64AsmPrinter::emitPtrauthAuthResign(
break;
}
- // Compute aut discriminator
- Register AUTDiscReg = emitPtrauthDiscriminator(
- AUTDisc, AUTAddrDisc->getReg(), Scratch, AUTAddrDisc->isKill());
+ // Checked-but-not-trapping mode ("poison") only applies to resigning,
+ // replace with "unchecked" for standalone AUT.
+ if (!IsResign && ShouldCheck && !ShouldTrap)
+ShouldCheck = ShouldTrap = false;
- if (!emitDeactivationSymbolRelocation(DS))
-emitAUT(AUTKey, AUTVal, AUTDiscReg);
+ return std::make_pair(ShouldCheck, ShouldTrap);
+}
- // Unchecked or checked-but-non-trapping AUT is just an "AUT": we're done.
- if (!IsAUTPAC && (!ShouldCheck || !ShouldTrap))
-return;
+AArch64AsmPrinter::PtrAuthSchema::PtrAuthSchema(
+AArch64PACKey::ID Key, uint64_t Disc, const MachineOperand &AddrDiscOp)
+: Key(Key), Disc(Disc
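The decision logic that PR #169699 moves into `getCheckAndTrapMode` can be modelled as a pure function. This is a sketch under stated assumptions: the `ToyAuthCheckMode` enum and the `FunctionWantsTraps` parameter are hypothetical stand-ins for the command-line override and the initial `ShouldTrap` value, both of which sit in hunks not shown above. Only the FPAC rule and the final "poison only applies to resigning" rule are taken directly from the patch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical override values; the real option lives in an elided hunk.
enum class ToyAuthCheckMode { Default, Unchecked, Poison, Trap };

// Returns {ShouldCheck, ShouldTrap}.
std::pair<bool, bool> toyCheckAndTrapMode(bool HasFPAC, bool FunctionWantsTraps,
                                          ToyAuthCheckMode Override,
                                          bool IsResign) {
  // By default, auth/resign sequences check for auth failures.
  bool ShouldCheck = true;
  bool ShouldTrap = FunctionWantsTraps; // assumption, see lead-in

  // On an FPAC CPU the hardware traps anyway, so emit no checks or traps.
  if (HasFPAC)
    ShouldCheck = ShouldTrap = false;

  // Assumed shape of the experimental override.
  switch (Override) {
  case ToyAuthCheckMode::Default:
    break;
  case ToyAuthCheckMode::Unchecked:
    ShouldCheck = ShouldTrap = false;
    break;
  case ToyAuthCheckMode::Poison:
    ShouldCheck = true;
    ShouldTrap = false;
    break;
  case ToyAuthCheckMode::Trap:
    ShouldCheck = ShouldTrap = true;
    break;
  }

  // Checked-but-not-trapping ("poison") only applies to resigning;
  // a standalone AUT falls back to unchecked.
  if (!IsResign && ShouldCheck && !ShouldTrap)
    ShouldCheck = ShouldTrap = false;

  return {ShouldCheck, ShouldTrap};
}
```

Folding the standalone-AUT special case into this function means the caller no longer needs a separate early return of the kind the removed `!IsAUTPAC && (!ShouldCheck || !ShouldTrap)` test provided.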
[llvm-branch-commits] [llvm] [AArch64][PAC] Factor out printing real AUT/PAC/BLRA encodings (NFC) (PR #160901)
https://github.com/atrosinenko updated
https://github.com/llvm/llvm-project/pull/160901
>From ce51a40dbff39fdca481ee9818ef34a9349b13de Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko
Date: Thu, 25 Sep 2025 16:33:04 +0300
Subject: [PATCH 1/2] [AArch64][PAC] Factor out printing real AUT/PAC/BLRA
encodings (NFC)
Separate the low-level emission of the appropriate variants of `AUT*`,
`PAC*` and `B(L)RA*` instructions from the high-level logic of pseudo
instruction expansion.
Introduce `getBranchOpcodeForKey` helper function by analogy to
`get(AUT|PAC)OpcodeForKey`.
---
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp | 176 +++---
llvm/lib/Target/AArch64/AArch64InstrInfo.h| 18 ++
2 files changed, 89 insertions(+), 105 deletions(-)
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 7f1d331982bc2..eee35ba29dd86 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -331,6 +331,11 @@ class AArch64AsmPrinter : public AsmPrinter {
void emitMOVZ(Register Dest, uint64_t Imm, unsigned Shift);
void emitMOVK(Register Dest, uint64_t Imm, unsigned Shift);
+ void emitAUT(AArch64PACKey::ID Key, Register Pointer, Register Disc);
+ void emitPAC(AArch64PACKey::ID Key, Register Pointer, Register Disc);
+ void emitBLRA(bool IsCall, AArch64PACKey::ID Key, Register Target,
+Register Disc);
+
/// Emit instruction to set float register to zero.
void emitFMov0(const MachineInstr &MI);
void emitFMov0AsFMov(const MachineInstr &MI, Register DestReg);
@@ -1852,6 +1857,55 @@ void AArch64AsmPrinter::emitMOVK(Register Dest, uint64_t
Imm, unsigned Shift) {
.addImm(Shift));
}
+void AArch64AsmPrinter::emitAUT(AArch64PACKey::ID Key, Register Pointer,
+Register Disc) {
+ bool IsZeroDisc = Disc == AArch64::XZR;
+ unsigned Opcode = getAUTOpcodeForKey(Key, IsZeroDisc);
+
+ // autiza x16 ; if IsZeroDisc
+ // autia x16, x17 ; if !IsZeroDisc
+ MCInst AUTInst;
+ AUTInst.setOpcode(Opcode);
+ AUTInst.addOperand(MCOperand::createReg(Pointer));
+ AUTInst.addOperand(MCOperand::createReg(Pointer));
+ if (!IsZeroDisc)
+AUTInst.addOperand(MCOperand::createReg(Disc));
+
+ EmitToStreamer(AUTInst);
+}
+
+void AArch64AsmPrinter::emitPAC(AArch64PACKey::ID Key, Register Pointer,
+Register Disc) {
+ bool IsZeroDisc = Disc == AArch64::XZR;
+ unsigned Opcode = getPACOpcodeForKey(Key, IsZeroDisc);
+
+ // paciza x16 ; if IsZeroDisc
+ // pacia x16, x17 ; if !IsZeroDisc
+ MCInst PACInst;
+ PACInst.setOpcode(Opcode);
+ PACInst.addOperand(MCOperand::createReg(Pointer));
+ PACInst.addOperand(MCOperand::createReg(Pointer));
+ if (!IsZeroDisc)
+PACInst.addOperand(MCOperand::createReg(Disc));
+
+ EmitToStreamer(PACInst);
+}
+
+void AArch64AsmPrinter::emitBLRA(bool IsCall, AArch64PACKey::ID Key,
+ Register Target, Register Disc) {
+ bool IsZeroDisc = Disc == AArch64::XZR;
+ unsigned Opcode = getBranchOpcodeForKey(IsCall, Key, IsZeroDisc);
+
+ // blraaz x16 ; if IsZeroDisc
+ // blraa x16, x17 ; if !IsZeroDisc
+ MCInst Inst;
+ Inst.setOpcode(Opcode);
+ Inst.addOperand(MCOperand::createReg(Target));
+ if (!IsZeroDisc)
+Inst.addOperand(MCOperand::createReg(Disc));
+ EmitToStreamer(Inst);
+}
+
void AArch64AsmPrinter::emitFMov0(const MachineInstr &MI) {
Register DestReg = MI.getOperand(0).getReg();
if (!STI->hasZeroCycleZeroingFPWorkaround() && STI->isNeonAvailable()) {
@@ -2200,20 +2254,9 @@ void AArch64AsmPrinter::emitPtrauthAuthResign(
// Compute aut discriminator
Register AUTDiscReg = emitPtrauthDiscriminator(
AUTDisc, AUTAddrDisc->getReg(), Scratch, AUTAddrDisc->isKill());
- bool AUTZero = AUTDiscReg == AArch64::XZR;
- unsigned AUTOpc = getAUTOpcodeForKey(AUTKey, AUTZero);
- if (!emitDeactivationSymbolRelocation(DS)) {
-// autiza x16 ; if AUTZero
-// autia x16, x17 ; if !AUTZero
-MCInst AUTInst;
-AUTInst.setOpcode(AUTOpc);
-AUTInst.addOperand(MCOperand::createReg(AUTVal));
-AUTInst.addOperand(MCOperand::createReg(AUTVal));
-if (!AUTZero)
- AUTInst.addOperand(MCOperand::createReg(AUTDiscReg));
-EmitToStreamer(*OutStreamer, AUTInst);
- }
+ if (!emitDeactivationSymbolRelocation(DS))
+emitAUT(AUTKey, AUTVal, AUTDiscReg);
// Unchecked or checked-but-non-trapping AUT is just an "AUT": we're done.
if (!IsAUTPAC && (!ShouldCheck || !ShouldTrap))
@@ -2236,20 +2279,8 @@ void AArch64AsmPrinter::emitPtrauthAuthResign(
return;
// Compute pac discriminator
- Register PACDiscReg =
- emitPtrauthDiscriminator(PACDisc, PACAddrDisc, Scratch);
- bool PACZero = PACDiscReg == AArch64::XZR;
- unsigned PACOpc = getPACOpcodeForKey(*PACKey, PACZero);
-
- // pacizb x16 ; if PACZero
- // pacib x16, x17 ; if !PACZero
- MCIns
[llvm-branch-commits] [llvm] DAG: Use RuntimeLibcalls to legalize vector frem calls (PR #170719)
llvmbot wrote:
@llvm/pr-subscribers-llvm-selectiondag
Author: Matt Arsenault (arsenm)
Changes
This continues the replacement of TargetLibraryInfo uses in codegen
with RuntimeLibcallsInfo started in 821d2825a4f782da3da3c03b8a002802bff4b95c.
The series there handled all of the multiple-result calls. This
extends it to the other handled case, which happens to be frem.
For some reason the Libcall entries for these are prefixed with "REM_":
the instruction is "frem", which maps to the libcall "fmod".
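
To make the frem/fmod relationship concrete, here is a small standalone sketch. It only illustrates the semantics: `ScalarFP`, `remLibcallName` and `scalarizedFRem` are hypothetical names, not part of `RuntimeLibcalls`, and the element-wise loop is merely what the result of a vector frem must equal whether or not a vector libcall is used.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string_view>

// Hypothetical scalar type tag; the real code keys off EVT.
enum class ScalarFP { F32, F64 };

// Name of the C library routine that the REM_* libcall resolves to for the
// two common scalar types.
inline std::string_view remLibcallName(ScalarFP Ty) {
  return Ty == ScalarFP::F32 ? "fmodf" : "fmod";
}

// Element-wise frem over a fixed-width vector, computed with the scalar
// libcall. Like fmod, the result takes the sign of the dividend.
template <std::size_t N>
std::array<double, N> scalarizedFRem(const std::array<double, N> &X,
                                     const std::array<double, N> &Y) {
  std::array<double, N> R{};
  for (std::size_t I = 0; I < N; ++I)
    R[I] = std::fmod(X[I], Y[I]);
  return R;
}
```

A vector math library entry such as `_ZGVnN2vv_fmod` would be expected to produce the same lanes in a single call.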
---
Patch is 26.96 KiB, truncated to 20.00 KiB below, full version:
https://github.com/llvm/llvm-project/pull/170719.diff
7 Files Affected:
- (modified) llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h (+4)
- (modified) llvm/include/llvm/IR/RuntimeLibcalls.td (+11-11)
- (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (+35-79)
- (modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+22)
- (modified) llvm/lib/IR/RuntimeLibcalls.cpp (+49-4)
- (modified) llvm/test/Transforms/Util/DeclareRuntimeLibcalls/armpl.ll (+22-13)
- (modified) llvm/test/Transforms/Util/DeclareRuntimeLibcalls/sleef.ll (+23-13)
``diff
diff --git a/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
b/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
index dda0899f11337..cc71c3206410a 100644
--- a/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
+++ b/llvm/include/llvm/CodeGen/RuntimeLibcallUtil.h
@@ -132,6 +132,10 @@ LLVM_ABI Libcall getSINCOS_STRET(EVT RetVT);
/// UNKNOWN_LIBCALL if there is none.
LLVM_ABI Libcall getMODF(EVT VT);
+/// \return the REM_* value for the given types, or UNKNOWN_LIBCALL if there is
+/// none.
+LLVM_ABI Libcall getREM(EVT VT);
+
/// \return the LROUND_* value for the given types, or UNKNOWN_LIBCALL if there
/// is none.
LLVM_ABI Libcall getLROUND(EVT VT);
diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.td
b/llvm/include/llvm/IR/RuntimeLibcalls.td
index 09e33d7f89e8a..426fd7144257b 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.td
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.td
@@ -227,7 +227,7 @@ foreach S = !listconcat(F32VectorSuffixes,
F64VectorSuffixes) in {
def FMA_#S : RuntimeLibcall;
def FMAX_#S : RuntimeLibcall;
def FMIN_#S : RuntimeLibcall;
- def FMOD_#S : RuntimeLibcall;
+ def REM_#S : RuntimeLibcall; // "fmod"
def HYPOT_#S : RuntimeLibcall;
def ILOGB_#S : RuntimeLibcall;
def LDEXP_#S : RuntimeLibcall;
@@ -3915,7 +3915,7 @@ defset list SLEEFGNUABI_VF2_VECFUNCS
= {
def _ZGVnN2vv_fdim : RuntimeLibcallImpl;
def _ZGVnN2vv_fmax : RuntimeLibcallImpl;
def _ZGVnN2vv_fmin : RuntimeLibcallImpl;
- def _ZGVnN2vv_fmod : RuntimeLibcallImpl;
+ def _ZGVnN2vv_fmod : RuntimeLibcallImpl;
def _ZGVnN2vv_hypot : RuntimeLibcallImpl;
def _ZGVnN2vv_ldexp : RuntimeLibcallImpl;
def _ZGVnN2vv_nextafter : RuntimeLibcallImpl;
@@ -3961,7 +3961,7 @@ defset list SLEEFGNUABI_VF4_VECFUNCS
= {
def _ZGVnN4vv_fdimf : RuntimeLibcallImpl;
def _ZGVnN4vv_fmaxf : RuntimeLibcallImpl;
def _ZGVnN4vv_fminf : RuntimeLibcallImpl;
- def _ZGVnN4vv_fmodf : RuntimeLibcallImpl;
+ def _ZGVnN4vv_fmodf : RuntimeLibcallImpl;
def _ZGVnN4vv_hypotf : RuntimeLibcallImpl;
def _ZGVnN4vv_ldexpf : RuntimeLibcallImpl;
def _ZGVnN4vv_nextafterf : RuntimeLibcallImpl;
@@ -4038,8 +4038,8 @@ defset list
SLEEFGNUABI_SCALABLE_VECFUNCS = {
def _ZGVsMxvv_fmaxf : RuntimeLibcallImpl;
def _ZGVsMxvv_fmin : RuntimeLibcallImpl;
def _ZGVsMxvv_fminf : RuntimeLibcallImpl;
- def _ZGVsMxvv_fmod : RuntimeLibcallImpl;
- def _ZGVsMxvv_fmodf : RuntimeLibcallImpl;
+ def _ZGVsMxvv_fmod : RuntimeLibcallImpl;
+ def _ZGVsMxvv_fmodf : RuntimeLibcallImpl;
def _ZGVsMxvv_hypot : RuntimeLibcallImpl;
def _ZGVsMxvv_hypotf : RuntimeLibcallImpl;
def _ZGVsMxvv_ldexp : RuntimeLibcallImpl;
@@ -4103,8 +4103,8 @@ defset list
SLEEFGNUABI_SCALABLE_VECFUNCS_RISCV = {
def Sleef_fmaxfx_rvvm2 : RuntimeLibcallImpl;
def Sleef_fmindx_u10rvvm2 : RuntimeLibcallImpl;
def Sleef_fminfx_u10rvvm2 : RuntimeLibcallImpl;
- def Sleef_fmoddx_rvvm2 : RuntimeLibcallImpl;
- def Sleef_fmodfx_rvvm2 : RuntimeLibcallImpl;
+ def Sleef_fmoddx_rvvm2 : RuntimeLibcallImpl;
+ def Sleef_fmodfx_rvvm2 : RuntimeLibcallImpl;
def Sleef_hypotdx_u05rvvm2 : RuntimeLibcallImpl;
def Sleef_hypotfx_u05rvvm2 : RuntimeLibcallImpl;
def Sleef_ilogbdx_rvvm2 : RuntimeLibcallImpl;
@@ -4196,8 +4196,8 @@ defset list ARMPL_VECFUNCS = {
def armpl_svfmax_f64_x : RuntimeLibcallImpl;
def armpl_svfmin_f32_x : RuntimeLibcallImpl;
def armpl_svfmin_f64_x : RuntimeLibcallImpl;
- def armpl_svfmod_f32_x : RuntimeLibcallImpl;
- def armpl_svfmod_f64_x : RuntimeLibcallImpl;
+ def armpl_svfmod_f32_x : RuntimeLibcallImpl;
+ def armpl_svfmod_f64_x : RuntimeLibcallImpl;
def armpl_svhypot_f32_x : RuntimeLibcallImpl;
def armpl_svhypot_f64_x : RuntimeLibcallImpl;
def armpl_svilogb_f32_x : RuntimeLibcallImpl;
@@ -4282,8 +4282,8 @@ defset list ARMPL_VECFUNCS = {
def armpl_vfmaxq_f64 : RuntimeLibcallImpl;
def armpl_vfminq_f32
[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)
https://github.com/kparzysz created
https://github.com/llvm/llvm-project/pull/170735
For an OpenMP loop construct, count how many loops will effectively be
contained in its associated block. For constructs that are loop-nest associated
this number should be 1. Report cases where this number is different.
Take into account that the block associated with a loop construct can contain
compiler directives.
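
The counting rule can be sketched on a toy tree, independent of the flang parse tree. Everything here is hypothetical (`Kind`, `Node`, `countGeneratedLoops`); the per-construct cases follow the patch (a DO loop counts as 1, a full unroll has no count, `fuse` with `looprange` yields `1 + nested - count`, any other loop construct counts as 1), while summing over a block and counting compiler directives as 0 are assumptions of this sketch.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

enum class Kind { DoLoop, CompilerDirective, FullUnroll, Fuse, OtherLoopConstruct };

struct Node {
  Kind kind;
  std::optional<std::int64_t> looprangeCount; // only meaningful for Fuse
  std::vector<Node> body;                     // associated block, if any
};

std::optional<std::int64_t> countGeneratedLoops(const std::vector<Node> &block);

// Number of loops a single construct contributes to the enclosing block, or
// nullopt when the number cannot be determined.
std::optional<std::int64_t> countGeneratedLoops(const Node &n) {
  switch (n.kind) {
  case Kind::DoLoop:
    return 1;
  case Kind::CompilerDirective:
    return 0; // assumption: directives do not contribute loops
  case Kind::FullUnroll:
    return std::nullopt; // a fully unrolled loop leaves no loop behind
  case Kind::Fuse: {
    if (!n.looprangeCount || *n.looprangeCount <= 0)
      return std::nullopt;
    auto nested = countGeneratedLoops(n.body);
    if (!nested)
      return std::nullopt;
    // `count` of the nested loops are replaced by a single fused loop.
    return 1 + *nested - *n.looprangeCount;
  }
  case Kind::OtherLoopConstruct:
    return 1;
  }
  return std::nullopt;
}

// Sum over a block; an unknown element makes the whole count unknown.
std::optional<std::int64_t> countGeneratedLoops(const std::vector<Node> &block) {
  std::int64_t total = 0;
  for (const Node &n : block) {
    auto c = countGeneratedLoops(n);
    if (!c)
      return std::nullopt;
    total += *c;
  }
  return total;
}
```

A loop-nest-associated construct would then be diagnosed whenever the count for its block is known and differs from 1.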
>From 9a2d3dca08ab237e7e949fd5642c96cf0fba89b8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 2 Dec 2025 14:59:34 -0600
Subject: [PATCH] [flang][OpenMP] Generalize checks of loop construct structure
For an OpenMP loop construct, count how many loops will effectively be
contained in its associated block. For constructs that are loop-nest
associated this number should be 1. Report cases where this number is
different.
Take into account that the block associated with a loop construct can
contain compiler directives.
---
 flang/lib/Semantics/check-omp-loop.cpp             | 201 +++---
 flang/lib/Semantics/check-omp-structure.h          |   3 +-
 flang/test/Parser/OpenMP/tile-fail.f90             |   8 +-
 flang/test/Semantics/OpenMP/do21.f90               |  10 +-
 .../Semantics/OpenMP/loop-association.f90          |   6 +-
 .../OpenMP/loop-transformation-clauses01.f90       |  16 +-
 .../loop-transformation-construct01.f90            |   4 +-
 .../loop-transformation-construct02.f90            |   8 +-
 .../loop-transformation-construct04.f90            |   4 +-
 9 files changed, 156 insertions(+), 104 deletions(-)
diff --git a/flang/lib/Semantics/check-omp-loop.cpp
b/flang/lib/Semantics/check-omp-loop.cpp
index fc4b9222d91b3..6414f0028e008 100644
--- a/flang/lib/Semantics/check-omp-loop.cpp
+++ b/flang/lib/Semantics/check-omp-loop.cpp
@@ -37,6 +37,14 @@
#include
#include
+namespace Fortran::semantics {
+static bool IsLoopTransforming(llvm::omp::Directive dir);
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x);
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc);
+static std::optional CountGeneratedLoops(const parser::Block &block);
+} // namespace Fortran::semantics
+
namespace {
using namespace Fortran;
@@ -263,22 +271,19 @@ static bool IsLoopTransforming(llvm::omp::Directive dir) {
}
void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct
&x,
-const parser::Block &body, size_t &nestedCount) {
+const parser::Block &body) {
for (auto &stmt : body) {
if (auto *dir{parser::Unwrap(stmt)}) {
context_.Say(dir->source,
"Compiler directives are not allowed inside OpenMP loop
constructs"_warn_en_US);
-} else if (parser::Unwrap(stmt)) {
- ++nestedCount;
} else if (auto *omp{parser::Unwrap(stmt)}) {
if (!IsLoopTransforming(omp->BeginDir().DirName().v)) {
context_.Say(omp->source,
"Only loop-transforming OpenMP constructs are allowed inside
OpenMP loop constructs"_err_en_US);
}
- ++nestedCount;
} else if (auto *block{parser::Unwrap(stmt)}) {
- CheckNestedBlock(x, std::get(block->t), nestedCount);
-} else {
+ CheckNestedBlock(x, std::get(block->t));
+} else if (!parser::Unwrap(stmt)) {
parser::CharBlock source{parser::GetSource(stmt).value_or(x.source)};
context_.Say(source,
"OpenMP loop construct can only contain DO loops or
loop-nest-generating OpenMP constructs"_err_en_US);
@@ -286,16 +291,96 @@ void OmpStructureChecker::CheckNestedBlock(const
parser::OpenMPLoopConstruct &x,
}
}
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x) {
+ const parser::OmpDirectiveSpecification &beginSpec{x.BeginDir()};
+
+ if (beginSpec.DirName().v == llvm::omp::Directive::OMPD_unroll) {
+return llvm::none_of(beginSpec.Clauses().v, [](const parser::OmpClause &c)
{
+ return c.Id() == llvm::omp::Clause::OMPC_partial;
+});
+ }
+ return false;
+}
+
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc) {
+ if (parser::Unwrap(epc)) {
+return 1;
+ }
+
+ auto &omp{DEREF(parser::Unwrap(epc))};
+ const parser::OmpDirectiveSpecification &beginSpec{omp.BeginDir()};
+ llvm::omp::Directive dir{beginSpec.DirName().v};
+
+ // TODO: Handle split, apply.
+ if (IsFullUnroll(omp)) {
+return std::nullopt;
+ }
+ if (dir == llvm::omp::Directive::OMPD_fuse) {
+auto rangeAt{
+llvm::find_if(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+ return c.Id() == llvm::omp::Clause::OMPC_looprange;
+})};
+if (rangeAt == beginSpec.Clauses().v.end()) {
+ return std::nullopt;
+}
+
+auto *loopRange{parser::Unwrap(*rangeAt)};
+std::optional count{GetIntValue(std::get<1>(loopRange->t))};
+if (!count || *count <= 0) {
+ return std::nullopt;
+}
+if (auto nestedCount{CountGeneratedLoops(std::get(omp.t))})
{
+ return 1 + *nestedCount - static_cast(*count);
+} else {
+ return std::nullop
[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)
llvmbot wrote:
@llvm/pr-subscribers-flang-openmp
Author: Krzysztof Parzyszek (kparzysz)
Changes
For an OpenMP loop construct, count how many loops will effectively be
contained in its associated block. For constructs that are loop-nest associated
this number should be 1. Report cases where this number is different.
Take into account that the block associated with a loop construct can contain
compiler directives.
---
Patch is 21.23 KiB, truncated to 20.00 KiB below, full version:
https://github.com/llvm/llvm-project/pull/170735.diff
9 Files Affected:
- (modified) flang/lib/Semantics/check-omp-loop.cpp (+120-81)
- (modified) flang/lib/Semantics/check-omp-structure.h (+1-2)
- (modified) flang/test/Parser/OpenMP/tile-fail.f90 (+4-4)
- (modified) flang/test/Semantics/OpenMP/do21.f90 (+5-5)
- (modified) flang/test/Semantics/OpenMP/loop-association.f90 (+3-3)
- (modified) flang/test/Semantics/OpenMP/loop-transformation-clauses01.f90 (+15-1)
- (modified) flang/test/Semantics/OpenMP/loop-transformation-construct01.f90 (+2-2)
- (modified) flang/test/Semantics/OpenMP/loop-transformation-construct02.f90 (+4-4)
- (modified) flang/test/Semantics/OpenMP/loop-transformation-construct04.f90 (+2-2)
``diff
diff --git a/flang/lib/Semantics/check-omp-loop.cpp
b/flang/lib/Semantics/check-omp-loop.cpp
index fc4b9222d91b3..6414f0028e008 100644
--- a/flang/lib/Semantics/check-omp-loop.cpp
+++ b/flang/lib/Semantics/check-omp-loop.cpp
@@ -37,6 +37,14 @@
#include
#include
+namespace Fortran::semantics {
+static bool IsLoopTransforming(llvm::omp::Directive dir);
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x);
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc);
+static std::optional CountGeneratedLoops(const parser::Block &block);
+} // namespace Fortran::semantics
+
namespace {
using namespace Fortran;
@@ -263,22 +271,19 @@ static bool IsLoopTransforming(llvm::omp::Directive dir) {
}
void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct
&x,
-const parser::Block &body, size_t &nestedCount) {
+const parser::Block &body) {
for (auto &stmt : body) {
if (auto *dir{parser::Unwrap(stmt)}) {
context_.Say(dir->source,
"Compiler directives are not allowed inside OpenMP loop
constructs"_warn_en_US);
-} else if (parser::Unwrap(stmt)) {
- ++nestedCount;
} else if (auto *omp{parser::Unwrap(stmt)}) {
if (!IsLoopTransforming(omp->BeginDir().DirName().v)) {
context_.Say(omp->source,
"Only loop-transforming OpenMP constructs are allowed inside
OpenMP loop constructs"_err_en_US);
}
- ++nestedCount;
} else if (auto *block{parser::Unwrap(stmt)}) {
- CheckNestedBlock(x, std::get(block->t), nestedCount);
-} else {
+ CheckNestedBlock(x, std::get(block->t));
+} else if (!parser::Unwrap(stmt)) {
parser::CharBlock source{parser::GetSource(stmt).value_or(x.source)};
context_.Say(source,
"OpenMP loop construct can only contain DO loops or
loop-nest-generating OpenMP constructs"_err_en_US);
@@ -286,16 +291,96 @@ void OmpStructureChecker::CheckNestedBlock(const
parser::OpenMPLoopConstruct &x,
}
}
+static bool IsFullUnroll(const parser::OpenMPLoopConstruct &x) {
+ const parser::OmpDirectiveSpecification &beginSpec{x.BeginDir()};
+
+ if (beginSpec.DirName().v == llvm::omp::Directive::OMPD_unroll) {
+return llvm::none_of(beginSpec.Clauses().v, [](const parser::OmpClause &c)
{
+ return c.Id() == llvm::omp::Clause::OMPC_partial;
+});
+ }
+ return false;
+}
+
+static std::optional CountGeneratedLoops(
+const parser::ExecutionPartConstruct &epc) {
+ if (parser::Unwrap(epc)) {
+return 1;
+ }
+
+ auto &omp{DEREF(parser::Unwrap(epc))};
+ const parser::OmpDirectiveSpecification &beginSpec{omp.BeginDir()};
+ llvm::omp::Directive dir{beginSpec.DirName().v};
+
+ // TODO: Handle split, apply.
+ if (IsFullUnroll(omp)) {
+return std::nullopt;
+ }
+ if (dir == llvm::omp::Directive::OMPD_fuse) {
+auto rangeAt{
+llvm::find_if(beginSpec.Clauses().v, [](const parser::OmpClause &c) {
+ return c.Id() == llvm::omp::Clause::OMPC_looprange;
+})};
+if (rangeAt == beginSpec.Clauses().v.end()) {
+ return std::nullopt;
+}
+
+auto *loopRange{parser::Unwrap(*rangeAt)};
+std::optional count{GetIntValue(std::get<1>(loopRange->t))};
+if (!count || *count <= 0) {
+ return std::nullopt;
+}
+if (auto nestedCount{CountGeneratedLoops(std::get(omp.t))})
{
+ return 1 + *nestedCount - static_cast(*count);
+} else {
+ return std::nullopt;
+}
+ }
+
+ // For every other loop construct return 1.
+ return 1;
+}
+
+static std::optional CountGeneratedLoops(const parser::Block &block) {
+ // Count the number of loops in the associated block. If there are any
+ // malformed construct in there, getting the numb
[llvm-branch-commits] [flang] [flang][OpenMP] Generalize checks of loop construct structure (PR #170735)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
```bash
git-clang-format --diff origin/main HEAD --extensions cpp,h -- flang/lib/Semantics/check-omp-loop.cpp flang/lib/Semantics/check-omp-structure.h --diff_from_common_commit
```
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
```diff
diff --git a/flang/lib/Semantics/check-omp-loop.cpp b/flang/lib/Semantics/check-omp-loop.cpp
index 6414f0028..917b2c501 100644
--- a/flang/lib/Semantics/check-omp-loop.cpp
+++ b/flang/lib/Semantics/check-omp-loop.cpp
@@ -270,8 +270,8 @@ static bool IsLoopTransforming(llvm::omp::Directive dir) {
}
}
-void OmpStructureChecker::CheckNestedBlock(const parser::OpenMPLoopConstruct &x,
-const parser::Block &body) {
+void OmpStructureChecker::CheckNestedBlock(
+const parser::OpenMPLoopConstruct &x, const parser::Block &body) {
for (auto &stmt : body) {
if (auto *dir{parser::Unwrap(stmt)}) {
context_.Say(dir->source,
diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h
index 267362b63..12f458b62 100644
--- a/flang/lib/Semantics/check-omp-structure.h
+++ b/flang/lib/Semantics/check-omp-structure.h
@@ -325,8 +325,8 @@ private:
void CheckLooprangeBounds(const parser::OpenMPLoopConstruct &x);
void CheckDistLinear(const parser::OpenMPLoopConstruct &x);
void CheckSIMDNest(const parser::OpenMPConstruct &x);
- void CheckNestedBlock(const parser::OpenMPLoopConstruct &x,
- const parser::Block &body);
+ void CheckNestedBlock(
+ const parser::OpenMPLoopConstruct &x, const parser::Block &body);
void CheckNestedConstruct(const parser::OpenMPLoopConstruct &x);
void CheckFullUnroll(const parser::OpenMPLoopConstruct &x);
void CheckTargetNest(const parser::OpenMPConstruct &x);
```
https://github.com/llvm/llvm-project/pull/170735
[llvm-branch-commits] [clang] [clang] Use tighter lifetime bounds for C temporary arguments (PR #170518)
https://github.com/ilovepi updated
https://github.com/llvm/llvm-project/pull/170518
>From b23585403b26d267cf21fcff4f3c550b1b0dd597 Mon Sep 17 00:00:00 2001
From: Paul Kirth
Date: Tue, 2 Dec 2025 15:14:32 -0800
Subject: [PATCH 1/7] [clang] Use tighter lifetime bounds for C temporary
arguments
In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanup. This leads to different behavior with
respect to where pushFullExprCleanUp inserts the lifetime end markers
(e.g., at the end of scope).
For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
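
The intended ordering of markers can be sketched with a toy event log. This is not clang code: `TempArg` and `emitCallEvents` are hypothetical, and the strings merely stand for the emitted `llvm.lifetime.start`/`llvm.lifetime.end` calls. Temporaries without a destructor end right after the call; those with a destructor stay alive until the end of the full expression.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one aggregate temporary passed as a call argument.
struct TempArg {
  std::string Name;
  bool HasDestructor;
};

// Returns the sequence of events for one call inside a full expression.
std::vector<std::string> emitCallEvents(const std::vector<TempArg> &Args) {
  std::vector<std::string> Events;
  std::vector<std::string> EndAfterCall;  // recorded on the argument list
  std::vector<std::string> EndAtFullExpr; // pushed as full-expression cleanups

  for (const TempArg &A : Args) {
    Events.push_back("lifetime.start " + A.Name);
    if (A.HasDestructor)
      EndAtFullExpr.push_back(A.Name);
    else
      EndAfterCall.push_back(A.Name);
  }

  Events.push_back("call");
  // Tighter bound: the slot can be reused as soon as the call returns.
  for (const std::string &N : EndAfterCall)
    Events.push_back("lifetime.end " + N);

  Events.push_back("end of full expression");
  for (const std::string &N : EndAtFullExpr)
    Events.push_back("lifetime.end " + N);
  return Events;
}
```

With two consecutive calls in the same scope, the first call's temporaries are already dead when the second call's temporaries start, which is what lets their stack slots overlap.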
---
clang/lib/CodeGen/CGCall.cpp | 18 ++
clang/lib/CodeGen/CGCall.h| 19 +++
clang/test/CodeGen/stack-usage-lifetimes.c| 12 ++--
.../CodeGenCXX/stack-reuse-miscompile.cpp | 2 +-
4 files changed, 40 insertions(+), 11 deletions(-)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 35e237c8eedbe..b6387d63a7657 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4972,11 +4972,16 @@ void CodeGenFunction::EmitCallArg(CallArgList &args,
const Expr *E,
RawAddress ArgSlotAlloca = Address::invalid();
ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
-// Emit a lifetime start/end for this temporary at the end of the full
-// expression.
+// Emit a lifetime start/end for this temporary. If the type has a
+// destructor, then we need to keep it alive for the full expression.
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
-EmitLifetimeStart(ArgSlotAlloca.getPointer()))
- pushFullExprCleanup(NormalAndEHCleanup, ArgSlotAlloca);
+EmitLifetimeStart(ArgSlotAlloca.getPointer())) {
+ if (E->getType().isDestructedType()) {
+pushFullExprCleanup(NormalAndEHCleanup,
ArgSlotAlloca);
+ } else {
+args.addLifetimeCleanup({ArgSlotAlloca.getPointer()});
+ }
+}
}
args.add(EmitAnyExpr(E, ArgSlot), type);
@@ -6306,6 +6311,11 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo
&CallInfo,
for (CallLifetimeEnd &LifetimeEnd : CallLifetimeEndAfterCall)
LifetimeEnd.Emit(*this, /*Flags=*/{});
+ if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries)
+for (const CallArgList::EndLifetimeInfo &LT :
+ CallArgs.getLifetimeCleanups())
+ EmitLifetimeEnd(LT.Addr);
+
if (!ReturnValue.isExternallyDestructed() &&
RetTy.isDestructedType() == QualType::DK_nontrivial_c_struct)
pushDestroy(QualType::DK_nontrivial_c_struct, Ret.getAggregateAddress(),
diff --git a/clang/lib/CodeGen/CGCall.h b/clang/lib/CodeGen/CGCall.h
index 1ef8a3f114573..aab4b64d6a4a8 100644
--- a/clang/lib/CodeGen/CGCall.h
+++ b/clang/lib/CodeGen/CGCall.h
@@ -299,6 +299,10 @@ class CallArgList : public SmallVector {
llvm::Instruction *IsActiveIP;
};
+ struct EndLifetimeInfo {
+llvm::Value *Addr;
+ };
+
void add(RValue rvalue, QualType type) { push_back(CallArg(rvalue, type)); }
void addUncopiedAggregate(LValue LV, QualType type) {
@@ -312,6 +316,9 @@ class CallArgList : public SmallVector {
llvm::append_range(*this, other);
llvm::append_range(Writebacks, other.Writebacks);
llvm::append_range(CleanupsToDeactivate, other.CleanupsToDeactivate);
+LifetimeCleanups.insert(LifetimeCleanups.end(),
+other.LifetimeCleanups.begin(),
+other.LifetimeCleanups.end());
assert(!(StackBase && other.StackBase) && "can't merge stackbases");
if (!StackBase)
StackBase = other.StackBase;
@@ -352,6 +359,14 @@ class CallArgList : public SmallVector {
/// memory.
bool isUsingInAlloca() const { return StackBase; }
+ void addLifetimeCleanup(EndLifetimeInfo Info) {
+LifetimeCleanups.push_back(Info);
+ }
+
+ ArrayRef<EndLifetimeInfo> getLifetimeCleanups() const {
+return LifetimeCleanups;
+ }
+
// Support reversing writebacks for MSVC ABI.
void reverseWritebacks() {
std::reverse(Writebacks.begin(), Writebacks.end());
@@ -365,6 +380,10 @@ class CallArgList : public SmallVector {
/// occurs.
SmallVector CleanupsToDeactivate;
+ /// Lifetime information needed to call llvm.lifetime.end for any temporary
+ /// argument allocas.
+ SmallVector LifetimeCleanups;
+
/// The stacksave call. It dominates all of the argument evaluation.
llvm::CallInst *StackBase = nullptr;
};
diff --git a/clang/test/CodeGen/stack-usage-lifetimes.c
b/clang/test/CodeGen/stack-usage-lifetimes.c
index 3787a29e4ce7d..189bc9c229ca4 100644
--- a/clang/test/CodeGen/stack-usage-lifetimes.c
+++ b/clang/test/CodeGen/stack-usage-lifetimes.c
@@ -40,11 +40,11 @@ void t1(int c) {
}
void t2(void) {
- // x
[llvm-branch-commits] [libcxx] Prepare libcxx and libcxxabi for pointer field protection. (PR #151651)
@@ -34,10 +34,13 @@ template
struct __libcpp_is_trivially_relocatable : is_trivially_copyable<_Tp> {};
#endif
+// __trivially_relocatable on libc++'s builtin types does not currently return
the right answer with PFP.
pcc wrote:
I looked at all the types and I think they would all be
non-trivially-relocatable with PFP because they can all contain pointer fields.
We could surround each `using __trivially_relocatable` with an `#ifndef
__POINTER_FIELD_PROTECTION__`, but that could make it easier to accidentally
introduce a PFP-incompatible type, which would hopefully be detectable via
testing, but it's possible that the tests will not trigger the bug. Disabling
it like this seemed like the most robust approach.
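
A minimal sketch of the "disable it in one place" approach described above, assuming the opt-in works through a member alias as in libc++'s `__libcpp_is_trivially_relocatable`. The names `is_trivially_relocatable_sketch`, `trivially_relocatable` and `OptedIn` are hypothetical, and the exact form of the guard in the PR may differ.

```cpp
#include <type_traits>

// Fallback: a type is treated as trivially relocatable if it is trivially
// copyable.
template <class T, class = void>
struct is_trivially_relocatable_sketch : std::is_trivially_copyable<T> {};

// Opt-in via a member alias naming the type itself. Guarding this single
// specialization disables every opt-in at once under pointer field
// protection, instead of wrapping each individual alias in an #ifndef.
#ifndef __POINTER_FIELD_PROTECTION__
template <class T>
struct is_trivially_relocatable_sketch<
    T, std::enable_if_t<std::is_same<T, typename T::trivially_relocatable>::value>>
    : std::true_type {};
#endif

// Example type: not trivially copyable, but opts in through the alias.
struct OptedIn {
  using trivially_relocatable = OptedIn;
  OptedIn() = default;
  OptedIn(const OptedIn &) {} // user-provided, hence non-trivial
  int *Ptr = nullptr;
};
```

With the guard in place, a newly added type that declares the alias cannot accidentally become PFP-incompatible, because the alias is simply ignored when the macro is defined.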
https://github.com/llvm/llvm-project/pull/151651
