[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)

2024-08-30 Thread Jeremy Morse via llvm-branch-commits

jmorse wrote:

> @jmorse What do you think about merging this PR to the release branch?

I feel this is safe to go into the release branch; I don't think it should 
block the 19.0.0 release though, better to have it in a point release. This 
fault has been present for roughly five months and has only just been 
uncovered, so IMHO it's not a pressing issue.

https://github.com/llvm/llvm-project/pull/106691
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)

2024-09-03 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

As with the previous rev, LGTM as a fix for an edge case. This situation has 
taken many months to expose, so I don't think it urgently needs to go into the 
imminent release, a point release should be fine.

https://github.com/llvm/llvm-project/pull/106952
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 6b2e4c5 - [DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs

2021-07-29 Thread Jeremy Morse via llvm-branch-commits

Author: Jeremy Morse
Date: 2021-07-29T15:08:13+01:00
New Revision: 6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c

URL: 
https://github.com/llvm/llvm-project/commit/6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c
DIFF: 
https://github.com/llvm/llvm-project/commit/6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c.diff

LOG: [DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs

When we have a terminator sequence (i.e. a tailcall or return),
MIIsInTerminatorSequence is used to work out where the preceding ABI-setup
instructions end, i.e. the parts that were glued to the terminator
instruction. This allows LLVM to split blocks safely without having to
worry about ABI stuff.

The function only ignores DBG_VALUE instructions, meaning that the two
debug instructions I recently added can end terminator sequences early,
causing various MachineVerifier errors. This patch promotes the test for
debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any
debug-info interfering with this function.

Differential Revision: https://reviews.llvm.org/D106660

(cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)

Added: 
llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll

Modified: 
llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 1415cce3b1dff..09627ee6a1645 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1660,7 +1660,7 @@ static bool MIIsInTerminatorSequence(const MachineInstr 
&MI) {
 // physical registers if there is debug info associated with the terminator
 // of our mbb. We want to include said debug info in our terminator
 // sequence, so we return true in that case.
-return MI.isDebugValue();
+return MI.isDebugInstr();
 
   // We have left the terminator sequence if we are not doing one of the
   // following:

diff  --git a/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll 
b/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll
new file mode 100644
index 0..f86e35061dc22
--- /dev/null
+++ b/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll
@@ -0,0 +1,66 @@
+; RUN: llc %s -o - -stop-after=finalize-isel -verify-machineinstrs 
-experimental-debug-variable-locations | FileCheck %s
+
+; In the sequence below, the sdiv is converted to a function call to __divsi3,
+; which is then tail call optimised. The dbg.value is suddenly stuck between
+; terminators, and the corresponding DBG_INSTR_REF is forced-placed to be
+; immediately before the TCRETURN.
+; However, with the function having the sspstrong attribute, we then try to
+; peel apart the terminator sequence, DBG_INSTR_REF is interpreted as being
+; a "real" instruction, and the stack check is inserted at that point rather
+; than before the copies-to-physreg setting up the call. This breaks the
+; code, and MachineVerifier complains.
+;
+; Check that the tail sequence is stack-protected, and split at the correct
+; position, ignoring the DBG_INSTR_REF
+
+target datalayout = 
"e-m:o-p:32:32-Fi8-f64:32:64-v64:32:64-v128:32:128-a:0:32-n32-S32"
+target triple = "thumbv7-apple-ios7.0.0"
+
+; CHECK-LABEL:  bb.0.entry:
+; CHECK:LOAD_STACK_GUARD
+
+; CHECK-LABEL:  bb.2.entry:
+; CHECK:tBL {{.*}} &__stack_chk_fail,
+
+; CHECK-LABEL:  bb.1.entry:
+; CHECK:  $r0 = COPY %0
+; CHECK-NEXT: $r1 = COPY %1
+; CHECK-NEXT: DBG_INSTR_REF 1, 0
+; CHECK-NEXT: TCRETURNdi &__divsi3, 0, implicit $sp, implicit $r0, 
implicit $r1
+
+declare i1 @ext()
+
+define i32 @test(i32 %a1, i32 %a2) #1 !dbg !5 {
+entry:
+  %foo = alloca i32, i32 %a1
+  %bool = call i1 @ext()
+  %res = sdiv i32 %a1, %a2
+  call void @llvm.dbg.value(metadata i32 %a1, metadata !13, metadata 
!DIExpression()), !dbg !16
+  ret i32 %res
+}
+
+attributes #1 = {sspstrong}
+
+; Function Attrs: nounwind readnone speculatable willreturn
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!3, !4}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !1, producer: 
"Swift", isOptimized: true, runtimeVersion: 5, emissionKind: FullDebug)
+!1 = !DIFile(filename: "foo.swift", directory: "/tmp")
+!2 = !{}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"Swift Minor Version", i8 3}
+!5 = distinct !DISubprogram(name: "n0", linkageName: "n1", scope: !7, file: 
!6, line: 86, type: !8, scopeLine: 86, spFlags: DISPFlagDefinition | 
DISPFlagOptimized, unit: !0)
+!6 = !DIFile(filename: "bar.swift", directory: "")
+!7 = !DIModule(scope: null, name: "Swift")
+!8 = !DISubroutineType(types: !9)
+!9 = !{!10, !12}
+!10 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Int", 
scope: !7, file: !11, size: 32, elements: !2, runtimeLang: DW_LANG_Swift, 
identifier: "$i1")
+!11 = !DIFile(f

[llvm-branch-commits] [lld] eff6e75 - [LLD][ELF] Correct test temporary file paths

2021-01-20 Thread Jeremy Morse via llvm-branch-commits

Author: Jeremy Morse
Date: 2021-01-20T11:46:11Z
New Revision: eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df

URL: 
https://github.com/llvm/llvm-project/commit/eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df
DIFF: 
https://github.com/llvm/llvm-project/commit/eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df.diff

LOG: [LLD][ELF] Correct test temporary file paths

In 8031785f4a7ebd the temporary object being built was moved to %t/main.o,
but not all run lines were updated to reflect this. Observe the failure
on this buildbot:

  http://lab.llvm.org:8011/#/builders/5/builds/3646/steps/9/logs/stdio

It might pass locally for some people due to a stale %t.o hanging around
the build directory.

Added: 


Modified: 
lld/test/ELF/wrap-shlib-undefined.s

Removed: 




diff  --git a/lld/test/ELF/wrap-shlib-undefined.s 
b/lld/test/ELF/wrap-shlib-undefined.s
index f46ebe36b779..0c4a79171e27 100644
--- a/lld/test/ELF/wrap-shlib-undefined.s
+++ b/lld/test/ELF/wrap-shlib-undefined.s
@@ -31,7 +31,7 @@
 # CHECK2-NEXT: NOTYPE  GLOBAL DEFAULT6 foo
 
 ## __wrap_bar is undefined.
-# RUN: ld.lld -shared %t.o --wrap=bar -o %t3.so
+# RUN: ld.lld -shared %t/main.o --wrap=bar -o %t3.so
 # RUN: llvm-readelf -r --dyn-syms %t3.so | FileCheck %s --check-prefix=CHECK3
 # CHECK3:  R_X86_64_JUMP_SLOT  __wrap_bar + 0
 # CHECK3:  Symbol table '.dynsym' contains 4 entries:
@@ -41,7 +41,7 @@
 # CHECK3-NEXT: NOTYPE  GLOBAL DEFAULT6 foo
 
 ## __wrap_bar is defined in %t/wrap.so.
-# RUN: ld.lld -shared %t.o %t/wrap.so --wrap=bar -o %t4.so
+# RUN: ld.lld -shared %t/main.o %t/wrap.so --wrap=bar -o %t4.so
 # RUN: llvm-readelf -r --dyn-syms %t4.so | FileCheck %s --check-prefix=CHECK4
 # CHECK4:  R_X86_64_JUMP_SLOT {{.*}} __wrap_bar + 0
 # CHECK4:  Symbol table '.dynsym' contains 4 entries:
@@ -50,7 +50,7 @@
 # CHECK4-NEXT: NOTYPE  GLOBAL DEFAULT6 _start
 # CHECK4-NEXT: NOTYPE  GLOBAL DEFAULT6 foo
 
-# RUN: ld.lld %t.o %t/wrap.so --wrap bar -o %t1
+# RUN: ld.lld %t/main.o %t/wrap.so --wrap bar -o %t1
 # RUN: llvm-readelf --dyn-syms %t1 | FileCheck %s --check-prefix=DYNSYM
 # RUN: llvm-objdump -d %t1 | FileCheck %s --check-prefix=ASM
 



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 914066f - [DebugInfo] Avoid LSR crash on large integer inputs

2021-01-05 Thread Jeremy Morse via llvm-branch-commits

Author: Jeremy Morse
Date: 2021-01-05T10:25:37Z
New Revision: 914066fe38a93c004b742a696ec337701eb738ec

URL: 
https://github.com/llvm/llvm-project/commit/914066fe38a93c004b742a696ec337701eb738ec
DIFF: 
https://github.com/llvm/llvm-project/commit/914066fe38a93c004b742a696ec337701eb738ec.diff

LOG: [DebugInfo] Avoid LSR crash on large integer inputs

Loop strength reduction tries to recover debug variable values by looking
for simple offsets from PHI values. In really extreme conditions there may
be an offset used that won't fit in an int64_t, hitting an APInt assertion.

This patch adds a regression test and adjusts the equivalent value
collecting code to filter out any values where the offset can't be
represented by an int64_t. This means that for very large integers with
very large offsets, the variable location will become undef, which is the
same behaviour as before 2a6782bb9f1 / D87494.

Differential Revision: https://reviews.llvm.org/D94016

Added: 
llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll

Modified: 
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp 
b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
index 7742ebafcac4..6e64a1ddf787 100644
--- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
@@ -5796,8 +5796,9 @@ static void DbgGatherEqualValues(Loop *L, ScalarEvolution 
&SE,
 if (!SE.isSCEVable(Phi.getType()))
   continue;
 auto PhiSCEV = SE.getSCEV(&Phi);
-if (Optional Offset =
-SE.computeConstantDifference(DbgValueSCEV, PhiSCEV))
+Optional Offset =
+SE.computeConstantDifference(DbgValueSCEV, PhiSCEV);
+if (Offset && Offset->getMinSignedBits() <= 64)
   EqSet.emplace_back(std::make_tuple(
   &Phi, Offset.getValue().getSExtValue(), DVI->getExpression()));
   }

diff  --git a/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll 
b/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll
new file mode 100644
index ..d651ff239a46
--- /dev/null
+++ b/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll
@@ -0,0 +1,73 @@
+; RUN: opt < %s -loop-reduce -S | FileCheck %s
+;
+; Test that LSR avoids crashing on very large integer inputs. It should
+; discard the variable location by creating an undef dbg.value.
+;
+; CHECK: call void @llvm.dbg.value(metadata i128 undef,
+
+source_filename = ""
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @qux() local_unnamed_addr !dbg !10 {
+entry:
+  br label %for.body, !dbg !17
+
+for.body: ; preds = 
%for.body.for.body_crit_edge, %entry
+  %0 = phi i128 [ 0, %entry ], [ %.pre8, %for.body.for.body_crit_edge ], !dbg 
!19
+  %add6.i.i = add i128 %0, 18446744073709551615, !dbg !35
+  call void @llvm.dbg.value(metadata i128 %add6.i.i, metadata !25, metadata 
!DIExpression(DW_OP_LLVM_fragment, 0, 128)), !dbg !36
+  br label %for.body.for.body_crit_edge
+
+for.body.for.body_crit_edge:  ; preds = %for.body
+  %.pre8 = load i128, i128* undef, align 16, !dbg !19, !tbaa !37
+  br label %for.body, !dbg !17
+}
+
+; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!7, !8, !9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: 
"clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: 
!2, retainedTypes: !3, globals: !2, splitDebugInlining: false, nameTableKind: 
None)
+!1 = !DIFile(filename: "/tmp/beans.c", directory: "/tmp")
+!2 = !{}
+!3 = !{!4}
+!4 = !DIDerivedType(tag: DW_TAG_typedef, name: "uint64_t", file: !5, line: 1, 
baseType: !6)
+!5 = !DIFile(filename: "/tmp/beans.c", directory: "")
+!6 = !DIBasicType(name: "long unsigned int", size: 64, encoding: 
DW_ATE_unsigned)
+!7 = !{i32 7, !"Dwarf Version", i32 4}
+!8 = !{i32 2, !"Debug Info Version", i32 3}
+!9 = !{i32 1, !"wchar_size", i32 4}
+!10 = distinct !DISubprogram(name: "qux", scope: !5, file: !5, line: 27, type: 
!11, scopeLine: 27, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: 
DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null, !13, !13}
+!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64)
+!14 = !DIDerivedType(tag: DW_TAG_typedef, name: "croix", file: !5, line: 2, 
baseType: !15)
+!15 = !DIDerivedType(tag: DW_TAG_typedef, name: "__uint128_t", file: !1, 
baseType: !16)
+!16 = !DIBasicType(name: "unsigned __int128", size: 128, encoding: 
DW_ATE_unsigned)
+!17 = !DILocation(line: 30, column: 3, scope: !18)
+

[llvm-branch-commits] [llvm] df2b9a3 - [DebugInfo] Avoid re-ordering assignments in LCSSA

2020-12-17 Thread Jeremy Morse via llvm-branch-commits

Author: Nabeel Omer
Date: 2020-12-17T16:17:32Z
New Revision: df2b9a3e02ca3bd7b60af6c65571909a7d3ab317

URL: 
https://github.com/llvm/llvm-project/commit/df2b9a3e02ca3bd7b60af6c65571909a7d3ab317
DIFF: 
https://github.com/llvm/llvm-project/commit/df2b9a3e02ca3bd7b60af6c65571909a7d3ab317.diff

LOG: [DebugInfo] Avoid re-ordering assignments in LCSSA

The LCSSA pass makes use of a function insertDebugValuesForPHIs() to
propogate dbg.value() intrinsics to newly inserted PHI instructions. Faulty
behaviour occurs when the parent PHI of a newly inserted PHI is not the
most recent assignment to a source variable. insertDebugValuesForPHIs ends
up propagating a value that isn't the most recent assignemnt.

This change removes the call to insertDebugValuesForPHIs() from LCSSA,
preventing incorrect dbg.value intrinsics from being propagated.
Propagating variable locations between blocks will occur later, during
LiveDebugValues.

Differential Revision: https://reviews.llvm.org/D92576

Added: 
llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll

Modified: 
llvm/lib/Transforms/Utils/LCSSA.cpp
llvm/test/Transforms/LCSSA/basictest.ll
llvm/test/Transforms/LoopIdiom/X86/left-shift-until-bittest.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Utils/LCSSA.cpp 
b/llvm/lib/Transforms/Utils/LCSSA.cpp
index 1bcf40e11d10..a601ec9349e0 100644
--- a/llvm/lib/Transforms/Utils/LCSSA.cpp
+++ b/llvm/lib/Transforms/Utils/LCSSA.cpp
@@ -265,15 +265,11 @@ bool 
llvm::formLCSSAForInstructions(SmallVectorImpl &Worklist,
 Worklist.push_back(PostProcessPN);
 
 // Keep track of PHI nodes that we want to remove because they did not have
-// any uses rewritten. If the new PHI is used, store it so that we can
-// try to propagate dbg.value intrinsics to it.
-SmallVector NeedDbgValues;
+// any uses rewritten.
 for (PHINode *PN : AddedPHIs)
   if (PN->use_empty())
 LocalPHIsToRemove.insert(PN);
-  else
-NeedDbgValues.push_back(PN);
-insertDebugValuesForPHIs(InstBB, NeedDbgValues);
+
 Changed = true;
   }
 

diff  --git a/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll 
b/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll
new file mode 100644
index ..b140bc96f5d0
--- /dev/null
+++ b/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll
@@ -0,0 +1,57 @@
+; RUN: opt < %s -lcssa -S | FileCheck %s
+
+; This test ensures that LCSSA does not insert dbg.value intrinsics using
+; insertDebugValuesForPHIs() which effectively cause assignments to be
+; re-ordered.
+; See PR48206 for more information.
+
+define dso_local i32 @_Z5lcssab(i1 zeroext %S2) {
+entry:
+  br label %loop.interior
+
+loop.interior:; preds = %post.if, %entry
+  br i1 %S2, label %if.true, label %if.false
+
+if.true:  ; preds = %loop.interior
+  %X1 = add i32 0, 0
+  br label %post.if
+
+if.false: ; preds = %loop.interior
+  %X2 = add i32 0, 1
+  br label %post.if
+
+post.if:  ; preds = %if.false, %if.true
+  %X3 = phi i32 [ %X1, %if.true ], [ %X2, %if.false ], !dbg !21
+  call void @llvm.dbg.value(metadata i32 %X3, metadata !9, metadata 
!DIExpression()), !dbg !21
+  %Y1 = add i32 4, %X3, !dbg !22
+  call void @llvm.dbg.value(metadata i32 %Y1, metadata !9, metadata 
!DIExpression()), !dbg !22
+  br i1 %S2, label %loop.exit, label %loop.interior, !dbg !23
+
+loop.exit:; preds = %post.if
+; CHECK: loop.exit:
+; CHECK-NEXT: %X3.lcssa = phi i32
+; CHECK-NOT: call void @llvm.dbg.value
+  %X4 = add i32 3, %X3
+  ret i32 %X4
+}
+
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!0}
+!llvm.debugify = !{!3, !4}
+!llvm.module.flags = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: 
"debugify and Author", isOptimized: true, runtimeVersion: 0, emissionKind: 
FullDebug, enums: !2)
+!1 = !DIFile(filename: "./testcase.ll", directory: "/")
+!2 = !{}
+!3 = !{i32 11}
+!4 = !{i32 5}
+!5 = !{i32 2, !"Debug Info Version", i32 3}
+!6 = distinct !DISubprogram(name: "_Z5lcssab", linkageName: "_Z5lcssab", 
scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: 
DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !{!9})
+!7 = !DISubroutineType(types: !2)
+!9 = !DILocalVariable(name: "var", scope: !6, file: !1, line: 3, type: !10)
+!10 = !DIBasicType(name: "ty32", size: 32, encoding: DW_ATE_unsigned)
+!21 = !DILocation(line: 7, column: 1, scope: !6)
+!22 = !DILocation(line: 8, column: 1, scope: !6)
+!23 = !DILocation(line: 9, column: 1, scope: !6)
+

diff  --git a/llvm/test/Transforms/LCSSA/basictest.ll 
b/llvm/test/Transforms/LCSSA/basictest.ll
index 7ca552039b63..30d5e762eedf 100644
--- a/llvm/test/Transforms/LCSSA/basictest.ll
+++ b/

[llvm-branch-commits] [llvm] cda32ab - [DebugInfo][Docs] Document MIR language debug-info constructs

2020-12-08 Thread Jeremy Morse via llvm-branch-commits

Author: Jeremy Morse
Date: 2020-12-08T11:01:55Z
New Revision: cda32aba4f46033eea4eea6eda703459dc3c7a1e

URL: 
https://github.com/llvm/llvm-project/commit/cda32aba4f46033eea4eea6eda703459dc3c7a1e
DIFF: 
https://github.com/llvm/llvm-project/commit/cda32aba4f46033eea4eea6eda703459dc3c7a1e.diff

LOG: [DebugInfo][Docs] Document MIR language debug-info constructs

This patch documents the MIR syntax for a number of things relevant to
debugging information:
 * Trailing 'debug-location' metadata that becomes a DebugLoc,
 * Variable location metadata for stack slots,
 * Syntax for DBG_VALUE metainstructions,
 * Syntax for DBG_INSTR_REF, including trailing instruction numbers
   attached to MIR instructions.

Differential Revision: https://reviews.llvm.org/D89337

Added: 


Modified: 
llvm/docs/MIRLangRef.rst

Removed: 




diff  --git a/llvm/docs/MIRLangRef.rst b/llvm/docs/MIRLangRef.rst
index 0380bcced5fc..ff8029c5c646 100644
--- a/llvm/docs/MIRLangRef.rst
+++ b/llvm/docs/MIRLangRef.rst
@@ -790,3 +790,105 @@ For an int eq predicate ``ICMP_EQ``, the syntax is:
instructions debug location attribute.
 .. TODO: Describe the syntax of the register live out machine operands.
 .. TODO: Describe the syntax of the machine memory operands.
+
+Debug-Info constructs
+-
+
+Most of the debugging information in a MIR file is to be found in the metadata
+of the embedded module. Within a machine function, that metadata is referred to
+by various constructs to describe source locations and variable locations.
+
+Source locations
+
+
+Every MIR instruction may optionally have a trailing reference to a
+``DILocation`` metadata node, after all operands and symbols, but before
+memory operands:
+
+.. code-block:: text
+
+   $rbp = MOV64rr $rdi, debug-location !12
+
+The source location attachment is synonymous with the ``!dbg`` metadata
+attachment in LLVM-IR. The absence of a source location attachment will be
+represented by an empty ``DebugLoc`` object in the machine instruction.
+
+Fixed variable locations
+
+
+There are several ways of specifying variable locations. The simpliest is
+describing a variable that is permanently located on the stack. In the stack
+or fixedStack attribute of the machine function, the variable, scope, and
+any qualifying location modifier are provided:
+
+.. code-block:: text
+
+- { id: 0, name: offset.addr, offset: -24, size: 8, alignment: 8, 
stack-id: default,
+ 4  debug-info-variable: '!1', debug-info-expression: '!DIExpression()',
+debug-info-location: '!2' }
+
+Where:
+
+- ``debug-info-variable`` identifies a DILocalVariable metadata node,
+
+- ``debug-info-expression`` adds qualifiers to the variable location,
+
+- ``debug-info-location`` identifies a DILocation metadata node.
+
+These metadata attributes correspond to the operands of a ``llvm.dbg.declare``
+IR intrinsic, see the :ref:`source level debugging`
+documentation.
+
+Varying variable locations
+^^
+
+Variables that are not always on the stack or change location are specified
+with the ``DBG_VALUE``  meta machine instruction. It is synonymous with the
+``llvm.dbg.value`` IR intrinsic, and is written:
+
+.. code-block:: text
+
+DBG_VALUE $rax, $noreg, !123, !DIExpression(), debug-location !456
+
+The operands to which respectively:
+
+1. Identifies a machine location such as a register, immediate, or frame index,
+
+2. Is either $noreg, or immediate value zero if an extra level of indirection 
is to be added to the first operand,
+
+3. Identifies a ``DILocalVariable`` metadata node,
+
+4. Specifies an expression qualifying the variable location, either inline or 
as a metadata node reference,
+
+While the source location identifies the ``DILocation`` for the scope of the
+variable. The second operand (``IsIndirect``) is deprecated and to be deleted.
+All additional qualifiers for the variable location should be made through the
+expression metadata.
+
+Instruction referencing locations
+^
+
+This experimental feature aims to separate the specification of variable
+*values* from the program point where a variable takes on that value. Changes
+in variable value occur in the same manner as ``DBG_VALUE`` meta instructions
+but using ``DBG_INSTR_REF``. Variable values are identified by a pair of
+instruction number and operand number. Consider the example below:
+
+.. code-block:: text
+
+$rbp = MOV64ri 0, debug-instr-number 1, debug-location !12
+DBG_INSTR_REF 1, 0, !123, !DIExpression(), debug-location !456
+
+Instruction numbers are directly attached to machine instructions with an
+optional ``debug-instr-number`` attachment, before the optional
+``debug-location`` attachment. The value defined in ``$rbp`` in the code
+above would be identified by the pair ``<1, 0>``.
+
+The first two operands of the ``

[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -23,8 +24,8 @@ void f(int m) {
   }
 
   // CHECK:  TypeSanitizer: type-aliasing-violation on address
-  // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing 
object of type long long
-  // CHECK-NEXT:in f violation-pr47137.c:30
+  // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing 
object of type {{(long)+}}

jmorse wrote:

IMO having an optional (with `?`) long is preferable to having an unbound 
number of longs.

https://github.com/llvm/llvm-project/pull/96507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -18,7 +18,7 @@ int main(void) {
 
   // CHECK:  TypeSanitizer: type-aliasing-violation on address
   // CHECK-NEXT: WRITE of size 8 at {{.+}} with type double accesses an 
existing object of type float
-  // CHECK-NEXT:   in main violation-pr45282.c:25
+  // CHECK-NEXT:   in main {{.*violation-pr45282.c:25.*}}

jmorse wrote:

Nit: IMO having `{{.*}}` to swallow extra path components is slightly 
preferable, as that avoids the filename being part of a regex. Almost certainly 
not ever going to be a problem, but it reduces what can go wrong. Does your 
environment also have trailing components after the line number? If there's any 
whitespace between '5' and the extra text then FileCheck will (99% confident) 
ignore those anyway.

Similar thoughts apply to all the other similar regexes.

https://github.com/llvm/llvm-project/pull/96507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+// When shadow memory is set for global objects, the entire object is 
tagged with the struct type
+// This means that when you access a member variable, tysan reads that as 
you accessing a struct midway
+// through, with 'i' being the offset
+// Therefore, if you are accessing a struct, we need to find the member 
type. We can go through the
+// members of the struct type and see if there is a member at the offset 
you are accessing the struct by.
+// If there is indeed a member starting at offset 'i' in the struct, we 
should check aliasing legality
+// with that type. If there isn't, we run alias checking on the struct 
with will give us the correct error.
+tysan_type_descriptor *InternalMember = OldTD;
+if (OldTD->Tag == TYSAN_STRUCT_TD) {
+  for (int j = 0; j < OldTD->Struct.MemberCount; j++) {

jmorse wrote:

Style guide says `++j`

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

I'm not familiar with TySan, but here's some style feedback and similar inline.

The comments in tysan.cpp are too long (clang-format-bot would have bitten you 
if it was live when you uploaded this), please clang-format.

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,31 @@
+// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct X {
+  int a, b, c;
+} x;
+
+static struct X xArray[2];
+
+int main() {
+  x.a = 1;
+  x.b = 2;
+  x.c = 3;
+
+  printf("%d %d %d\n", x.a, x.b, x.c);
+  // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation
+
+  for (size_t i = 0; i < 2; i++) {
+xArray[i].a = 1;
+xArray[i].b = 1;
+xArray[i].c = 1;
+  }
+
+  struct X *xPtr = (struct X *)&(xArray[0].c);
+  xPtr->a = 1;
+  // CHECK: ERROR: TypeSanitizer: type-aliasing-violation
+  // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) 
accesses an existing object of type int (in X at offset 8)
+  // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]]

jmorse wrote:

Is it possible to strengthen this test even further by making the latter two 
"CHECK-NEXT", to ensure the error appears all in one block? (Reduces the risk 
of spurious extra output matching CHECK lines).

I'd recommend reducing the portions of the CHECK-line that are regexes to be as 
small as possible, so `{{.*}}struct-members.c` and similar.

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+// When shadow memory is set for global objects, the entire object is 
tagged with the struct type
+// This means that when you access a member variable, tysan reads that as 
you accessing a struct midway
+// through, with 'i' being the offset
+// Therefore, if you are accessing a struct, we need to find the member 
type. We can go through the
+// members of the struct type and see if there is a member at the offset 
you are accessing the struct by.
+// If there is indeed a member starting at offset 'i' in the struct, we 
should check aliasing legality
+// with that type. If there isn't, we run alias checking on the struct 
with will give us the correct error.

jmorse wrote:

```suggestion
// with that type. If there isn't, we run alias checking on the struct 
which will give us the correct error.
```
Doesn't parse without this change?

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -221,7 +221,17 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+tysan_type_descriptor *InternalMember = OldTD;

jmorse wrote:

I feel like you should name the variable declared here "AccessedType" or 
similar, as the default-path is to check access to OldTD rather than an 
internal member of it, right?

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,31 @@
+// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct X {
+  int a, b, c;
+} x;
+
+static struct X xArray[2];
+
+int main() {
+  x.a = 1;
+  x.b = 2;
+  x.c = 3;
+
+  printf("%d %d %d\n", x.a, x.b, x.c);
+  // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation

jmorse wrote:

I'm not familiar with the output of TySan, but this CHECK-NOT paired with the 
CHECK on line 28 is going to test that "there's no error until there's an 
error", which makes this specific CHECK-NOT redundant. If you're looking to 
ensure there's only one instance of that error-output then you need to put 
another CHECK-NOT under the CHECK lines below, or add `--implicit-check-not` to 
FileCheck. Also consider just not having this CHECK-NOT line, if TySan will 
never produce more than one error.

https://github.com/llvm/llvm-project/pull/95387
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/108385
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last 
type

jmorse wrote:

Space at start of comment; better to write the comment in a passive tone, "The 
offset can't be negative, thus we must be..."

https://github.com/llvm/llvm-project/pull/108385
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *iPtr) {
+   iPtr->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+printf("%d\n", access_offsets_with_different_base);
+
+return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation

jmorse wrote:

Nit: please also include a test for /something/ in the output to ensure that 
the test is positively checking for something. Otherwise we could replace 
%clangxx_tysan with any program that succeeds , such as `true`, and this test 
would pass.

https://github.com/llvm/llvm-project/pull/108385
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-11-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

(Minor style and test feedbacks)

https://github.com/llvm/llvm-project/pull/108385
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Are there any expected interactions between atom-group-numbers and loading 
bitcode? i.e., if we serialise the literal atom-group-number to the output and 
then read it back in again, then it might conflict with atom-group-numbers seen 
in other functions in other bitcode files. It doesn't appear that they get 
re-numbered in the textual IR parsing patch for example.

Possibly part of the design here is to simply not care, if it's only about 
internal consistency within a Function (does that hold after inlining too). 
Apologies if this is all explained in a later patch.

The answers to that should ultimately be documented somewhere; I imagine that's 
in the patch stack or coming later.

https://github.com/llvm/llvm-project/pull/133478
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133478
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -1366,6 +1367,43 @@ TEST_F(DILocationTest, discriminatorSpecialCases) {
   EXPECT_EQ(std::nullopt, L4->cloneByMultiplyingDuplicationFactor(0x1000));
 }
 
+TEST_F(DILocationTest, KeyInstructions) {
+  Context.pImpl->NextAtomGroup = 1;
+
+  EXPECT_EQ(Context.pImpl->NextAtomGroup, 1u);
+  DILocation *A1 = DILocation::get(Context, 1, 0, getSubprogram(), nullptr, 
false, 1, 2);
+  // The group is only applied to the DILocation if the build has opted into
+  // the additional DILocation fields needed for the feature.

jmorse wrote:

Style nit: I feel "the build has opted into" is a bit too abstract, and is like 
the code referring to itself in the third person. "if we have been built 
with..." feels a lot cleaner IMHO, YMMV.

https://github.com/llvm/llvm-project/pull/133478
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -335,6 +335,14 @@ class LLVMContext {
   StringRef getDefaultTargetFeatures();
   void setDefaultTargetFeatures(StringRef Features);
 
+  /// Key Instructions: update the highest number atom group emitted for any
+  /// function.
+  void updateAtomGroupWaterline(uint64_t G);
+
+  /// Key Instructions: get the next free atom group number and increment
+  /// the global tracker.
+  uint64_t incNextAtomGroup();
+

jmorse wrote:

IMO in isolation it's not clear that this is to do with debugging information 
and source locations; could we shoe-horn `DILocation` into the comments to make 
it clear what it affects?

(Thinking purely about someone stumbling on this and not immediately knowing 
whether it's relevant to what they're studying)

https://github.com/llvm/llvm-project/pull/133478
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -87,6 +87,8 @@ class ValueMap {
   using ValueMapCVH = ValueMapCallbackVH;
   using MapT = DenseMap>;
   using MDMapT = DenseMap;
+  /// Map {(InlinedAt, old atom number) -> new atom number}.
+  using DMAtomT = DenseMap, uint64_t>;

jmorse wrote:

Consider using SmallDenseMap simply to reduce the initial allocations in the 
non-debug-info codepath?

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

```suggestion
  /// don't use this flag. It's used when remapping is known to be un-necessary
  /// to save some compile-time.
```

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -117,9 +118,21 @@ struct ClonedCodeInfo {
 /// If you would like to collect additional information about the cloned
 /// function, you can specify a ClonedCodeInfo object with the optional fifth
 /// parameter.
+///
+/// Set \p MapAtoms to false to skip mapping source atoms for later remapping.

jmorse wrote:

IMO "source-location atoms" to make it even clearer that this is a debugging 
feature.

Also IMO it's better to discuss when this flag is necessary instead of when 
it's not necessary, as it'll enlighten the reader what it's for. AFAIUI, 
something like "Must be true when you duplicate a code path and a source line 
is intended to appear twice in the generated instructions. Can be set to false 
if you are transplanting code from one place to another".

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

IMO suggesting that it's not related to correctness is misleading, because the 
presence/absence of the flag can lead to correctness issues. Better to just 
avoid saying that and say something else as suggested.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned

jmorse wrote:

IMO needs "source location atom" instead of just "atom" to ensure the random 
reader knows it's about debug-info.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM, code and tests are good. As ever all my comments are about comments and 
maintainability!

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


@@ -284,6 +291,9 @@ inline void RemapInstruction(Instruction *I, 
ValueToValueMapTy &VM,
   .remapInstruction(*I);
 }
 
+/// Remap source atom. Called by RemapInstruction.

jmorse wrote:

IMO too terse; needs some purpose and context.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)

2025-03-17 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM, although I'm not especially familiar with the transition to the new pass 
manager.

https://github.com/llvm/llvm-project/pull/131563
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)

2025-03-17 Thread Jeremy Morse via llvm-branch-commits


@@ -77,36 +78,68 @@ class LiveDebugValues : public MachineFunctionPass {
 AU.setPreservesCFG();
 MachineFunctionPass::getAnalysisUsage(AU);
   }
+};
+
+struct LiveDebugValues {
+  LiveDebugValues();
+  ~LiveDebugValues() = default;
+  bool run(MachineFunction &MF, bool ShouldEmitDebugEntryValues);
 
 private:
   std::unique_ptr InstrRefImpl;
   std::unique_ptr VarLocImpl;
-  TargetPassConfig *TPC = nullptr;
   MachineDominatorTree MDT;
 };
 } // namespace
 
-char LiveDebugValues::ID = 0;
+char LiveDebugValuesLegacy::ID = 0;
 
-char &llvm::LiveDebugValuesID = LiveDebugValues::ID;
+char &llvm::LiveDebugValuesID = LiveDebugValuesLegacy::ID;
 
-INITIALIZE_PASS(LiveDebugValues, DEBUG_TYPE, "Live DEBUG_VALUE analysis", 
false,
-false)
+INITIALIZE_PASS(LiveDebugValuesLegacy, DEBUG_TYPE, "Live DEBUG_VALUE analysis",
+false, false)
 
 /// Default construct and initialize the pass.
-LiveDebugValues::LiveDebugValues() : MachineFunctionPass(ID) {
-  initializeLiveDebugValuesPass(*PassRegistry::getPassRegistry());
+LiveDebugValuesLegacy::LiveDebugValuesLegacy() : MachineFunctionPass(ID) {
+  initializeLiveDebugValuesLegacyPass(*PassRegistry::getPassRegistry());
+}
+
+LiveDebugValues::LiveDebugValues() {
   InstrRefImpl =
   std::unique_ptr(llvm::makeInstrRefBasedLiveDebugValues());
   VarLocImpl = 
std::unique_ptr(llvm::makeVarLocBasedLiveDebugValues());
 }
 
-bool LiveDebugValues::runOnMachineFunction(MachineFunction &MF) {
+PreservedAnalyses
+LiveDebugValuesPass::run(MachineFunction &MF,
+ MachineFunctionAnalysisManager &MFAM) {
+  if (!LiveDebugValues().run(MF, ShouldEmitDebugEntryValues))
+return PreservedAnalyses::all();
+  auto PA = getMachineFunctionPassPreservedAnalyses();

jmorse wrote:

As far as I'm aware this should preserve all information in analyses -- LDV 
will insert new DBG_VALUE instructions and should do absolutely nothing else.

https://github.com/llvm/llvm-project/pull/131563
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)

2025-03-17 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/131563
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Looks good, although I think the atom-group selection could be refactored so 
that it's simpler to read. There's already multiple levels of nesting happening 
in that function, best to reduce that as much as possible (possibly put it in a 
completely different helper function to separate the concerns?)

https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation 
*LocA, DILocation *LocB) {
 bool SameCol = L1->getColumn() == L2->getColumn();
 unsigned Line = SameLine ? L1->getLine() : 0;
 unsigned Col = SameLine && SameCol ? L1->getColumn() : 0;
-
-return DILocation::get(C, Line, Col, Scope, InlinedAt);
+bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode();
+uint64_t Group = 0;
+uint64_t Rank = 0;
+if (SameLine) {
+  if (L1->getAtomGroup() || L2->getAtomGroup()) {

jmorse wrote:

Early-exit the lambda/delegate instead?

https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) {
 auto *M2 = DILocation::getMergedLocation(A2, B);
 EXPECT_EQ(M1, M2);
   }
+
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(Group, M->getAtomGroup()); 
\
+  EXPECT_EQ(Rank, M->getAtomRank());
+#else
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(0u, M->getAtomGroup());
\
+  EXPECT_EQ(0u, M->getAtomRank()); 
\
+  (void)Group; 
\
+  (void)Rank;
+#endif
+  // Identical, including source atom numbers.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+// DILocations are uniqued, so we can check equality by ptr.
+EXPECT_EQ(M, DILocation::getMergedLocation(A, B));
+  }
+
+  // Identical but different atom ranks (same atom) - choose the lowest nonzero
+  // rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but different atom ranks (different atom) - choose the lowest
+  // nonzero rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but equal atom rank (different atom) - choose the lowest 
non-zero
+  // group (arbitrary choice for deterministic behaviour).
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Completely different except same atom numbers. Zero out the atoms.
+  {
+auto *I = DILocation::get(Context, 2, 7, N);
+auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1);
+auto *B =
+DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_EQ(0u, M->getLine());
+EXPECT_EQ(0u, M->getColumn());
+EXPECT_TRUE(isa(M->getScope()));
+EXPECT_EQ(S, M->getScope());
+EXPECT_EQ(nullptr, M->getInlinedAt());
+  }
+
+  // Same inlined-at chain but different atoms. Choose the lowest
+  // non-zero group (arbitrary choice for deterministic behaviour).
+  {
+auto *I = DILocation::get(Context, 1, 7, N);
+auto *F = getSubprogram();
+auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2);
+auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 1, 1, F, I, false, 1, 2);
+B = DILocation::get(Context, 1, 1, F, I, false, 2, 0);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Partially equal inlined-at chain but different atoms. Generate a new atom
+  // group (if either have a group number). This configuration seems unlikely
+  // to occur as line numbers must match, but isn't impossible.
+  {

jmorse wrote:

I believe it's a desired and supported configuration for this to occur when a 
function is inlined multiple times then the copies are merged: we keep the line 
number of the instruction, but put line-zero in the inlinedAt field so that the 
inlining-callstack refers to a no

[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) {
 auto *M2 = DILocation::getMergedLocation(A2, B);
 EXPECT_EQ(M1, M2);
   }
+
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(Group, M->getAtomGroup()); 
\
+  EXPECT_EQ(Rank, M->getAtomRank());
+#else
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(0u, M->getAtomGroup());
\
+  EXPECT_EQ(0u, M->getAtomRank()); 
\
+  (void)Group; 
\
+  (void)Rank;
+#endif
+  // Identical, including source atom numbers.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+// DILocations are uniqued, so we can check equality by ptr.
+EXPECT_EQ(M, DILocation::getMergedLocation(A, B));
+  }
+
+  // Identical but different atom ranks (same atom) - choose the lowest nonzero
+  // rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but different atom ranks (different atom) - choose the lowest
+  // nonzero rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but equal atom rank (different atom) - choose the lowest 
non-zero
+  // group (arbitrary choice for deterministic behaviour).
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Completely different except same atom numbers. Zero out the atoms.
+  {
+auto *I = DILocation::get(Context, 2, 7, N);
+auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1);
+auto *B =
+DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_EQ(0u, M->getLine());
+EXPECT_EQ(0u, M->getColumn());
+EXPECT_TRUE(isa(M->getScope()));
+EXPECT_EQ(S, M->getScope());
+EXPECT_EQ(nullptr, M->getInlinedAt());
+  }
+
+  // Same inlined-at chain but different atoms. Choose the lowest
+  // non-zero group (arbitrary choice for deterministic behaviour).
+  {
+auto *I = DILocation::get(Context, 1, 7, N);
+auto *F = getSubprogram();
+auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2);
+auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);

jmorse wrote:

This is an awkwardness I forsee in the future: because these are just integers, 
I tend to forget which way around things are. And so in this example: isn't the 
first integer to EXPECT_ATOM the group, which according to the comment here 
should be the lowest, but you're expecting the highest (2 instead of 1)? (Looks 
like the rank-test is applied before the atom test?)

https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Looks good, question inline. I worried a little about whether someone out there 
might inadvertantly re-use this function for some other purpose, but it looks 
like it's dedicated to `llvm::foldBranchToCommonDest`, so pretty much a single 
use case.

https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][debugify] Add --debugify-atoms to add key instructions metadata (PR #133483)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/133483
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -1129,13 +1130,17 @@ static void 
cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
 
 Instruction *NewBonusInst = BonusInst.clone();
 
-if (!isa(BonusInst) &&
-PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) {
-  // Unless the instruction has the same !dbg location as the original
-  // branch, drop it. When we fold the bonus instructions we want to make
-  // sure we reset their debug locations in order to avoid stepping on
-  // dead code caused by folding dead branches.
-  NewBonusInst->setDebugLoc(DebugLoc());
+if (!isa(BonusInst)) {
+  if (!NewBonusInst->getDebugLoc().isSameSourceLocation(
+  PTI->getDebugLoc())) {
+// Unless the instruction has the same !dbg location as the original
+// branch, drop it. When we fold the bonus instructions we want to make
+// sure we reset their debug locations in order to avoid stepping on
+// dead code caused by folding dead branches.
+NewBonusInst->setDebugLoc(DebugLoc());
+  } else if (const DebugLoc &DL = NewBonusInst->getDebugLoc()) {
+mapAtomInstance(DL, VMap);

jmorse wrote:

What are the consequences for this when stepping -- this is an abnormal 
situation seeing how the source location is identical to where it's being 
hoisted to. Will developers potentially step on the same source location twice?

I understand the general situation of "code that is duplicated needs new atoms, 
because there are now multiple key instructions", but wouldn't this be 
different when the source-location at PTI potentially becomes key _because_ 
we've remapped a key instruction we've hoisted to beneath it?

https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,81 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt %s -S -passes=simplifycfg -bonus-inst-threshold=2 | FileCheck %s
+
+;; Block d gets folded into preds b and c. Check the cloned instructions get

jmorse wrote:

Copy the block diagram from this review into here?

https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/133484
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM with nit

https://github.com/llvm/llvm-project/pull/133485
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133485
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -1821,9 +1821,9 @@ static DebugLoc inlineDebugLoc(DebugLoc OrigDL, 
DILocation *InlinedAt,
 /// to encode location where these instructions are inlined.
 static void fixupLineNumbers(Function *Fn, Function::iterator FI,
  Instruction *TheCall, bool CalleeHasDebugInfo) {
-  const DebugLoc &TheCallDL = TheCall->getDebugLoc();
-  if (!TheCallDL)
+  if (!TheCall->getDebugLoc())
 return;
+  DebugLoc TheCallDL = TheCall->getDebugLoc().get()->getOrCloneWithoutAtom();

jmorse wrote:

Comment on why this is necessary desired.

https://github.com/llvm/llvm-project/pull/133485
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -2314,6 +2326,7 @@ void 
JumpThreadingPass::threadThroughTwoBasicBlocks(BasicBlock *PredPredBB,
{DominatorTree::Insert, PredPredBB, NewBB},
{DominatorTree::Delete, PredPredBB, PredBB}});
 
+  remapSourceAtoms(ValueMapping, NewBB->begin(), NewBB->end());

jmorse wrote:

I'd say "comment pls", but I suppose the fact there's "source" in the name 
gives a hint as to what's going on.

https://github.com/llvm/llvm-project/pull/133486
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,68 @@
+; RUN: opt -S -passes=jump-threading,verify %s | FileCheck %s
+
+;; Modified from llvm/test/Transforms/JumpThreading/thread-two-bbs.ll
+;;
+;; JumpThreading duplicates bb.cond2 to thread through bb.file to bb.f2 or 
exit.
+;;
+;; Check the duplicated instructions get remapped atom groups.
+
+; CHECK: bb.cond2:

jmorse wrote:

CHECK-LABEL?

https://github.com/llvm/llvm-project/pull/133486
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -2087,8 +2097,10 @@ void 
JumpThreadingPass::cloneInstructions(ValueToValueMapTy &ValueMapping,
 adaptNoAliasScopes(New, ClonedScopes, Context);
 
 CloneAndRemapDbgInfo(New, &*BI);
+if (const DebugLoc &DL = New->getDebugLoc())
+  mapAtomInstance(DL, ValueMapping);
 
-if (RetargetDbgValueIfPossible(New))
+  if (RetargetDbgValueIfPossible(New))

jmorse wrote:

Accidental whitespace

https://github.com/llvm/llvm-project/pull/133486
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM with nits

https://github.com/llvm/llvm-project/pull/133486
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133486
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133487
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,78 @@
+; RUN: opt %s --passes=jump-threading -S -o - -S | FileCheck %s
+
+;;+-> T1 -+
+;;|   v  +-> T2
+;; Entry -+   Merge -+
+;;|   ^  +-> F2
+;;+-> F1 -+
+;;
+;; Thread T1 -> T2 and F1 -> F2 through Merge.
+;;
+;;+-> T1
+;;|
+;; Entry -+
+;;|
+;;+-> F1

jmorse wrote:

Diagram should include T2 / F2, or is that deliberate?

https://github.com/llvm/llvm-project/pull/133487
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

The previous patch in this stack #133486 calls remapSourceAtoms before calling 
calling threadEdge; and this patch adds a call to remapSourceAtoms  into 
threadEdge, so it'll be called twice. Is this just inconsequential or 
deliberate, or meaningless?

https://github.com/llvm/llvm-project/pull/133487
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms duping bb with cond br on phi into pred (PR #133488)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

Whilst being some complicated threading, LGTM

https://github.com/llvm/llvm-project/pull/133488
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133489
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits


@@ -752,6 +752,14 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo 
*LI,
 }
   }
 
+  // Remap source location atom instance. Do this now, rather than
+  // when we remap instructions, because remap is called once we've
+  // cloned all blocks (all the clones would get the same atom
+  // number).

jmorse wrote:

I get why all the clones need to get distinct atom numbers; what is it here 
that ensures that's the case, don't we need to reset something in VMap to 
ensure that's the case? i.e., if we called RemapSourceAtom later what would be 
different?

(I get the overall picture, but not how it's achieved here.)

https://github.com/llvm/llvm-project/pull/133489
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Looks good, but question inline

https://github.com/llvm/llvm-project/pull/133489
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopRotate] Remap atoms of duplicated instructions (PR #133490)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM (although the hand-written tests are more fun and easier to read)

https://github.com/llvm/llvm-project/pull/133490
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopRotate] Remap atoms of duplicated instructions (PR #133490)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133490
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][LoopUnswitch] Remap cloned instructions' atoms (PR #133491)

2025-04-11 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

Seems fine, but the test is hard to decipher, future generations will praise us 
if it can be simplified.

https://github.com/llvm/llvm-project/pull/133491
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Propagate DebugLocs on phis in BreakCriticalEdges (PR #133492)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

jmorse wrote:

It can't hurt, let's making things slightly more correct!

https://github.com/llvm/llvm-project/pull/133492
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.
+  if (!Buoy ||
+  Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) {
+Buoy = &MI;
+BuoyAtom = 0;
+  }
+
+  // Call instructions are handled specially - we always mark them as key
+  // regardless of atom info.
+  const auto &TII =
+  *MI.getParent()->getParent()->getSubtarget().getInstrInfo();
+  if (MI.isCall() || TII.isTailCall(MI)) {
+assert(MI.getDebugLoc() && "Unexpectedly missing DL");
+
+// Calls are always key.
+KeyInstructions.insert(Buoy);
+
+uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+if (Group && Rank) {
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, 
Group}];
+
+  // This looks similar to the non-call handling code, except that
+  // we don't put the call into CandidateInsts so that they can't be
+  // made un-key. As a result, we also have to take special care not
+  // to erase the is_stmt from the buoy, and prevent that happening
+  // in the future.
+
+  if (CandidateRank == Rank) {
+// We've seen other instructions in this group of this rank. 
Discard
+// ones we've seen in this block, keep the others.
+assert(!CandidateInsts.empty());
+SmallVector Insts;
+Insts.reserve(CandidateInsts.size());
+for (auto &PrevInst : CandidateInsts) {
+  if (PrevInst->getParent() != MI.getParent())
+Insts.push_back(PrevInst);
+  else if (PrevInst != Buoy)
+KeyInstructions.erase(PrevInst);
+}
+
+if (Insts.empty()) {
+  CandidateInsts.clear();
+  CandidateRank = 0;
+} else {
+  CandidateInsts = std::move(Insts);
+}
+
+  } else if (CandidateRank > Rank) {
+// We've seen other instructions in this group of lower precedence
+// (higher rank). Discard them.
+for (auto *Supplanted : CandidateInsts) {
+  // Don't erase the is_stmt we're using for this call.
+  if (Supplanted != Buoy)
+KeyInstructions.erase(Supplanted);
+}
+CandidateInsts.clear();
+CandidateRank = 0;
+  }
+}
+
+// Avoid floating any future is_stmts up to the call.
+Buoy = nullptr;
+continue;
+  }
+
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+  uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+  if (!Group || !Rank)
+continue;
+
+  // Don't let is_stmts float past instructions from different source 
atoms.
+  if (BuoyAtom && BuoyAtom != Group) {
+Buoy = &MI;
+BuoyAtom = MI.getDebugLoc()->getAtomGroup();

jmorse wrote:

```suggestion
BuoyAtom = Group;
```
Avo

[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,78 @@
+# RUN: llc %s --start-after=livedebugvalues --dwarf-use-key-instructions 
--filetype=obj -o - \
+# RUN: | llvm-objdump -d - --no-show-raw-insn \
+# RUN: | FileCheck %s --check-prefix=OBJ
+
+# RUN: llc %s --start-after=livedebugvalues --dwarf-use-key-instructions 
--filetype=obj -o - \
+# RUN: | llvm-dwarfdump - --debug-line \
+# RUN: | FileCheck %s --check-prefix=DBG
+
+# OBJ:  <_Z1fPiii>:
+# OBJ-NEXT:  0:   movl$0x0, %ebx
+# OBJ-NEXT:  5:   movl$0x1, %ebx
+# OBJ-NEXT:  a:   movl$0x2, %ebx
+# OBJ-NEXT:  f:   movl$0x3, %ebx
+# OBJ-NEXT: 14:   movl$0x4, %eax
+# OBJ-NEXT: 19:   movl$0x5, %eax
+# OBJ-NEXT: 1e:   movl$0x6, %eax
+# OBJ-NEXT: 23:   movl$0x7, %eax
+# OBJ-NEXT: 28:   retq
+
+# DBG:  AddressLine   Column File   ISA Discriminator OpIndex 
Flags
+# DBG-NEXT: -- -- -- -- --- - --- 
-
+# DBG-NEXT: 0x  1  0  0   0 0   0  
is_stmt prologue_end
+# DBG-NEXT: 0x0005  2  0  0   0 0   0  
is_stmt
+# DBG-NEXT: 0x000a  2  0  0   0 0   0
+# DBG-NEXT: 0x000f  2  0  0   0 0   0
+# DBG-NEXT: 0x0014  2  0  0   0 0   0
+# DBG-NEXT: 0x0019  2  0  0   0 0   0
+# DBG-NEXT: 0x001e  2  0  0   0 0   0  
is_stmt
+# DBG-NEXT: 0x0023  2  0  0   0 0   0  
is_stmt
+# DBG-NEXT: 0x0029  2  0  0   0 0   0  
is_stmt end_sequence
+
+## Check that interleaving atoms on the same line still produces reasonable
+## is_stmt placement (the is_stmts want to "float up" to the first instruction
+## in a contiguous set with the same line, but we don't let them float past
+## other atom groups).
+
+--- |
+  target triple = "x86_64-unknown-linux-gnu"
+
+  define hidden noundef i32 @_Z1fPiii(ptr %a, i32 %b, i32 %c, i1 %cond) 
local_unnamed_addr !dbg !5 {
+  entry:
+ret i32 2
+  }
+
+  !llvm.dbg.cu = !{!0}
+  !llvm.module.flags = !{!2, !3}
+  !llvm.ident = !{!4}
+
+  !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_17, file: !1, 
producer: "clang version 19.0.0", isOptimized: true, runtimeVersion: 0, 
emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None)
+  !1 = !DIFile(filename: "test.cpp", directory: "/")
+  !2 = !{i32 7, !"Dwarf Version", i32 5}
+  !3 = !{i32 2, !"Debug Info Version", i32 3}
+  !4 = !{!"clang version 19.0.0"}
+  !5 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 1, type: 
!6, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: 
DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+  !6 = !DISubroutineType(types: !7)
+  !7 = !{}
+  !8 = !DILocalVariable(name: "x", scope: !5, file: !1, line: 1, type: !7)
+
+...
+---
+name:_Z1fPiii
+alignment:   16
+body: |
+  bb.0.entry:
+$ebx = MOV32ri 0, debug-location !DILocation(line: 1, scope: !5) 
+;; is_stmt floats up here from mov 3.
+$ebx = MOV32ri 1, debug-location !DILocation(line: 2, scope: !5, 
atomGroup: 1, atomRank: 1)
+$ebx = MOV32ri 2, debug-location !DILocation(line: 2, scope: !5, 
atomGroup: 1, atomRank: 2)
+$ebx = MOV32ri 3, debug-location !DILocation(line: 2, scope: !5, 
atomGroup: 1, atomRank: 1)
+$eax = MOV32ri 4, debug-location !DILocation(line: 2, scope: !5)
+$eax = MOV32ri 5, debug-location !DILocation(line: 2, scope: !5, 
atomGroup: 2, atomRank: 1)

jmorse wrote:

My understanding is that this instruction should get an is_stmt in its own 
entry because it's a separate span of atom group 2, is that right? But doesn't 
(address 0x19 above?) as far as I can tell.

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,117 @@
+; RUN: llc %s --filetype=obj -o - --dwarf-use-key-instructions \
+; RUN: | llvm-objdump -d - --no-show-raw-insn \
+; RUN: | FileCheck %s --check-prefix=OBJ
+
+; RUN: llc %s --filetype=obj -o - --dwarf-use-key-instructions \
+; RUN: | llvm-dwarfdump - --debug-line \
+; RUN: | FileCheck %s --check-prefix=DBG
+
+; OBJ: :
+; OBJ-NEXT:  0:   pushq   %rbp
+; OBJ-NEXT:  1:   pushq   %r14
+; OBJ-NEXT:  3:   pushq   %rbx
+; OBJ-NEXT:  4:   movq(%rip), %rax
+; OBJ-NEXT:  b:   movl(%rax), %ebp
+; OBJ-NEXT:  d:   callq   0x12 
+; OBJ-NEXT: 12:   callq   0x17 
+; OBJ-NEXT: 17:   movl%eax, %ebx
+; OBJ-NEXT: 19:   addl%ebp, %ebx
+; OBJ-NEXT: 1b:   movq(%rip), %r14
+; OBJ-NEXT: 22:   movl$0x1, (%r14)
+; OBJ-NEXT: 29:   callq   0x2e 
+; OBJ-NEXT: 2e:   movl$0x2, (%r14)
+; OBJ-NEXT: 35:   callq   0x3a 
+; OBJ-NEXT: 3a:   movl$0x3, (%r14)
+; OBJ-NEXT: 41:   callq   0x46 
+; OBJ-NEXT: 46:   movl$0x4, (%r14)
+; OBJ-NEXT: 4d:   callq   0x52 
+; OBJ-NEXT: 52:   movl%ebx, %eax
+; OBJ-NEXT: 54:   popq%rbx
+; OBJ-NEXT: 55:   popq%r14
+; OBJ-NEXT: 57:   popq%rbp
+; OBJ-NEXT: 58:   retq
+
+; DBG:  AddressLine   Column File   ISA Discriminator OpIndex 
Flags
+; DBG-NEXT: -- -- -- -- --- - --- 
-
+; DBG-NEXT: 0x  1  0  0   0 0   0  
is_stmt
+; DBG-NEXT: 0x0004  2  0  0   0 0   0  
is_stmt prologue_end
+
+;; Test A:
+;; Check the 1st call (line 3) gets is_stmt despite having no atom group.
+; DBG-NEXT: 0x000d  3  0  0   0 0   0  
is_stmt
+
+;; Test B:
+;; Check the 2nd call (line 4) gets is_stmt applied despite being part of group
+;; 1 and having lower precedence than the add. Check that the add stil gets
+;; is_stmt applied.
+;; There are two is_stmt line 4 entries are is_stmt because we don't float

jmorse wrote:

```suggestion
;; There are two is_stmt line 4 entries because we don't float
```

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Tests: I'd personally prefer the input source and explanation at the top of the 
file, although this is a style thing.

My understanding of this code is that within a basic block, it should be 
possible for there to be two sequences of instructions of equal group and rank 
that both get a buoy and is_stmt, if they're separated by some other atom 
group. Perhaps I'm wrong; but if I'm right, it probably wants explicit test 
coverage.

There's a risk that the code being generated is brittle and frequently needs 
updating; I don't know what probability of that we find acceptable, let's just 
see how often it needs updating I guess.

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

I continued my ramblings inline; I see the potential for some slightly 
unexpected stepping behaviour, but still strictly better than todays 
"everything is a breakpoint" approach. I don't think I can put my finger on a 
specific bad behaviour that might come from this, so LGTM.

(Would still be nice to have the diagram in the test).

https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) {
 auto *M2 = DILocation::getMergedLocation(A2, B);
 EXPECT_EQ(M1, M2);
   }
+
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(Group, M->getAtomGroup()); 
\
+  EXPECT_EQ(Rank, M->getAtomRank());
+#else
+#define EXPECT_ATOM(Loc, Group, Rank)  
\
+  EXPECT_EQ(0u, M->getAtomGroup());
\
+  EXPECT_EQ(0u, M->getAtomRank()); 
\
+  (void)Group; 
\
+  (void)Rank;
+#endif
+  // Identical, including source atom numbers.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+// DILocations are uniqued, so we can check equality by ptr.
+EXPECT_EQ(M, DILocation::getMergedLocation(A, B));
+  }
+
+  // Identical but different atom ranks (same atom) - choose the lowest nonzero
+  // rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but different atom ranks (different atom) - choose the lowest
+  // nonzero rank.
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 2u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Identical but equal atom rank (different atom) - choose the lowest 
non-zero
+  // group (arbitrary choice for deterministic behaviour).
+  {
+auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1);
+auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 1u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+
+A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1);
+B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1);
+M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);
+EXPECT_EQ(M, DILocation::getMergedLocation(B, A));
+  }
+
+  // Completely different except same atom numbers. Zero out the atoms.
+  {
+auto *I = DILocation::get(Context, 2, 7, N);
+auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1);
+auto *B =
+DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_EQ(0u, M->getLine());
+EXPECT_EQ(0u, M->getColumn());
+EXPECT_TRUE(isa(M->getScope()));
+EXPECT_EQ(S, M->getScope());
+EXPECT_EQ(nullptr, M->getInlinedAt());
+  }
+
+  // Same inlined-at chain but different atoms. Choose the lowest
+  // non-zero group (arbitrary choice for deterministic behaviour).
+  {
+auto *I = DILocation::get(Context, 1, 7, N);
+auto *F = getSubprogram();
+auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2);
+auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1);
+auto *M = DILocation::getMergedLocation(A, B);
+EXPECT_ATOM(M, 2u, 1u);

jmorse wrote:

No real suggestions; I think the C++17 way might be to invent 
user-defined-literals and an integer-like object, but that seems like waaa 
overkill here. Or maybe enum classes? The unit tests will also (IMHO YMMV?) be 
the only place where literals get fed into this.

An ugly alternative is to put the pattern in the name, i.e. 
"DILocation::getWithAtomRank", so that every time it appears you're reminded 
what argument order it should be.

https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM with the style nits

https://github.com/llvm/llvm-project/pull/133478
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.
+  if (!Buoy ||
+  Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) {
+Buoy = &MI;
+BuoyAtom = 0;
+  }
+
+  // Call instructions are handled specially - we always mark them as key
+  // regardless of atom info.
+  const auto &TII =
+  *MI.getParent()->getParent()->getSubtarget().getInstrInfo();
+  if (MI.isCall() || TII.isTailCall(MI)) {
+assert(MI.getDebugLoc() && "Unexpectedly missing DL");
+
+// Calls are always key.
+KeyInstructions.insert(Buoy);
+
+uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+if (Group && Rank) {
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, 
Group}];
+
+  // This looks similar to the non-call handling code, except that
+  // we don't put the call into CandidateInsts so that they can't be
+  // made un-key. As a result, we also have to take special care not
+  // to erase the is_stmt from the buoy, and prevent that happening
+  // in the future.
+
+  if (CandidateRank == Rank) {
+// We've seen other instructions in this group of this rank. 
Discard
+// ones we've seen in this block, keep the others.
+assert(!CandidateInsts.empty());
+SmallVector Insts;
+Insts.reserve(CandidateInsts.size());
+for (auto &PrevInst : CandidateInsts) {
+  if (PrevInst->getParent() != MI.getParent())
+Insts.push_back(PrevInst);
+  else if (PrevInst != Buoy)
+KeyInstructions.erase(PrevInst);
+}
+
+if (Insts.empty()) {
+  CandidateInsts.clear();
+  CandidateRank = 0;
+} else {
+  CandidateInsts = std::move(Insts);
+}
+
+  } else if (CandidateRank > Rank) {
+// We've seen other instructions in this group of lower precedence
+// (higher rank). Discard them.
+for (auto *Supplanted : CandidateInsts) {
+  // Don't erase the is_stmt we're using for this call.
+  if (Supplanted != Buoy)
+KeyInstructions.erase(Supplanted);
+}
+CandidateInsts.clear();
+CandidateRank = 0;
+  }
+}
+
+// Avoid floating any future is_stmts up to the call.
+Buoy = nullptr;
+continue;
+  }
+
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+  uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+  if (!Group || !Rank)
+continue;
+
+  // Don't let is_stmts float past instructions from different source 
atoms.
+  if (BuoyAtom && BuoyAtom != Group) {
+Buoy = &MI;
+BuoyAtom = MI.getDebugLoc()->getAtomGroup();
+  }
+
+  auto &[CandidateRank, CandidateInsts] = GroupCandidates[{Inli

[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;

jmorse wrote:

This will be big due to each dense-map element containing a SmallVector of 
pointers; IMO we need to do some profiling and set a SmallVector allocation 
size.

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.
+  if (!Buoy ||
+  Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) {
+Buoy = &MI;
+BuoyAtom = 0;
+  }
+
+  // Call instructions are handled specially - we always mark them as key
+  // regardless of atom info.
+  const auto &TII =
+  *MI.getParent()->getParent()->getSubtarget().getInstrInfo();
+  if (MI.isCall() || TII.isTailCall(MI)) {
+assert(MI.getDebugLoc() && "Unexpectedly missing DL");
+
+// Calls are always key.
+KeyInstructions.insert(Buoy);
+
+uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+if (Group && Rank) {
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, 
Group}];
+
+  // This looks similar to the non-call handling code, except that
+  // we don't put the call into CandidateInsts so that they can't be
+  // made un-key. As a result, we also have to take special care not
+  // to erase the is_stmt from the buoy, and prevent that happening
+  // in the future.
+
+  if (CandidateRank == Rank) {
+// We've seen other instructions in this group of this rank. 
Discard
+// ones we've seen in this block, keep the others.
+assert(!CandidateInsts.empty());
+SmallVector Insts;
+Insts.reserve(CandidateInsts.size());
+for (auto &PrevInst : CandidateInsts) {
+  if (PrevInst->getParent() != MI.getParent())
+Insts.push_back(PrevInst);
+  else if (PrevInst != Buoy)
+KeyInstructions.erase(PrevInst);
+}
+
+if (Insts.empty()) {
+  CandidateInsts.clear();
+  CandidateRank = 0;
+} else {
+  CandidateInsts = std::move(Insts);
+}
+
+  } else if (CandidateRank > Rank) {
+// We've seen other instructions in this group of lower precedence
+// (higher rank). Discard them.
+for (auto *Supplanted : CandidateInsts) {
+  // Don't erase the is_stmt we're using for this call.
+  if (Supplanted != Buoy)
+KeyInstructions.erase(Supplanted);
+}
+CandidateInsts.clear();
+CandidateRank = 0;
+  }
+}
+
+// Avoid floating any future is_stmts up to the call.
+Buoy = nullptr;
+continue;
+  }
+
+  auto *InlinedAt = MI.getDebugLoc()->getInlinedAt();
+  uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+  uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+  if (!Group || !Rank)
+continue;
+
+  // Don't let is_stmts float past instructions from different source 
atoms.
+  if (BuoyAtom && BuoyAtom != Group) {
+Buoy = &MI;
+BuoyAtom = MI.getDebugLoc()->getAtomGroup();
+  }
+
+  auto &[CandidateRank, CandidateInsts] = GroupCandidates[{Inli

[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2087,13 +2095,18 @@ void DwarfDebug::beginInstruction(const MachineInstr 
*MI) {
 // If we have an ongoing unspecified location, nothing to do here.
 if (!DL)
   return;
-// We have an explicit location, same as the previous location.
-// But we might be coming back to it after a line 0 record.
-if ((LastAsmLine == 0 && DL.getLine() != 0) || Flags) {
-  // Reinstate the source location but not marked as a statement.
-  RecordSourceLine(DL, Flags);
+
+// Skip this if the instruction is Key, else we might accidentally miss an
+// is_stmt.
+if (!IsKey) {

jmorse wrote:

Can we not fold this test, and the comment, into the existing test? "If we have 
an explicit non-key location". (Avoids some un-necessary indentation).

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.
+  if (!Buoy ||
+  Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) {
+Buoy = &MI;
+BuoyAtom = 0;
+  }
+
+  // Call instructions are handled specially - we always mark them as key
+  // regardless of atom info.
+  const auto &TII =
+  *MI.getParent()->getParent()->getSubtarget().getInstrInfo();
+  if (MI.isCall() || TII.isTailCall(MI)) {
+assert(MI.getDebugLoc() && "Unexpectedly missing DL");
+
+// Calls are always key.
+KeyInstructions.insert(Buoy);
+
+uint64_t Group = MI.getDebugLoc()->getAtomGroup();
+uint8_t Rank = MI.getDebugLoc()->getAtomRank();
+if (Group && Rank) {

jmorse wrote:

Early-continue instead perhaps?

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

I get the impression that `GroupCandidates` and `KeyInstructions` are being 
kept strictly in sync; thus couldn't one instead just load KeyInstructions from 
GroupCandidates once the scan is complete? This avoids filling up the dense map 
with tombstones.

Am I right in understanding that the buoy means the "least precedence" 
instruction will get the is_stmt if the highest precedence appears after it in 
the contiguous blob? (Seems fine, just making sure I understand).

On the whole, the computation function feels like it could be simpler, but in 
some intangible way I'm not immediately sure of. (Still reading the tests).

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.

jmorse wrote:

```suggestion
  // Reset the Buoy to this instruction if it has a different line number.
```

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const 
MachineFunction &MF, unsigned CUID) {
   return PrologEndLoc;
 }
 
+void DwarfDebug::findKeyInstructions(const MachineFunction *MF) {
+  // New function - reset KeyInstructions.
+  KeyInstructions.clear();
+
+  // The current candidate is_stmt instructions for each source atom.
+  // Map {(InlinedAt, Group): (Rank, Instructions)}.
+  DenseMap,
+   std::pair>>
+  GroupCandidates;
+
+  // For each instruction:
+  //   * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros.
+  //   * Check if insts in this group have been seen already in 
GroupCandidates.
+  // * If this instr rank is equal, add this instruction to 
KeyInstructions.
+  //   Remove existing instructions from KeyInstructions if they have the
+  //   same parent.
+  // * If this instr rank is higher (lower precedence), ignore it.
+  // * If this instr rank is lower (higher precedence), erase existing
+  //   instructions from KeyInstructions. Add this instr to 
KeyInstructions.
+
+  for (auto &MBB : *MF) {
+// Rather than apply is_stmt directly to Key Instructions, we "float"
+// is_stmt up to the 1st instruction with the same line number in a
+// contiguous block. That instruction is called the "buoy". The
+// buoy gets reset if we encouner an instruction with an atom
+// group.
+const MachineInstr *Buoy = nullptr;
+// The atom group number associated with Buoy which may be 0 if we haven't
+// encountered an atom group yet in this blob of instructions with the same
+// line number.
+uint64_t BuoyAtom = 0;
+
+for (auto &MI : MBB) {
+  if (MI.isMetaInstruction())
+continue;
+
+  if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine())
+continue;
+
+  // Reset the Buoy to this instruciton if it has a different line number.
+  if (!Buoy ||
+  Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) {
+Buoy = &MI;
+BuoyAtom = 0;
+  }
+
+  // Call instructions are handled specially - we always mark them as key
+  // regardless of atom info.
+  const auto &TII =
+  *MI.getParent()->getParent()->getSubtarget().getInstrInfo();
+  if (MI.isCall() || TII.isTailCall(MI)) {
+assert(MI.getDebugLoc() && "Unexpectedly missing DL");
+
+// Calls are always key.
+KeyInstructions.insert(Buoy);

jmorse wrote:

```suggestion
KeyInstructions.insert(&MI);
```
Avoids any doubt about what's being inserted.

https://github.com/llvm/llvm-project/pull/133495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -1129,13 +1130,17 @@ static void 
cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
 
 Instruction *NewBonusInst = BonusInst.clone();
 
-if (!isa(BonusInst) &&
-PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) {
-  // Unless the instruction has the same !dbg location as the original
-  // branch, drop it. When we fold the bonus instructions we want to make
-  // sure we reset their debug locations in order to avoid stepping on
-  // dead code caused by folding dead branches.
-  NewBonusInst->setDebugLoc(DebugLoc());
+if (!isa(BonusInst)) {
+  if (!NewBonusInst->getDebugLoc().isSameSourceLocation(
+  PTI->getDebugLoc())) {
+// Unless the instruction has the same !dbg location as the original
+// branch, drop it. When we fold the bonus instructions we want to make
+// sure we reset their debug locations in order to avoid stepping on
+// dead code caused by folding dead branches.
+NewBonusInst->setDebugLoc(DebugLoc());
+  } else if (const DebugLoc &DL = NewBonusInst->getDebugLoc()) {
+mapAtomInstance(DL, VMap);

jmorse wrote:

For the first part I think that answers the question (multiple instructions 
with the same source-location getting is_stmt). Would there also be potential 
for an intervening non-is_stmt instruction with the same source line between 
the two? It feels possible from instruction scheduling if not immediately in 
this pass, and if so I guess it's a general consequence of the technique.

It doesn't seem unreasonable, although perhaps we need some (debuggers) test 
coverage of how to interpret it. After all we're not trying to program the 
debugger from the compiler, we're communicating that "this source location 
really does happen to execute twice".

> I'm not sure what you mean about PTI's location becoming key? e.g. looking at 
> the test - the new branches in b and c are only "key" because they're clones 
> of the cond br from d (which is already "key").

I was wondering whether there can be a case where the instruction at PTI and 
the bonus instruction are in the same group, but the one being remapped here 
has the higher rank, and putting it in a different group causes the 
previously-lower-ranked instruction to become the highest ranked as a result. 
At face value this isn't a problem because the whole point of this function is 
we're duplicating a code path; but could there be scenarios where LLVM uses 
this utility to implement a move (i.e. clone-then-delete-old-path?)

Or to put it another way: I believe (but may be wrong) that 
`cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses` currently copies/moves 
instructions from one place to another. But, with this change, what was a plain 
copy/move now comes with changes to the stepping behaviour.

I think this boils down to saying "so what?", and the answer to that is "the 
stepping is still superior to without key instructions". I'm just trying to 
think through the consequences of how these changes compose together.

https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133482
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133493
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,52 @@
+; RUN: opt -passes='require,function(codegenprepare)' -S 
-mtriple=x86_64 < %s \
+; RUN: | FileCheck %s
+
+;; Check debug locations are propagated onto new PHIs.

jmorse wrote:

"...and that source locations have their atom groups remapped"

https://github.com/llvm/llvm-project/pull/133493
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM with nit

https://github.com/llvm/llvm-project/pull/133493
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Add MIR parser support (PR #133494)

2025-04-23 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/133494
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)

2025-05-02 Thread Jeremy Morse via llvm-branch-commits

jmorse wrote:

Github has helpfully chucked away my comments; was the unit-test-error that I 
fixed corrected? If so, LGTM.

https://github.com/llvm/llvm-project/pull/133480
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,2348 @@
+//===-- LVIRReader.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===--===//
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type  = 71
+//   DW_TAG_common_block = 26
+//   DW_TAG_file_type= 41
+//   DW_TAG_friend   = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module   = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+Element->setTag(Tag);
+addMD(DN, Element);
+
+StringRef Name = getMDName(DN);
+if (!Name.empty())
+  Element->setName(Name);
+
+// Record any file information.
+if (const DIFile *File = getMDFile(DN))
+  getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+LanguageIsFortran = true;
+break;
+  default:
+LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+dbgs() << "Filename:  '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+FileIndex = getFileIndex(CompileUnit);
+std::string Directory(File->getDirectory());
+if (Directory.empty())
+  Directory = std::string(CompileUnit->getCompilationDirectory());
+
+std::string FullName;
+raw_string_ostream Out(FullName);
+Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+CompileUnit->addFilename(transformPath(FullName));
+CompileUnitFiles.emplace(File, ++FileIndex);
+updateFileIndex(CompileUnit, FileIndex);
+  } else {
+FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+   const DIFile *File) {
+  if (Line == 0)
+return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the commmand line o

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse commented:

Some initial comments; I made it to about 1000 lines into LVIRReader.cpp.

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,300 @@
+//===-- LVIRReader.h *- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the LVIRReader class, which is used to describe a
+// LLVM IR reader.
+//
+//===--===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+  object::IRObjectFile *BitCodeIR = nullptr;
+  MemoryBufferRef *TextualIR = nullptr;
+
+  // Symbols with locations for current compile unit.
+  LVSymbols SymbolsWithLocations;
+
+  LVSectionIndex SectionIndex = 0;
+
+  const DICompileUnit *CUNode = nullptr;
+
+  // The Dwarf Version (from the module flags).
+  unsigned DwarfVersion;
+
+  // Location index for global variables.
+  uint64_t PoolAddressIndex = 0;
+
+  // Whether to emit all linkage names, or just abstract subprograms.
+  bool UseAllLinkageNames = true;
+
+  // Dependencies on external options (llc, etc).
+  bool includeMinimalInlineScopes() const;
+  bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+  bool LanguageIsFortran = false;
+  void mapFortranLanguage(unsigned DWLang);
+  bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+  // Generate logical debug line before prologue.
+  bool GenerateLineBeforePrologue = true;
+
+  // We assume a constante increase between instructions.
+  const unsigned OffsetIncrease = 4;
+  void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+  // An anonymous type for index type.
+  LVType *NodeIndexType = nullptr;
+
+  std::unique_ptr DbgValueRanges;
+
+  // Record the last assigned file index for each compile unit.
+  using LVIndexFiles = std::map;
+  LVIndexFiles IndexFiles;

jmorse wrote:

It wasn't immediately obvious to me that this is a per-CU counter of the /next/ 
FileID to assign to a new file that gets discovered. I think it's worth 
recording in the comment here that this data-structure is to aid mapping 
DIFiles onto a DWARF-like file table, to explain what's going on.

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -728,14 +514,16 @@ void LVDWARFReader::createLineAndFileRecords(
 for (const DWARFDebugLine::FileNameEntry &Entry :
  Lines->Prologue.FileNames) {
   std::string Directory;
-  if (Lines->getDirectoryForEntry(Entry, Directory))
-Directory = transformPath(Directory);
+  Lines->getDirectoryForEntry(Entry, Directory);
   if (Directory.empty())
 Directory = std::string(CompileUnit->getCompilationDirectory());
-  std::string File = transformPath(dwarf::toStringRef(Entry.Name));
+  // Take just the filename component, as it may be contain the full
+  // path that would be added to the already existing path in Directory.

jmorse wrote:

This is a functional change to the DWARF reader, is it related to adding the 
IRReader, or can we defer it to some other patch instead?

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,2348 @@
+//===-- LVIRReader.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===--===//
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type  = 71
+//   DW_TAG_common_block = 26
+//   DW_TAG_file_type= 41
+//   DW_TAG_friend   = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module   = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+Element->setTag(Tag);
+addMD(DN, Element);
+
+StringRef Name = getMDName(DN);
+if (!Name.empty())
+  Element->setName(Name);
+
+// Record any file information.
+if (const DIFile *File = getMDFile(DN))
+  getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+LanguageIsFortran = true;
+break;
+  default:
+LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+dbgs() << "Filename:  '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+FileIndex = getFileIndex(CompileUnit);
+std::string Directory(File->getDirectory());
+if (Directory.empty())
+  Directory = std::string(CompileUnit->getCompilationDirectory());
+
+std::string FullName;
+raw_string_ostream Out(FullName);
+Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+CompileUnit->addFilename(transformPath(FullName));
+CompileUnitFiles.emplace(File, ++FileIndex);
+updateFileIndex(CompileUnit, FileIndex);
+  } else {
+FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+   const DIFile *File) {
+  if (Line == 0)
+return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the commmand line o

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,300 @@
+//===-- LVIRReader.h *- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the LVIRReader class, which is used to describe a
+// LLVM IR reader.
+//
+//===--===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+  object::IRObjectFile *BitCodeIR = nullptr;
+  MemoryBufferRef *TextualIR = nullptr;
+
+  // Symbols with locations for current compile unit.
+  LVSymbols SymbolsWithLocations;
+
+  LVSectionIndex SectionIndex = 0;
+
+  const DICompileUnit *CUNode = nullptr;
+
+  // The Dwarf Version (from the module flags).
+  unsigned DwarfVersion;
+
+  // Location index for global variables.
+  uint64_t PoolAddressIndex = 0;
+
+  // Whether to emit all linkage names, or just abstract subprograms.
+  bool UseAllLinkageNames = true;
+
+  // Dependencies on external options (llc, etc).

jmorse wrote:

It's not clear to me what this means (more elaborated comment please).

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,2348 @@
+//===-- LVIRReader.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===--===//
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type  = 71
+//   DW_TAG_common_block = 26
+//   DW_TAG_file_type= 41
+//   DW_TAG_friend   = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module   = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+Element->setTag(Tag);
+addMD(DN, Element);
+
+StringRef Name = getMDName(DN);
+if (!Name.empty())
+  Element->setName(Name);
+
+// Record any file information.
+if (const DIFile *File = getMDFile(DN))
+  getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+LanguageIsFortran = true;
+break;
+  default:
+LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+dbgs() << "Filename:  '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+FileIndex = getFileIndex(CompileUnit);
+std::string Directory(File->getDirectory());
+if (Directory.empty())
+  Directory = std::string(CompileUnit->getCompilationDirectory());
+
+std::string FullName;
+raw_string_ostream Out(FullName);
+Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+CompileUnit->addFilename(transformPath(FullName));
+CompileUnitFiles.emplace(File, ++FileIndex);
+updateFileIndex(CompileUnit, FileIndex);
+  } else {
+FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+   const DIFile *File) {
+  if (Line == 0)
+return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the commmand line o

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -2124,6 +2125,138 @@ layout and given the number of matches.
   -
   Total   71  8
 
+IR (Textual representation and bitcode) SUPPORT
+~~~
+The below example is used to show the IR output generated by
+:program:`llvm-debuginfo-analyzer`. We compiled the example for a
+IR 64-bit target with Clang (-O0 -g --target=x86_64-linux):
+
+.. code-block:: c++
+
+  1  using INTPTR = const int *;
+  2  int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) {
+  3if (ParamBool) {
+  4  typedef int INTEGER;
+  5  const INTEGER CONSTANT = 7;
+  6  return CONSTANT;
+  7}
+  8return ParamUnsigned;
+  9  }
+
+PRINT BASIC DETAILS
+^^^
+The following command prints basic details for all the logical elements
+sorted by the debug information internal offset; it includes its lexical
+level and debug info format.
+
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+  --output-sort=offset
+  --print=scopes,symbols,types,lines,instructions
+  test-clang.ll
+
+or
+
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+  --output-sort=offset
+  --print=elements
+  test-clang.ll
+
+Each row represents an element that is present within the debug
+information. The first column represents the scope level, followed by
+the associated line number (if any), and finally the description of
+the element.
+
+.. code-block:: none
+
+  Logical View:
+  [000]   {File} 'test-clang.ll' -> Textual IR
+
+  [001] {CompileUnit} 'test.cpp'
+  [002] 2 {Function} extern not_inlined 'foo' -> 'int'
+  [003] {Block}
+  [004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
+  [004] 5 {Line}
+  [004]   {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg 
!32'
+  [004] 6 {Line}
+  [004]   {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+  [004] 6 {Line}
+  [004]   {Code} 'br label %return, !dbg !33'
+  [003] 2   {Parameter} 'ParamPtr' -> 'INTPTR'
+  [003] 2   {Parameter} 'ParamUnsigned' -> 'unsigned int'
+  [003] 2   {Parameter} 'ParamBool' -> 'bool'
+  [003] 4   {TypeAlias} 'INTEGER' -> 'int'
+  [003] 2   {Line}
+  [003] {Code} '%retval = alloca i32, align 4'
+  [003] {Code} '%ParamPtr.addr = alloca ptr, align 8'
+  [003] {Code} '%ParamUnsigned.addr = alloca i32, align 4'
+  [003] {Code} '%ParamBool.addr = alloca i8, align 1'
+  [003] {Code} '%CONSTANT = alloca i32, align 4'
+  [003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 
8'
+  [003] {Code} 'store i32 %ParamUnsigned, ptr 
%ParamUnsigned.addr, align 4'
+  [003] {Code} '%storedv = zext i1 %ParamBool to i8'
+  [003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 
1'
+  [003] 8   {Line}
+  [003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 
4, !dbg !34'
+  [003] 8   {Line}
+  [003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
+  [003] 8   {Line}
+  [003] {Code} 'br label %return, !dbg !35'
+  [003] 9   {Line}
+  [003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
+  [003] 9   {Line}
+  [003] {Code} 'ret i32 %2, !dbg !36'
+  [003] 3   {Line}
+  [003] 3   {Line}
+  [003] 3   {Line}
+  [003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, 
!dbg !26'
+  [002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+
+SELECT LOGICAL ELEMENTS
+^^^
+The following prints all *instructions*, *symbols* and *types* that
+contain **'block'** or **'.store'** in their names or types, using a tab
+layout and given the number of matches.
+
+.. code-block:: none

jmorse wrote:

I wonder if there's a more illustrative example to motivate the reader -- is it 
possible to search for  "INTPTR" and discover both the type alias and the 
parameter? That's the sort of query I think I'd end up making: "I can see this 
type in the output format, but why is it present? -> Ah, it's a parameter to a 
function".

(The current example is fine, I'm just trying to imagine a better one).

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,300 @@
+//===-- LVIRReader.h *- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file defines the LVIRReader class, which is used to describe a
+// LLVM IR reader.
+//
+//===--===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+  object::IRObjectFile *BitCodeIR = nullptr;
+  MemoryBufferRef *TextualIR = nullptr;
+
+  // Symbols with locations for current compile unit.
+  LVSymbols SymbolsWithLocations;
+
+  LVSectionIndex SectionIndex = 0;
+
+  const DICompileUnit *CUNode = nullptr;
+
+  // The Dwarf Version (from the module flags).
+  unsigned DwarfVersion;
+
+  // Location index for global variables.
+  uint64_t PoolAddressIndex = 0;
+
+  // Whether to emit all linkage names, or just abstract subprograms.
+  bool UseAllLinkageNames = true;
+
+  // Dependencies on external options (llc, etc).
+  bool includeMinimalInlineScopes() const;
+  bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+  bool LanguageIsFortran = false;
+  void mapFortranLanguage(unsigned DWLang);
+  bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+  // Generate logical debug line before prologue.
+  bool GenerateLineBeforePrologue = true;
+
+  // We assume a constante increase between instructions.
+  const unsigned OffsetIncrease = 4;
+  void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+  // An anonymous type for index type.
+  LVType *NodeIndexType = nullptr;
+
+  std::unique_ptr DbgValueRanges;
+
+  // Record the last assigned file index for each compile unit.
+  using LVIndexFiles = std::map;
+  LVIndexFiles IndexFiles;
+
+  void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) {
+LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+if (Iter == IndexFiles.end())
+  IndexFiles.emplace(CompileUnit, FileIndex);
+else
+  Iter->second = FileIndex;
+  }
+
+  // Get the current assigned index file for the given compile unit.
+  size_t getFileIndex(LVScopeCompileUnit *CompileUnit) {
+size_t FileIndex = 0;
+LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+if (Iter != IndexFiles.end())
+  FileIndex = Iter->second;
+return FileIndex;
+  }
+
+  // Collect the compile unit metadata files.
+  using LVCompileUnitFiles = std::map;
+  LVCompileUnitFiles CompileUnitFiles;
+
+  size_t getOrCreateSourceID(const DIFile *File);
+
+  // Associate the logical elements to metadata objects.
+  using LVMDObjects = std::map;
+  LVMDObjects MDObjects;
+
+  void addMD(const MDNode *MD, LVElement *Element) {
+if (MDObjects.find(MD) == MDObjects.end())
+  MDObjects.emplace(MD, Element);
+  }
+  LVElement *getElementForSeenMD(const MDNode *MD) const {
+LVMDObjects::const_iterator Iter = MDObjects.find(MD);
+return Iter != MDObjects.end() ? Iter->second : nullptr;
+  }
+  LVScope *getScopeForSeenMD(const MDNode *MD) const {
+return static_cast(getElementForSeenMD(MD));
+  }
+  LVSymbol *getSymbolForSeenMD(const MDNode *MD) const {
+return static_cast(getElementForSeenMD(MD));
+  }
+  LVType *getTypeForSeenMD(const MDNode *MD) const {
+return static_cast(getElementForSeenMD(MD));
+  }
+  LVType *getLineForSeenMD(const MDNode *MD) const {
+return static_cast(getElementForSeenMD(MD));
+  }
+
+  // Inlined concrete scopes with its associated inlined abstract scopes.

jmorse wrote:

```suggestion
  // Inlined concrete scopes mapped to the associated inlined abstract scopes.
```

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -0,0 +1,2348 @@
+//===-- LVIRReader.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This implements the LVIRReader class.
+// It supports LLVM text IR and bitcode format.
+//
+//===--===//
+
+#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h"
+#include "llvm/CodeGen/DebugHandlerBase.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVLine.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVScope.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h"
+#include "llvm/DebugInfo/LogicalView/Core/LVType.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Object/Error.h"
+#include "llvm/Object/IRObjectFile.h"
+#include "llvm/Support/FormatAdapters.h"
+#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/SourceMgr.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::logicalview;
+
+#define DEBUG_TYPE "IRReader"
+
+// Extra debug traces. Default is false
+#define DEBUG_ALL
+
+// These flavours of DINodes are not handled:
+//   DW_TAG_APPLE_property   = 19896
+//   DW_TAG_atomic_type  = 71
+//   DW_TAG_common_block = 26
+//   DW_TAG_file_type= 41
+//   DW_TAG_friend   = 42
+//   DW_TAG_generic_subrange = 69
+//   DW_TAG_immutable_type   = 75
+//   DW_TAG_module   = 30
+
+// Create a logical element and setup the following information:
+// - Name, DWARF tag, line
+// - Collect any file information
+LVElement *LVIRReader::constructElement(const DINode *DN) {
+  dwarf::Tag Tag = DN->getTag();
+  LVElement *Element = createElement(Tag);
+  if (Element) {
+Element->setTag(Tag);
+addMD(DN, Element);
+
+StringRef Name = getMDName(DN);
+if (!Name.empty())
+  Element->setName(Name);
+
+// Record any file information.
+if (const DIFile *File = getMDFile(DN))
+  getOrCreateSourceID(File);
+  }
+
+  return Element;
+}
+
+void LVIRReader::mapFortranLanguage(unsigned DWLang) {
+  switch (DWLang) {
+  case dwarf::DW_LANG_Fortran77:
+  case dwarf::DW_LANG_Fortran90:
+  case dwarf::DW_LANG_Fortran95:
+  case dwarf::DW_LANG_Fortran03:
+  case dwarf::DW_LANG_Fortran08:
+  case dwarf::DW_LANG_Fortran18:
+LanguageIsFortran = true;
+break;
+  default:
+LanguageIsFortran = false;
+  }
+}
+
+// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only
+// difference is setting the 'DICompileUnit::splitDebugFilename' to the
+// name of the split filename: "xxx.dwo".
+bool LVIRReader::includeMinimalInlineScopes() const {
+  return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly;
+}
+
+// For the given 'DIFile' generate an index 1-based to indicate the
+// source file where the logical element is declared.
+// In DWARF v4, the files are 1-indexed.
+// In DWARF v5, the files are 0-indexed.
+// The IR reader expects the indexes as 1-indexed.
+// Each compile unit, keeps track of the last assigned index.
+size_t LVIRReader::getOrCreateSourceID(const DIFile *File) {
+  if (!File)
+return 0;
+
+#ifdef DEBUG_ALL
+  LLVM_DEBUG({
+dbgs() << "\n[getOrCreateSourceID] DIFile\n";
+File->dump();
+  });
+#endif
+
+  addMD(File, CompileUnit);
+
+  LLVM_DEBUG({
+dbgs() << "Directory: '" << File->getDirectory() << "'\n";
+dbgs() << "Filename:  '" << File->getFilename() << "'\n";
+  });
+  size_t FileIndex = 0;
+  LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File);
+  if (Iter == CompileUnitFiles.cend()) {
+FileIndex = getFileIndex(CompileUnit);
+std::string Directory(File->getDirectory());
+if (Directory.empty())
+  Directory = std::string(CompileUnit->getCompilationDirectory());
+
+std::string FullName;
+raw_string_ostream Out(FullName);
+Out << Directory << "/" << llvm::sys::path::filename(File->getFilename());
+CompileUnit->addFilename(transformPath(FullName));
+CompileUnitFiles.emplace(File, ++FileIndex);
+updateFileIndex(CompileUnit, FileIndex);
+  } else {
+FileIndex = Iter->second;
+  }
+
+  LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; });
+  return FileIndex;
+}
+
+void LVIRReader::addSourceLine(LVElement *Element, unsigned Line,
+   const DIFile *File) {
+  if (Line == 0)
+return;
+
+  // After the scopes are created, the generic reader traverses the 'Children'
+  // and perform additional setting tasks (resolve types names, references,
+  // etc.). One of those tasks is select the correct string pool index based on
+  // the commmand line o

[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)

2025-05-01 Thread Jeremy Morse via llvm-branch-commits


@@ -56,6 +61,17 @@ Error LVReaderHandler::createReader(StringRef Filename, 
LVReaders &Readers,
   return std::make_unique(Filename, FileFormatName, Pdb,
 W, ExePath);
 }
+if (isa(Input)) {
+  IRObjectFile *Ir = cast(Input);
+  return std::make_unique(Filename, FileFormatName, Ir, W);
+}
+if (isa(Input)) {
+  // If the filename extension is '.ll' create a IR reader.

jmorse wrote:

```suggestion
  // If the filename extension is '.ll' create an IR reader.
```

https://github.com/llvm/llvm-project/pull/135440
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


  1   2   >