[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)
jmorse wrote: > @jmorse What do you think about merging this PR to the release branch? I feel this is safe to go into the release branch; I don't think it should block the 19.0.0 release though, better to have it in a point release. This fault has been present for roughly five months and has only just been uncovered, so IMHO it's not a pressing issue. https://github.com/llvm/llvm-project/pull/106691 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)
https://github.com/jmorse approved this pull request. As with the previous rev, LGTM as a fix for an edge case. This situation has taken many months to expose, so I don't think it urgently needs to go into the imminent release, a point release should be fine. https://github.com/llvm/llvm-project/pull/106952 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 6b2e4c5 - [DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs
Author: Jeremy Morse Date: 2021-07-29T15:08:13+01:00 New Revision: 6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c URL: https://github.com/llvm/llvm-project/commit/6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c DIFF: https://github.com/llvm/llvm-project/commit/6b2e4c5a58d7f75f52b740bd7d96dc3a00b7ef6c.diff LOG: [DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs When we have a terminator sequence (i.e. a tailcall or return), MIIsInTerminatorSequence is used to work out where the preceding ABI-setup instructions end, i.e. the parts that were glued to the terminator instruction. This allows LLVM to split blocks safely without having to worry about ABI stuff. The function only ignores DBG_VALUE instructions, meaning that the two debug instructions I recently added can end terminator sequences early, causing various MachineVerifier errors. This patch promotes the test for debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any debug-info interfering with this function. Differential Revision: https://reviews.llvm.org/D106660 (cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4) Added: llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll Modified: llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Removed: diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp index 1415cce3b1dff..09627ee6a1645 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp @@ -1660,7 +1660,7 @@ static bool MIIsInTerminatorSequence(const MachineInstr &MI) { // physical registers if there is debug info associated with the terminator // of our mbb. We want to include said debug info in our terminator // sequence, so we return true in that case. -return MI.isDebugValue(); +return MI.isDebugInstr(); // We have left the terminator sequence if we are not doing one of the // following: diff --git a/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll b/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll new file mode 100644 index 0..f86e35061dc22 --- /dev/null +++ b/llvm/test/DebugInfo/ARM/instr-ref-tcreturn.ll @@ -0,0 +1,66 @@ +; RUN: llc %s -o - -stop-after=finalize-isel -verify-machineinstrs -experimental-debug-variable-locations | FileCheck %s + +; In the sequence below, the sdiv is converted to a function call to __divsi3, +; which is then tail call optimised. The dbg.value is suddenly stuck between +; terminators, and the corresponding DBG_INSTR_REF is forced-placed to be +; immediately before the TCRETURN. +; However, with the function having the sspstrong attribute, we then try to +; peel apart the terminator sequence, DBG_INSTR_REF is interpreted as being +; a "real" instruction, and the stack check is inserted at that point rather +; than before the copies-to-physreg setting up the call. This breaks the +; code, and MachineVerifier complains. +; +; Check that the tail sequence is stack-protected, and split at the correct +; position, ignoring the DBG_INSTR_REF + +target datalayout = "e-m:o-p:32:32-Fi8-f64:32:64-v64:32:64-v128:32:128-a:0:32-n32-S32" +target triple = "thumbv7-apple-ios7.0.0" + +; CHECK-LABEL: bb.0.entry: +; CHECK:LOAD_STACK_GUARD + +; CHECK-LABEL: bb.2.entry: +; CHECK:tBL {{.*}} &__stack_chk_fail, + +; CHECK-LABEL: bb.1.entry: +; CHECK: $r0 = COPY %0 +; CHECK-NEXT: $r1 = COPY %1 +; CHECK-NEXT: DBG_INSTR_REF 1, 0 +; CHECK-NEXT: TCRETURNdi &__divsi3, 0, implicit $sp, implicit $r0, implicit $r1 + +declare i1 @ext() + +define i32 @test(i32 %a1, i32 %a2) #1 !dbg !5 { +entry: + %foo = alloca i32, i32 %a1 + %bool = call i1 @ext() + %res = sdiv i32 %a1, %a2 + call void @llvm.dbg.value(metadata i32 %a1, metadata !13, metadata !DIExpression()), !dbg !16 + ret i32 %res +} + +attributes #1 = {sspstrong} + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!3, !4} + +!0 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !1, producer: "Swift", isOptimized: true, runtimeVersion: 5, emissionKind: FullDebug) +!1 = !DIFile(filename: "foo.swift", directory: "/tmp") +!2 = !{} +!3 = !{i32 2, !"Debug Info Version", i32 3} +!4 = !{i32 1, !"Swift Minor Version", i8 3} +!5 = distinct !DISubprogram(name: "n0", linkageName: "n1", scope: !7, file: !6, line: 86, type: !8, scopeLine: 86, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0) +!6 = !DIFile(filename: "bar.swift", directory: "") +!7 = !DIModule(scope: null, name: "Swift") +!8 = !DISubroutineType(types: !9) +!9 = !{!10, !12} +!10 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Int", scope: !7, file: !11, size: 32, elements: !2, runtimeLang: DW_LANG_Swift, identifier: "$i1") +!11 = !DIFile(f
[llvm-branch-commits] [lld] eff6e75 - [LLD][ELF] Correct test temporary file paths
Author: Jeremy Morse Date: 2021-01-20T11:46:11Z New Revision: eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df URL: https://github.com/llvm/llvm-project/commit/eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df DIFF: https://github.com/llvm/llvm-project/commit/eff6e75c3f7c5471f0326526dc3f0b8b10f8a4df.diff LOG: [LLD][ELF] Correct test temporary file paths In 8031785f4a7ebd the temporary object being built was moved to %t/main.o, but not all run lines were updated to reflect this. Observe the failure on this buildbot: http://lab.llvm.org:8011/#/builders/5/builds/3646/steps/9/logs/stdio It might pass locally for some people due to a stale %t.o hanging around the build directory. Added: Modified: lld/test/ELF/wrap-shlib-undefined.s Removed: diff --git a/lld/test/ELF/wrap-shlib-undefined.s b/lld/test/ELF/wrap-shlib-undefined.s index f46ebe36b779..0c4a79171e27 100644 --- a/lld/test/ELF/wrap-shlib-undefined.s +++ b/lld/test/ELF/wrap-shlib-undefined.s @@ -31,7 +31,7 @@ # CHECK2-NEXT: NOTYPE GLOBAL DEFAULT6 foo ## __wrap_bar is undefined. -# RUN: ld.lld -shared %t.o --wrap=bar -o %t3.so +# RUN: ld.lld -shared %t/main.o --wrap=bar -o %t3.so # RUN: llvm-readelf -r --dyn-syms %t3.so | FileCheck %s --check-prefix=CHECK3 # CHECK3: R_X86_64_JUMP_SLOT __wrap_bar + 0 # CHECK3: Symbol table '.dynsym' contains 4 entries: @@ -41,7 +41,7 @@ # CHECK3-NEXT: NOTYPE GLOBAL DEFAULT6 foo ## __wrap_bar is defined in %t/wrap.so. -# RUN: ld.lld -shared %t.o %t/wrap.so --wrap=bar -o %t4.so +# RUN: ld.lld -shared %t/main.o %t/wrap.so --wrap=bar -o %t4.so # RUN: llvm-readelf -r --dyn-syms %t4.so | FileCheck %s --check-prefix=CHECK4 # CHECK4: R_X86_64_JUMP_SLOT {{.*}} __wrap_bar + 0 # CHECK4: Symbol table '.dynsym' contains 4 entries: @@ -50,7 +50,7 @@ # CHECK4-NEXT: NOTYPE GLOBAL DEFAULT6 _start # CHECK4-NEXT: NOTYPE GLOBAL DEFAULT6 foo -# RUN: ld.lld %t.o %t/wrap.so --wrap bar -o %t1 +# RUN: ld.lld %t/main.o %t/wrap.so --wrap bar -o %t1 # RUN: llvm-readelf --dyn-syms %t1 | FileCheck %s --check-prefix=DYNSYM # RUN: llvm-objdump -d %t1 | FileCheck %s --check-prefix=ASM ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 914066f - [DebugInfo] Avoid LSR crash on large integer inputs
Author: Jeremy Morse Date: 2021-01-05T10:25:37Z New Revision: 914066fe38a93c004b742a696ec337701eb738ec URL: https://github.com/llvm/llvm-project/commit/914066fe38a93c004b742a696ec337701eb738ec DIFF: https://github.com/llvm/llvm-project/commit/914066fe38a93c004b742a696ec337701eb738ec.diff LOG: [DebugInfo] Avoid LSR crash on large integer inputs Loop strength reduction tries to recover debug variable values by looking for simple offsets from PHI values. In really extreme conditions there may be an offset used that won't fit in an int64_t, hitting an APInt assertion. This patch adds a regression test and adjusts the equivalent value collecting code to filter out any values where the offset can't be represented by an int64_t. This means that for very large integers with very large offsets, the variable location will become undef, which is the same behaviour as before 2a6782bb9f1 / D87494. Differential Revision: https://reviews.llvm.org/D94016 Added: llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll Modified: llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp Removed: diff --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp index 7742ebafcac4..6e64a1ddf787 100644 --- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp +++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp @@ -5796,8 +5796,9 @@ static void DbgGatherEqualValues(Loop *L, ScalarEvolution &SE, if (!SE.isSCEVable(Phi.getType())) continue; auto PhiSCEV = SE.getSCEV(&Phi); -if (Optional Offset = -SE.computeConstantDifference(DbgValueSCEV, PhiSCEV)) +Optional Offset = +SE.computeConstantDifference(DbgValueSCEV, PhiSCEV); +if (Offset && Offset->getMinSignedBits() <= 64) EqSet.emplace_back(std::make_tuple( &Phi, Offset.getValue().getSExtValue(), DVI->getExpression())); } diff --git a/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll b/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll new file mode 100644 index ..d651ff239a46 --- /dev/null +++ b/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-1.ll @@ -0,0 +1,73 @@ +; RUN: opt < %s -loop-reduce -S | FileCheck %s +; +; Test that LSR avoids crashing on very large integer inputs. It should +; discard the variable location by creating an undef dbg.value. +; +; CHECK: call void @llvm.dbg.value(metadata i128 undef, + +source_filename = "" +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +define dso_local void @qux() local_unnamed_addr !dbg !10 { +entry: + br label %for.body, !dbg !17 + +for.body: ; preds = %for.body.for.body_crit_edge, %entry + %0 = phi i128 [ 0, %entry ], [ %.pre8, %for.body.for.body_crit_edge ], !dbg !19 + %add6.i.i = add i128 %0, 18446744073709551615, !dbg !35 + call void @llvm.dbg.value(metadata i128 %add6.i.i, metadata !25, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 128)), !dbg !36 + br label %for.body.for.body_crit_edge + +for.body.for.body_crit_edge: ; preds = %for.body + %.pre8 = load i128, i128* undef, align 16, !dbg !19, !tbaa !37 + br label %for.body, !dbg !17 +} + +; Function Attrs: nofree nosync nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !3, globals: !2, splitDebugInlining: false, nameTableKind: None) +!1 = !DIFile(filename: "/tmp/beans.c", directory: "/tmp") +!2 = !{} +!3 = !{!4} +!4 = !DIDerivedType(tag: DW_TAG_typedef, name: "uint64_t", file: !5, line: 1, baseType: !6) +!5 = !DIFile(filename: "/tmp/beans.c", directory: "") +!6 = !DIBasicType(name: "long unsigned int", size: 64, encoding: DW_ATE_unsigned) +!7 = !{i32 7, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = distinct !DISubprogram(name: "qux", scope: !5, file: !5, line: 27, type: !11, scopeLine: 27, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2) +!11 = !DISubroutineType(types: !12) +!12 = !{null, !13, !13} +!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64) +!14 = !DIDerivedType(tag: DW_TAG_typedef, name: "croix", file: !5, line: 2, baseType: !15) +!15 = !DIDerivedType(tag: DW_TAG_typedef, name: "__uint128_t", file: !1, baseType: !16) +!16 = !DIBasicType(name: "unsigned __int128", size: 128, encoding: DW_ATE_unsigned) +!17 = !DILocation(line: 30, column: 3, scope: !18) +
[llvm-branch-commits] [llvm] df2b9a3 - [DebugInfo] Avoid re-ordering assignments in LCSSA
Author: Nabeel Omer Date: 2020-12-17T16:17:32Z New Revision: df2b9a3e02ca3bd7b60af6c65571909a7d3ab317 URL: https://github.com/llvm/llvm-project/commit/df2b9a3e02ca3bd7b60af6c65571909a7d3ab317 DIFF: https://github.com/llvm/llvm-project/commit/df2b9a3e02ca3bd7b60af6c65571909a7d3ab317.diff LOG: [DebugInfo] Avoid re-ordering assignments in LCSSA The LCSSA pass makes use of a function insertDebugValuesForPHIs() to propogate dbg.value() intrinsics to newly inserted PHI instructions. Faulty behaviour occurs when the parent PHI of a newly inserted PHI is not the most recent assignment to a source variable. insertDebugValuesForPHIs ends up propagating a value that isn't the most recent assignemnt. This change removes the call to insertDebugValuesForPHIs() from LCSSA, preventing incorrect dbg.value intrinsics from being propagated. Propagating variable locations between blocks will occur later, during LiveDebugValues. Differential Revision: https://reviews.llvm.org/D92576 Added: llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll Modified: llvm/lib/Transforms/Utils/LCSSA.cpp llvm/test/Transforms/LCSSA/basictest.ll llvm/test/Transforms/LoopIdiom/X86/left-shift-until-bittest.ll Removed: diff --git a/llvm/lib/Transforms/Utils/LCSSA.cpp b/llvm/lib/Transforms/Utils/LCSSA.cpp index 1bcf40e11d10..a601ec9349e0 100644 --- a/llvm/lib/Transforms/Utils/LCSSA.cpp +++ b/llvm/lib/Transforms/Utils/LCSSA.cpp @@ -265,15 +265,11 @@ bool llvm::formLCSSAForInstructions(SmallVectorImpl &Worklist, Worklist.push_back(PostProcessPN); // Keep track of PHI nodes that we want to remove because they did not have -// any uses rewritten. If the new PHI is used, store it so that we can -// try to propagate dbg.value intrinsics to it. -SmallVector NeedDbgValues; +// any uses rewritten. for (PHINode *PN : AddedPHIs) if (PN->use_empty()) LocalPHIsToRemove.insert(PN); - else -NeedDbgValues.push_back(PN); -insertDebugValuesForPHIs(InstBB, NeedDbgValues); + Changed = true; } diff --git a/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll b/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll new file mode 100644 index ..b140bc96f5d0 --- /dev/null +++ b/llvm/test/Transforms/LCSSA/DontInsertDebugValuesForPHIs.ll @@ -0,0 +1,57 @@ +; RUN: opt < %s -lcssa -S | FileCheck %s + +; This test ensures that LCSSA does not insert dbg.value intrinsics using +; insertDebugValuesForPHIs() which effectively cause assignments to be +; re-ordered. +; See PR48206 for more information. + +define dso_local i32 @_Z5lcssab(i1 zeroext %S2) { +entry: + br label %loop.interior + +loop.interior:; preds = %post.if, %entry + br i1 %S2, label %if.true, label %if.false + +if.true: ; preds = %loop.interior + %X1 = add i32 0, 0 + br label %post.if + +if.false: ; preds = %loop.interior + %X2 = add i32 0, 1 + br label %post.if + +post.if: ; preds = %if.false, %if.true + %X3 = phi i32 [ %X1, %if.true ], [ %X2, %if.false ], !dbg !21 + call void @llvm.dbg.value(metadata i32 %X3, metadata !9, metadata !DIExpression()), !dbg !21 + %Y1 = add i32 4, %X3, !dbg !22 + call void @llvm.dbg.value(metadata i32 %Y1, metadata !9, metadata !DIExpression()), !dbg !22 + br i1 %S2, label %loop.exit, label %loop.interior, !dbg !23 + +loop.exit:; preds = %post.if +; CHECK: loop.exit: +; CHECK-NEXT: %X3.lcssa = phi i32 +; CHECK-NOT: call void @llvm.dbg.value + %X4 = add i32 3, %X3 + ret i32 %X4 +} + +declare void @llvm.dbg.value(metadata, metadata, metadata) + +!llvm.dbg.cu = !{!0} +!llvm.debugify = !{!3, !4} +!llvm.module.flags = !{!5} + +!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify and Author", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2) +!1 = !DIFile(filename: "./testcase.ll", directory: "/") +!2 = !{} +!3 = !{i32 11} +!4 = !{i32 5} +!5 = !{i32 2, !"Debug Info Version", i32 3} +!6 = distinct !DISubprogram(name: "_Z5lcssab", linkageName: "_Z5lcssab", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !{!9}) +!7 = !DISubroutineType(types: !2) +!9 = !DILocalVariable(name: "var", scope: !6, file: !1, line: 3, type: !10) +!10 = !DIBasicType(name: "ty32", size: 32, encoding: DW_ATE_unsigned) +!21 = !DILocation(line: 7, column: 1, scope: !6) +!22 = !DILocation(line: 8, column: 1, scope: !6) +!23 = !DILocation(line: 9, column: 1, scope: !6) + diff --git a/llvm/test/Transforms/LCSSA/basictest.ll b/llvm/test/Transforms/LCSSA/basictest.ll index 7ca552039b63..30d5e762eedf 100644 --- a/llvm/test/Transforms/LCSSA/basictest.ll +++ b/
[llvm-branch-commits] [llvm] cda32ab - [DebugInfo][Docs] Document MIR language debug-info constructs
Author: Jeremy Morse Date: 2020-12-08T11:01:55Z New Revision: cda32aba4f46033eea4eea6eda703459dc3c7a1e URL: https://github.com/llvm/llvm-project/commit/cda32aba4f46033eea4eea6eda703459dc3c7a1e DIFF: https://github.com/llvm/llvm-project/commit/cda32aba4f46033eea4eea6eda703459dc3c7a1e.diff LOG: [DebugInfo][Docs] Document MIR language debug-info constructs This patch documents the MIR syntax for a number of things relevant to debugging information: * Trailing 'debug-location' metadata that becomes a DebugLoc, * Variable location metadata for stack slots, * Syntax for DBG_VALUE metainstructions, * Syntax for DBG_INSTR_REF, including trailing instruction numbers attached to MIR instructions. Differential Revision: https://reviews.llvm.org/D89337 Added: Modified: llvm/docs/MIRLangRef.rst Removed: diff --git a/llvm/docs/MIRLangRef.rst b/llvm/docs/MIRLangRef.rst index 0380bcced5fc..ff8029c5c646 100644 --- a/llvm/docs/MIRLangRef.rst +++ b/llvm/docs/MIRLangRef.rst @@ -790,3 +790,105 @@ For an int eq predicate ``ICMP_EQ``, the syntax is: instructions debug location attribute. .. TODO: Describe the syntax of the register live out machine operands. .. TODO: Describe the syntax of the machine memory operands. + +Debug-Info constructs +- + +Most of the debugging information in a MIR file is to be found in the metadata +of the embedded module. Within a machine function, that metadata is referred to +by various constructs to describe source locations and variable locations. + +Source locations + + +Every MIR instruction may optionally have a trailing reference to a +``DILocation`` metadata node, after all operands and symbols, but before +memory operands: + +.. code-block:: text + + $rbp = MOV64rr $rdi, debug-location !12 + +The source location attachment is synonymous with the ``!dbg`` metadata +attachment in LLVM-IR. The absence of a source location attachment will be +represented by an empty ``DebugLoc`` object in the machine instruction. + +Fixed variable locations + + +There are several ways of specifying variable locations. The simpliest is +describing a variable that is permanently located on the stack. In the stack +or fixedStack attribute of the machine function, the variable, scope, and +any qualifying location modifier are provided: + +.. code-block:: text + +- { id: 0, name: offset.addr, offset: -24, size: 8, alignment: 8, stack-id: default, + 4 debug-info-variable: '!1', debug-info-expression: '!DIExpression()', +debug-info-location: '!2' } + +Where: + +- ``debug-info-variable`` identifies a DILocalVariable metadata node, + +- ``debug-info-expression`` adds qualifiers to the variable location, + +- ``debug-info-location`` identifies a DILocation metadata node. + +These metadata attributes correspond to the operands of a ``llvm.dbg.declare`` +IR intrinsic, see the :ref:`source level debugging` +documentation. + +Varying variable locations +^^ + +Variables that are not always on the stack or change location are specified +with the ``DBG_VALUE`` meta machine instruction. It is synonymous with the +``llvm.dbg.value`` IR intrinsic, and is written: + +.. code-block:: text + +DBG_VALUE $rax, $noreg, !123, !DIExpression(), debug-location !456 + +The operands to which respectively: + +1. Identifies a machine location such as a register, immediate, or frame index, + +2. Is either $noreg, or immediate value zero if an extra level of indirection is to be added to the first operand, + +3. Identifies a ``DILocalVariable`` metadata node, + +4. Specifies an expression qualifying the variable location, either inline or as a metadata node reference, + +While the source location identifies the ``DILocation`` for the scope of the +variable. The second operand (``IsIndirect``) is deprecated and to be deleted. +All additional qualifiers for the variable location should be made through the +expression metadata. + +Instruction referencing locations +^ + +This experimental feature aims to separate the specification of variable +*values* from the program point where a variable takes on that value. Changes +in variable value occur in the same manner as ``DBG_VALUE`` meta instructions +but using ``DBG_INSTR_REF``. Variable values are identified by a pair of +instruction number and operand number. Consider the example below: + +.. code-block:: text + +$rbp = MOV64ri 0, debug-instr-number 1, debug-location !12 +DBG_INSTR_REF 1, 0, !123, !DIExpression(), debug-location !456 + +Instruction numbers are directly attached to machine instructions with an +optional ``debug-instr-number`` attachment, before the optional +``debug-location`` attachment. The value defined in ``$rbp`` in the code +above would be identified by the pair ``<1, 0>``. + +The first two operands of the ``
[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)
@@ -23,8 +24,8 @@ void f(int m) { } // CHECK: TypeSanitizer: type-aliasing-violation on address - // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing object of type long long - // CHECK-NEXT:in f violation-pr47137.c:30 + // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing object of type {{(long)+}} jmorse wrote: IMO having an optional (with `?`) long is preferable to having an unbound number of longs. https://github.com/llvm/llvm-project/pull/96507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)
@@ -18,7 +18,7 @@ int main(void) { // CHECK: TypeSanitizer: type-aliasing-violation on address // CHECK-NEXT: WRITE of size 8 at {{.+}} with type double accesses an existing object of type float - // CHECK-NEXT: in main violation-pr45282.c:25 + // CHECK-NEXT: in main {{.*violation-pr45282.c:25.*}} jmorse wrote: Nit: IMO having `{{.*}}` to swallow extra path components is slightly preferable, as that avoids the filename being part of a regex. Almost certainly not ever going to be a problem, but it reduces what can go wrong. Does your environment also have trailing components after the line number? If there's any whitespace between '5' and the extra text then FileCheck will (99% confident) ignore those anyway. Similar thoughts apply to all the other similar regexes. https://github.com/llvm/llvm-project/pull/96507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -if (!isAliasingLegal(td, OldTD)) +// When shadow memory is set for global objects, the entire object is tagged with the struct type +// This means that when you access a member variable, tysan reads that as you accessing a struct midway +// through, with 'i' being the offset +// Therefore, if you are accessing a struct, we need to find the member type. We can go through the +// members of the struct type and see if there is a member at the offset you are accessing the struct by. +// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality +// with that type. If there isn't, we run alias checking on the struct with will give us the correct error. +tysan_type_descriptor *InternalMember = OldTD; +if (OldTD->Tag == TYSAN_STRUCT_TD) { + for (int j = 0; j < OldTD->Struct.MemberCount; j++) { jmorse wrote: Style guide says `++j` https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
https://github.com/jmorse commented: I'm not familiar with TySan, but here's some style feedback and similar inline. The comments in tysan.cpp are too long (clang-format-bot would have bitten you if it was live when you uploaded this), please clang-format. https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
@@ -0,0 +1,31 @@ +// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct X { + int a, b, c; +} x; + +static struct X xArray[2]; + +int main() { + x.a = 1; + x.b = 2; + x.c = 3; + + printf("%d %d %d\n", x.a, x.b, x.c); + // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation + + for (size_t i = 0; i < 2; i++) { +xArray[i].a = 1; +xArray[i].b = 1; +xArray[i].c = 1; + } + + struct X *xPtr = (struct X *)&(xArray[0].c); + xPtr->a = 1; + // CHECK: ERROR: TypeSanitizer: type-aliasing-violation + // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) accesses an existing object of type int (in X at offset 8) + // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]] jmorse wrote: Is it possible to strengthen this test even further by making the latter two "CHECK-NEXT", to ensure the error appears all in one block? (Reduces the risk of spurious extra output matching CHECK lines). I'd recommend reducing the portions of the CHECK-line that are regexes to be as small as possible, so `{{.*}}struct-members.c` and similar. https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -if (!isAliasingLegal(td, OldTD)) +// When shadow memory is set for global objects, the entire object is tagged with the struct type +// This means that when you access a member variable, tysan reads that as you accessing a struct midway +// through, with 'i' being the offset +// Therefore, if you are accessing a struct, we need to find the member type. We can go through the +// members of the struct type and see if there is a member at the offset you are accessing the struct by. +// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality +// with that type. If there isn't, we run alias checking on the struct with will give us the correct error. jmorse wrote: ```suggestion // with that type. If there isn't, we run alias checking on the struct which will give us the correct error. ``` Doesn't parse without this change? https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
@@ -221,7 +221,17 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -if (!isAliasingLegal(td, OldTD)) +tysan_type_descriptor *InternalMember = OldTD; jmorse wrote: I feel like you should name the variable declared here "AccessedType" or similar, as the default-path is to check access to OldTD rather than an internal member of it, right? https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
@@ -0,0 +1,31 @@ +// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct X { + int a, b, c; +} x; + +static struct X xArray[2]; + +int main() { + x.a = 1; + x.b = 2; + x.c = 3; + + printf("%d %d %d\n", x.a, x.b, x.c); + // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation jmorse wrote: I'm not familiar with the output of TySan, but this CHECK-NOT paired with the CHECK on line 28 is going to test that "there's no error until there's an error", which makes this specific CHECK-NOT redundant. If you're looking to ensure there's only one instance of that error-output then you need to put another CHECK-NOT under the CHECK lines below, or add `--implicit-check-not` to FileCheck. Also consider just not having this CHECK-NOT line, if TySan will never produce more than one error. https://github.com/llvm/llvm-project/pull/95387 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/108385 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } + //You can't have negative offset, you must be partially inside the last type jmorse wrote: Space at start of comment; better to write the comment in a passive tone, "The offset can't be negative, thus we must be..." https://github.com/llvm/llvm-project/pull/108385 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
@@ -0,0 +1,31 @@ +// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct inner { + char buffer; + int i; +}; + +void init_inner(inner *iPtr) { + iPtr->i = 0; +} + +struct outer { + inner foo; +char buffer; +}; + +int main(void) { +outer *l = new outer(); + +init_inner(&l->foo); + +int access_offsets_with_different_base = l->foo.i; +printf("%d\n", access_offsets_with_different_base); + +return 0; +} + +// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation jmorse wrote: Nit: please also include a test for /something/ in the output to ensure that the test is positively checking for something. Otherwise we could replace %clangxx_tysan with any program that succeeds , such as `true`, and this test would pass. https://github.com/llvm/llvm-project/pull/108385 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/jmorse commented: (Minor style and test feedbacks) https://github.com/llvm/llvm-project/pull/108385 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
https://github.com/jmorse commented: Are there any expected interactions between atom-group-numbers and loading bitcode? i.e., if we serialise the literal atom-group-number to the output and then read it back in again, then it might conflict with atom-group-numbers seen in other functions in other bitcode files. It doesn't appear that they get re-numbered in the textual IR parsing patch for example. Possibly part of the design here is to simply not care, if it's only about internal consistency within a Function (does that hold after inlining too). Apologies if this is all explained in a later patch. The answers to that should ultimately be documented somewhere; I imagine that's in the patch stack or coming later. https://github.com/llvm/llvm-project/pull/133478 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133478 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
@@ -1366,6 +1367,43 @@ TEST_F(DILocationTest, discriminatorSpecialCases) { EXPECT_EQ(std::nullopt, L4->cloneByMultiplyingDuplicationFactor(0x1000)); } +TEST_F(DILocationTest, KeyInstructions) { + Context.pImpl->NextAtomGroup = 1; + + EXPECT_EQ(Context.pImpl->NextAtomGroup, 1u); + DILocation *A1 = DILocation::get(Context, 1, 0, getSubprogram(), nullptr, false, 1, 2); + // The group is only applied to the DILocation if the build has opted into + // the additional DILocation fields needed for the feature. jmorse wrote: Style nit: I feel "the build has opted into" is a bit too abstract, and is like the code referring to itself in the third person. "if we have been built with..." feels a lot cleaner IMHO, YMMV. https://github.com/llvm/llvm-project/pull/133478 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
@@ -335,6 +335,14 @@ class LLVMContext { StringRef getDefaultTargetFeatures(); void setDefaultTargetFeatures(StringRef Features); + /// Key Instructions: update the highest number atom group emitted for any + /// function. + void updateAtomGroupWaterline(uint64_t G); + + /// Key Instructions: get the next free atom group number and increment + /// the global tracker. + uint64_t incNextAtomGroup(); + jmorse wrote: IMO in isolation it's not clear that this is to do with debugging information and source locations; could we shoe-horn `DILocation` into the comments to make it clear what it affects? (Thinking purely about someone stumbling on this and not immediately knowing whether it's relevant to what they're studying) https://github.com/llvm/llvm-project/pull/133478 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -87,6 +87,8 @@ class ValueMap { using ValueMapCVH = ValueMapCallbackVH; using MapT = DenseMap>; using MDMapT = DenseMap; + /// Map {(InlinedAt, old atom number) -> new atom number}. + using DMAtomT = DenseMap, uint64_t>; jmorse wrote: Consider using SmallDenseMap simply to reduce the initial allocations in the non-debug-info codepath? https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags { /// Any global values not in value map are mapped to null instead of mapping /// to self. Illegal if RF_IgnoreMissingLocals is also set. RF_NullMapMissingGlobalValues = 8, + + /// Do not remap atom instances. Only safe if to do this if the cloned + /// instructions being remapped are inserted into a new function, or an + /// existing function where the inlined-at fields are updated. If in doubt, + /// don't use this flag. It's used for compiler performance reasons rather + /// than correctness. jmorse wrote: ```suggestion /// don't use this flag. It's used when remapping is known to be un-necessary /// to save some compile-time. ``` https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -117,9 +118,21 @@ struct ClonedCodeInfo { /// If you would like to collect additional information about the cloned /// function, you can specify a ClonedCodeInfo object with the optional fifth /// parameter. +/// +/// Set \p MapAtoms to false to skip mapping source atoms for later remapping. jmorse wrote: IMO "source-location atoms" to make it even clearer that this is a debugging feature. Also IMO it's better to discuss when this flag is necessary instead of when it's not necessary, as it'll enlighten the reader what it's for. AFAIUI, something like "Must be true when you duplicate a code path and a source line is intended to appear twice in the generated instructions. Can be set to false if you are transplanting code from one place to another". https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags { /// Any global values not in value map are mapped to null instead of mapping /// to self. Illegal if RF_IgnoreMissingLocals is also set. RF_NullMapMissingGlobalValues = 8, + + /// Do not remap atom instances. Only safe if to do this if the cloned + /// instructions being remapped are inserted into a new function, or an + /// existing function where the inlined-at fields are updated. If in doubt, + /// don't use this flag. It's used for compiler performance reasons rather + /// than correctness. jmorse wrote: IMO suggesting that it's not related to correctness is misleading, because the presence/absence of the flag can lead to correctness issues. Better to just avoid saying that and say something else as suggested. https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags { /// Any global values not in value map are mapped to null instead of mapping /// to self. Illegal if RF_IgnoreMissingLocals is also set. RF_NullMapMissingGlobalValues = 8, + + /// Do not remap atom instances. Only safe if to do this if the cloned jmorse wrote: IMO needs "source location atom" instead of just "atom" to ensure the random reader knows it's about debug-info. https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
https://github.com/jmorse approved this pull request. LGTM, code and tests are good. As ever all my comments are about comments and maintainability! https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -284,6 +291,9 @@ inline void RemapInstruction(Instruction *I, ValueToValueMapTy &VM, .remapInstruction(*I); } +/// Remap source atom. Called by RemapInstruction. jmorse wrote: IMO too terse; needs some purpose and context. https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)
https://github.com/jmorse approved this pull request. LGTM, although I'm not especially familiar with the transition to the new pass manager. https://github.com/llvm/llvm-project/pull/131563 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)
@@ -77,36 +78,68 @@ class LiveDebugValues : public MachineFunctionPass { AU.setPreservesCFG(); MachineFunctionPass::getAnalysisUsage(AU); } +}; + +struct LiveDebugValues { + LiveDebugValues(); + ~LiveDebugValues() = default; + bool run(MachineFunction &MF, bool ShouldEmitDebugEntryValues); private: std::unique_ptr InstrRefImpl; std::unique_ptr VarLocImpl; - TargetPassConfig *TPC = nullptr; MachineDominatorTree MDT; }; } // namespace -char LiveDebugValues::ID = 0; +char LiveDebugValuesLegacy::ID = 0; -char &llvm::LiveDebugValuesID = LiveDebugValues::ID; +char &llvm::LiveDebugValuesID = LiveDebugValuesLegacy::ID; -INITIALIZE_PASS(LiveDebugValues, DEBUG_TYPE, "Live DEBUG_VALUE analysis", false, -false) +INITIALIZE_PASS(LiveDebugValuesLegacy, DEBUG_TYPE, "Live DEBUG_VALUE analysis", +false, false) /// Default construct and initialize the pass. -LiveDebugValues::LiveDebugValues() : MachineFunctionPass(ID) { - initializeLiveDebugValuesPass(*PassRegistry::getPassRegistry()); +LiveDebugValuesLegacy::LiveDebugValuesLegacy() : MachineFunctionPass(ID) { + initializeLiveDebugValuesLegacyPass(*PassRegistry::getPassRegistry()); +} + +LiveDebugValues::LiveDebugValues() { InstrRefImpl = std::unique_ptr(llvm::makeInstrRefBasedLiveDebugValues()); VarLocImpl = std::unique_ptr(llvm::makeVarLocBasedLiveDebugValues()); } -bool LiveDebugValues::runOnMachineFunction(MachineFunction &MF) { +PreservedAnalyses +LiveDebugValuesPass::run(MachineFunction &MF, + MachineFunctionAnalysisManager &MFAM) { + if (!LiveDebugValues().run(MF, ShouldEmitDebugEntryValues)) +return PreservedAnalyses::all(); + auto PA = getMachineFunctionPassPreservedAnalyses(); jmorse wrote: As far as I'm aware this should preserve all information in analyses -- LDV will insert new DBG_VALUE instructions and should do absolutely nothing else. https://github.com/llvm/llvm-project/pull/131563 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NPM] Port LiveDebugValues to NPM (PR #131563)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/131563 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
https://github.com/jmorse commented: Looks good, although I think the atom-group selection could be refactored so that it's simpler to read. There's already multiple levels of nesting happening in that function, best to reduce that as much as possible (possibly put it in a completely different helper function to separate the concerns?) https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -226,8 +230,44 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) { bool SameCol = L1->getColumn() == L2->getColumn(); unsigned Line = SameLine ? L1->getLine() : 0; unsigned Col = SameLine && SameCol ? L1->getColumn() : 0; - -return DILocation::get(C, Line, Col, Scope, InlinedAt); +bool IsImplicitCode = L1->isImplicitCode() && L2->isImplicitCode(); +uint64_t Group = 0; +uint64_t Rank = 0; +if (SameLine) { + if (L1->getAtomGroup() || L2->getAtomGroup()) { jmorse wrote: Early-exit the lambda/delegate instead? https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) { auto *M2 = DILocation::getMergedLocation(A2, B); EXPECT_EQ(M1, M2); } + +#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(Group, M->getAtomGroup()); \ + EXPECT_EQ(Rank, M->getAtomRank()); +#else +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(0u, M->getAtomGroup()); \ + EXPECT_EQ(0u, M->getAtomRank()); \ + (void)Group; \ + (void)Rank; +#endif + // Identical, including source atom numbers. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +// DILocations are uniqued, so we can check equality by ptr. +EXPECT_EQ(M, DILocation::getMergedLocation(A, B)); + } + + // Identical but different atom ranks (same atom) - choose the lowest nonzero + // rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but different atom ranks (different atom) - choose the lowest + // nonzero rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but equal atom rank (different atom) - choose the lowest non-zero + // group (arbitrary choice for deterministic behaviour). + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Completely different except same atom numbers. Zero out the atoms. + { +auto *I = DILocation::get(Context, 2, 7, N); +auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1); +auto *B = +DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_EQ(0u, M->getLine()); +EXPECT_EQ(0u, M->getColumn()); +EXPECT_TRUE(isa(M->getScope())); +EXPECT_EQ(S, M->getScope()); +EXPECT_EQ(nullptr, M->getInlinedAt()); + } + + // Same inlined-at chain but different atoms. Choose the lowest + // non-zero group (arbitrary choice for deterministic behaviour). + { +auto *I = DILocation::get(Context, 1, 7, N); +auto *F = getSubprogram(); +auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2); +auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 1, 1, F, I, false, 1, 2); +B = DILocation::get(Context, 1, 1, F, I, false, 2, 0); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Partially equal inlined-at chain but different atoms. Generate a new atom + // group (if either have a group number). This configuration seems unlikely + // to occur as line numbers must match, but isn't impossible. + { jmorse wrote: I believe it's a desired and supported configuration for this to occur when a function is inlined multiple times then the copies are merged: we keep the line number of the instruction, but put line-zero in the inlinedAt field so that the inlining-callstack refers to a no
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) { auto *M2 = DILocation::getMergedLocation(A2, B); EXPECT_EQ(M1, M2); } + +#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(Group, M->getAtomGroup()); \ + EXPECT_EQ(Rank, M->getAtomRank()); +#else +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(0u, M->getAtomGroup()); \ + EXPECT_EQ(0u, M->getAtomRank()); \ + (void)Group; \ + (void)Rank; +#endif + // Identical, including source atom numbers. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +// DILocations are uniqued, so we can check equality by ptr. +EXPECT_EQ(M, DILocation::getMergedLocation(A, B)); + } + + // Identical but different atom ranks (same atom) - choose the lowest nonzero + // rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but different atom ranks (different atom) - choose the lowest + // nonzero rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but equal atom rank (different atom) - choose the lowest non-zero + // group (arbitrary choice for deterministic behaviour). + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Completely different except same atom numbers. Zero out the atoms. + { +auto *I = DILocation::get(Context, 2, 7, N); +auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1); +auto *B = +DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_EQ(0u, M->getLine()); +EXPECT_EQ(0u, M->getColumn()); +EXPECT_TRUE(isa(M->getScope())); +EXPECT_EQ(S, M->getScope()); +EXPECT_EQ(nullptr, M->getInlinedAt()); + } + + // Same inlined-at chain but different atoms. Choose the lowest + // non-zero group (arbitrary choice for deterministic behaviour). + { +auto *I = DILocation::get(Context, 1, 7, N); +auto *F = getSubprogram(); +auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2); +auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); jmorse wrote: This is an awkwardness I forsee in the future: because these are just integers, I tend to forget which way around things are. And so in this example: isn't the first integer to EXPECT_ATOM the group, which according to the comment here should be the lowest, but you're expecting the highest (2 instead of 1)? (Looks like the rank-test is applied before the atom test?) https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/jmorse commented: Looks good, question inline. I worried a little about whether someone out there might inadvertantly re-use this function for some other purpose, but it looks like it's dedicated to `llvm::foldBranchToCommonDest`, so pretty much a single use case. https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][debugify] Add --debugify-atoms to add key instructions metadata (PR #133483)
https://github.com/jmorse approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/133483 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
@@ -1129,13 +1130,17 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses( Instruction *NewBonusInst = BonusInst.clone(); -if (!isa(BonusInst) && -PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) { - // Unless the instruction has the same !dbg location as the original - // branch, drop it. When we fold the bonus instructions we want to make - // sure we reset their debug locations in order to avoid stepping on - // dead code caused by folding dead branches. - NewBonusInst->setDebugLoc(DebugLoc()); +if (!isa(BonusInst)) { + if (!NewBonusInst->getDebugLoc().isSameSourceLocation( + PTI->getDebugLoc())) { +// Unless the instruction has the same !dbg location as the original +// branch, drop it. When we fold the bonus instructions we want to make +// sure we reset their debug locations in order to avoid stepping on +// dead code caused by folding dead branches. +NewBonusInst->setDebugLoc(DebugLoc()); + } else if (const DebugLoc &DL = NewBonusInst->getDebugLoc()) { +mapAtomInstance(DL, VMap); jmorse wrote: What are the consequences for this when stepping -- this is an abnormal situation seeing how the source location is identical to where it's being hoisted to. Will developers potentially step on the same source location twice? I understand the general situation of "code that is duplicated needs new atoms, because there are now multiple key instructions", but wouldn't this be different when the source-location at PTI potentially becomes key _because_ we've remapped a key instruction we've hoisted to beneath it? https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
@@ -0,0 +1,81 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt %s -S -passes=simplifycfg -bonus-inst-threshold=2 | FileCheck %s + +;; Block d gets folded into preds b and c. Check the cloned instructions get jmorse wrote: Copy the block diagram from this review into here? https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (PR #133484)
https://github.com/jmorse approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/133484 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
https://github.com/jmorse approved this pull request. LGTM with nit https://github.com/llvm/llvm-project/pull/133485 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133485 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (PR #133485)
@@ -1821,9 +1821,9 @@ static DebugLoc inlineDebugLoc(DebugLoc OrigDL, DILocation *InlinedAt, /// to encode location where these instructions are inlined. static void fixupLineNumbers(Function *Fn, Function::iterator FI, Instruction *TheCall, bool CalleeHasDebugInfo) { - const DebugLoc &TheCallDL = TheCall->getDebugLoc(); - if (!TheCallDL) + if (!TheCall->getDebugLoc()) return; + DebugLoc TheCallDL = TheCall->getDebugLoc().get()->getOrCloneWithoutAtom(); jmorse wrote: Comment on why this is necessary desired. https://github.com/llvm/llvm-project/pull/133485 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)
@@ -2314,6 +2326,7 @@ void JumpThreadingPass::threadThroughTwoBasicBlocks(BasicBlock *PredPredBB, {DominatorTree::Insert, PredPredBB, NewBB}, {DominatorTree::Delete, PredPredBB, PredBB}}); + remapSourceAtoms(ValueMapping, NewBB->begin(), NewBB->end()); jmorse wrote: I'd say "comment pls", but I suppose the fact there's "source" in the name gives a hint as to what's going on. https://github.com/llvm/llvm-project/pull/133486 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)
@@ -0,0 +1,68 @@ +; RUN: opt -S -passes=jump-threading,verify %s | FileCheck %s + +;; Modified from llvm/test/Transforms/JumpThreading/thread-two-bbs.ll +;; +;; JumpThreading duplicates bb.cond2 to thread through bb.file to bb.f2 or exit. +;; +;; Check the duplicated instructions get remapped atom groups. + +; CHECK: bb.cond2: jmorse wrote: CHECK-LABEL? https://github.com/llvm/llvm-project/pull/133486 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)
@@ -2087,8 +2097,10 @@ void JumpThreadingPass::cloneInstructions(ValueToValueMapTy &ValueMapping, adaptNoAliasScopes(New, ClonedScopes, Context); CloneAndRemapDbgInfo(New, &*BI); +if (const DebugLoc &DL = New->getDebugLoc()) + mapAtomInstance(DL, ValueMapping); -if (RetargetDbgValueIfPossible(New)) + if (RetargetDbgValueIfPossible(New)) jmorse wrote: Accidental whitespace https://github.com/llvm/llvm-project/pull/133486 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)
https://github.com/jmorse approved this pull request. LGTM with nits https://github.com/llvm/llvm-project/pull/133486 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms in blocks duplicated for threading (PR #133486)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133486 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133487 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)
@@ -0,0 +1,78 @@ +; RUN: opt %s --passes=jump-threading -S -o - -S | FileCheck %s + +;;+-> T1 -+ +;;| v +-> T2 +;; Entry -+ Merge -+ +;;| ^ +-> F2 +;;+-> F1 -+ +;; +;; Thread T1 -> T2 and F1 -> F2 through Merge. +;; +;;+-> T1 +;;| +;; Entry -+ +;;| +;;+-> F1 jmorse wrote: Diagram should include T2 / F2, or is that deliberate? https://github.com/llvm/llvm-project/pull/133487 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms after threading (PR #133487)
https://github.com/jmorse commented: The previous patch in this stack #133486 calls remapSourceAtoms before calling calling threadEdge; and this patch adds a call to remapSourceAtoms into threadEdge, so it'll be called twice. Is this just inconsequential or deliberate, or meaningless? https://github.com/llvm/llvm-project/pull/133487 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][JumpThreading] Remap atoms duping bb with cond br on phi into pred (PR #133488)
https://github.com/jmorse approved this pull request. Whilst being some complicated threading, LGTM https://github.com/llvm/llvm-project/pull/133488 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133489 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)
@@ -752,6 +752,14 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI, } } + // Remap source location atom instance. Do this now, rather than + // when we remap instructions, because remap is called once we've + // cloned all blocks (all the clones would get the same atom + // number). jmorse wrote: I get why all the clones need to get distinct atom numbers; what is it here that ensures that's the case, don't we need to reset something in VMap to ensure that's the case? i.e., if we called RemapSourceAtom later what would be different? (I get the overall picture, but not how it's achieved here.) https://github.com/llvm/llvm-project/pull/133489 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopUnroll] Remap atoms while unrolling (PR #133489)
https://github.com/jmorse commented: Looks good, but question inline https://github.com/llvm/llvm-project/pull/133489 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopRotate] Remap atoms of duplicated instructions (PR #133490)
https://github.com/jmorse approved this pull request. LGTM (although the hand-written tests are more fun and easier to read) https://github.com/llvm/llvm-project/pull/133490 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopRotate] Remap atoms of duplicated instructions (PR #133490)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133490 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][LoopUnswitch] Remap cloned instructions' atoms (PR #133491)
https://github.com/jmorse approved this pull request. Seems fine, but the test is hard to decipher, future generations will praise us if it can be simplified. https://github.com/llvm/llvm-project/pull/133491 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Propagate DebugLocs on phis in BreakCriticalEdges (PR #133492)
jmorse wrote: It can't hurt, let's making things slightly more correct! https://github.com/llvm/llvm-project/pull/133492 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. + if (!Buoy || + Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) { +Buoy = &MI; +BuoyAtom = 0; + } + + // Call instructions are handled specially - we always mark them as key + // regardless of atom info. + const auto &TII = + *MI.getParent()->getParent()->getSubtarget().getInstrInfo(); + if (MI.isCall() || TII.isTailCall(MI)) { +assert(MI.getDebugLoc() && "Unexpectedly missing DL"); + +// Calls are always key. +KeyInstructions.insert(Buoy); + +uint64_t Group = MI.getDebugLoc()->getAtomGroup(); +uint8_t Rank = MI.getDebugLoc()->getAtomRank(); +if (Group && Rank) { + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, Group}]; + + // This looks similar to the non-call handling code, except that + // we don't put the call into CandidateInsts so that they can't be + // made un-key. As a result, we also have to take special care not + // to erase the is_stmt from the buoy, and prevent that happening + // in the future. + + if (CandidateRank == Rank) { +// We've seen other instructions in this group of this rank. Discard +// ones we've seen in this block, keep the others. +assert(!CandidateInsts.empty()); +SmallVector Insts; +Insts.reserve(CandidateInsts.size()); +for (auto &PrevInst : CandidateInsts) { + if (PrevInst->getParent() != MI.getParent()) +Insts.push_back(PrevInst); + else if (PrevInst != Buoy) +KeyInstructions.erase(PrevInst); +} + +if (Insts.empty()) { + CandidateInsts.clear(); + CandidateRank = 0; +} else { + CandidateInsts = std::move(Insts); +} + + } else if (CandidateRank > Rank) { +// We've seen other instructions in this group of lower precedence +// (higher rank). Discard them. +for (auto *Supplanted : CandidateInsts) { + // Don't erase the is_stmt we're using for this call. + if (Supplanted != Buoy) +KeyInstructions.erase(Supplanted); +} +CandidateInsts.clear(); +CandidateRank = 0; + } +} + +// Avoid floating any future is_stmts up to the call. +Buoy = nullptr; +continue; + } + + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + uint64_t Group = MI.getDebugLoc()->getAtomGroup(); + uint8_t Rank = MI.getDebugLoc()->getAtomRank(); + if (!Group || !Rank) +continue; + + // Don't let is_stmts float past instructions from different source atoms. + if (BuoyAtom && BuoyAtom != Group) { +Buoy = &MI; +BuoyAtom = MI.getDebugLoc()->getAtomGroup(); jmorse wrote: ```suggestion BuoyAtom = Group; ``` Avo
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -0,0 +1,78 @@ +# RUN: llc %s --start-after=livedebugvalues --dwarf-use-key-instructions --filetype=obj -o - \ +# RUN: | llvm-objdump -d - --no-show-raw-insn \ +# RUN: | FileCheck %s --check-prefix=OBJ + +# RUN: llc %s --start-after=livedebugvalues --dwarf-use-key-instructions --filetype=obj -o - \ +# RUN: | llvm-dwarfdump - --debug-line \ +# RUN: | FileCheck %s --check-prefix=DBG + +# OBJ: <_Z1fPiii>: +# OBJ-NEXT: 0: movl$0x0, %ebx +# OBJ-NEXT: 5: movl$0x1, %ebx +# OBJ-NEXT: a: movl$0x2, %ebx +# OBJ-NEXT: f: movl$0x3, %ebx +# OBJ-NEXT: 14: movl$0x4, %eax +# OBJ-NEXT: 19: movl$0x5, %eax +# OBJ-NEXT: 1e: movl$0x6, %eax +# OBJ-NEXT: 23: movl$0x7, %eax +# OBJ-NEXT: 28: retq + +# DBG: AddressLine Column File ISA Discriminator OpIndex Flags +# DBG-NEXT: -- -- -- -- --- - --- - +# DBG-NEXT: 0x 1 0 0 0 0 0 is_stmt prologue_end +# DBG-NEXT: 0x0005 2 0 0 0 0 0 is_stmt +# DBG-NEXT: 0x000a 2 0 0 0 0 0 +# DBG-NEXT: 0x000f 2 0 0 0 0 0 +# DBG-NEXT: 0x0014 2 0 0 0 0 0 +# DBG-NEXT: 0x0019 2 0 0 0 0 0 +# DBG-NEXT: 0x001e 2 0 0 0 0 0 is_stmt +# DBG-NEXT: 0x0023 2 0 0 0 0 0 is_stmt +# DBG-NEXT: 0x0029 2 0 0 0 0 0 is_stmt end_sequence + +## Check that interleaving atoms on the same line still produces reasonable +## is_stmt placement (the is_stmts want to "float up" to the first instruction +## in a contiguous set with the same line, but we don't let them float past +## other atom groups). + +--- | + target triple = "x86_64-unknown-linux-gnu" + + define hidden noundef i32 @_Z1fPiii(ptr %a, i32 %b, i32 %c, i1 %cond) local_unnamed_addr !dbg !5 { + entry: +ret i32 2 + } + + !llvm.dbg.cu = !{!0} + !llvm.module.flags = !{!2, !3} + !llvm.ident = !{!4} + + !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_17, file: !1, producer: "clang version 19.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None) + !1 = !DIFile(filename: "test.cpp", directory: "/") + !2 = !{i32 7, !"Dwarf Version", i32 5} + !3 = !{i32 2, !"Debug Info Version", i32 3} + !4 = !{!"clang version 19.0.0"} + !5 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 1, type: !6, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0) + !6 = !DISubroutineType(types: !7) + !7 = !{} + !8 = !DILocalVariable(name: "x", scope: !5, file: !1, line: 1, type: !7) + +... +--- +name:_Z1fPiii +alignment: 16 +body: | + bb.0.entry: +$ebx = MOV32ri 0, debug-location !DILocation(line: 1, scope: !5) +;; is_stmt floats up here from mov 3. +$ebx = MOV32ri 1, debug-location !DILocation(line: 2, scope: !5, atomGroup: 1, atomRank: 1) +$ebx = MOV32ri 2, debug-location !DILocation(line: 2, scope: !5, atomGroup: 1, atomRank: 2) +$ebx = MOV32ri 3, debug-location !DILocation(line: 2, scope: !5, atomGroup: 1, atomRank: 1) +$eax = MOV32ri 4, debug-location !DILocation(line: 2, scope: !5) +$eax = MOV32ri 5, debug-location !DILocation(line: 2, scope: !5, atomGroup: 2, atomRank: 1) jmorse wrote: My understanding is that this instruction should get an is_stmt in its own entry because it's a separate span of atom group 2, is that right? But doesn't (address 0x19 above?) as far as I can tell. https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -0,0 +1,117 @@ +; RUN: llc %s --filetype=obj -o - --dwarf-use-key-instructions \ +; RUN: | llvm-objdump -d - --no-show-raw-insn \ +; RUN: | FileCheck %s --check-prefix=OBJ + +; RUN: llc %s --filetype=obj -o - --dwarf-use-key-instructions \ +; RUN: | llvm-dwarfdump - --debug-line \ +; RUN: | FileCheck %s --check-prefix=DBG + +; OBJ: : +; OBJ-NEXT: 0: pushq %rbp +; OBJ-NEXT: 1: pushq %r14 +; OBJ-NEXT: 3: pushq %rbx +; OBJ-NEXT: 4: movq(%rip), %rax +; OBJ-NEXT: b: movl(%rax), %ebp +; OBJ-NEXT: d: callq 0x12 +; OBJ-NEXT: 12: callq 0x17 +; OBJ-NEXT: 17: movl%eax, %ebx +; OBJ-NEXT: 19: addl%ebp, %ebx +; OBJ-NEXT: 1b: movq(%rip), %r14 +; OBJ-NEXT: 22: movl$0x1, (%r14) +; OBJ-NEXT: 29: callq 0x2e +; OBJ-NEXT: 2e: movl$0x2, (%r14) +; OBJ-NEXT: 35: callq 0x3a +; OBJ-NEXT: 3a: movl$0x3, (%r14) +; OBJ-NEXT: 41: callq 0x46 +; OBJ-NEXT: 46: movl$0x4, (%r14) +; OBJ-NEXT: 4d: callq 0x52 +; OBJ-NEXT: 52: movl%ebx, %eax +; OBJ-NEXT: 54: popq%rbx +; OBJ-NEXT: 55: popq%r14 +; OBJ-NEXT: 57: popq%rbp +; OBJ-NEXT: 58: retq + +; DBG: AddressLine Column File ISA Discriminator OpIndex Flags +; DBG-NEXT: -- -- -- -- --- - --- - +; DBG-NEXT: 0x 1 0 0 0 0 0 is_stmt +; DBG-NEXT: 0x0004 2 0 0 0 0 0 is_stmt prologue_end + +;; Test A: +;; Check the 1st call (line 3) gets is_stmt despite having no atom group. +; DBG-NEXT: 0x000d 3 0 0 0 0 0 is_stmt + +;; Test B: +;; Check the 2nd call (line 4) gets is_stmt applied despite being part of group +;; 1 and having lower precedence than the add. Check that the add stil gets +;; is_stmt applied. +;; There are two is_stmt line 4 entries are is_stmt because we don't float jmorse wrote: ```suggestion ;; There are two is_stmt line 4 entries because we don't float ``` https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
https://github.com/jmorse commented: Tests: I'd personally prefer the input source and explanation at the top of the file, although this is a style thing. My understanding of this code is that within a basic block, it should be possible for there to be two sequences of instructions of equal group and rank that both get a buoy and is_stmt, if they're separated by some other atom group. Perhaps I'm wrong; but if I'm right, it probably wants explicit test coverage. There's a risk that the code being generated is brittle and frequently needs updating; I don't know what probability of that we find acceptable, let's just see how often it needs updating I guess. https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/jmorse approved this pull request. I continued my ramblings inline; I see the potential for some slightly unexpected stepping behaviour, but still strictly better than todays "everything is a breakpoint" approach. I don't think I can put my finger on a specific bad behaviour that might come from this, so LGTM. (Would still be nice to have the diagram in the test). https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
@@ -1243,6 +1243,140 @@ TEST_F(DILocationTest, Merge) { auto *M2 = DILocation::getMergedLocation(A2, B); EXPECT_EQ(M1, M2); } + +#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(Group, M->getAtomGroup()); \ + EXPECT_EQ(Rank, M->getAtomRank()); +#else +#define EXPECT_ATOM(Loc, Group, Rank) \ + EXPECT_EQ(0u, M->getAtomGroup()); \ + EXPECT_EQ(0u, M->getAtomRank()); \ + (void)Group; \ + (void)Rank; +#endif + // Identical, including source atom numbers. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +// DILocations are uniqued, so we can check equality by ptr. +EXPECT_EQ(M, DILocation::getMergedLocation(A, B)); + } + + // Identical but different atom ranks (same atom) - choose the lowest nonzero + // rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but different atom ranks (different atom) - choose the lowest + // nonzero rank. + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 0); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 2); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 2u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Identical but equal atom rank (different atom) - choose the lowest non-zero + // group (arbitrary choice for deterministic behaviour). + { +auto *A = DILocation::get(Context, 2, 7, N, nullptr, false, 1, 1); +auto *B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 1u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + +A = DILocation::get(Context, 2, 7, N, nullptr, false, 0, 1); +B = DILocation::get(Context, 2, 7, N, nullptr, false, 2, 1); +M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); +EXPECT_EQ(M, DILocation::getMergedLocation(B, A)); + } + + // Completely different except same atom numbers. Zero out the atoms. + { +auto *I = DILocation::get(Context, 2, 7, N); +auto *A = DILocation::get(Context, 1, 6, S, I, false, 1, 1); +auto *B = +DILocation::get(Context, 2, 7, getSubprogram(), nullptr, false, 1, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_EQ(0u, M->getLine()); +EXPECT_EQ(0u, M->getColumn()); +EXPECT_TRUE(isa(M->getScope())); +EXPECT_EQ(S, M->getScope()); +EXPECT_EQ(nullptr, M->getInlinedAt()); + } + + // Same inlined-at chain but different atoms. Choose the lowest + // non-zero group (arbitrary choice for deterministic behaviour). + { +auto *I = DILocation::get(Context, 1, 7, N); +auto *F = getSubprogram(); +auto *A = DILocation::get(Context, 1, 1, F, I, false, 1, 2); +auto *B = DILocation::get(Context, 1, 1, F, I, false, 2, 1); +auto *M = DILocation::getMergedLocation(A, B); +EXPECT_ATOM(M, 2u, 1u); jmorse wrote: No real suggestions; I think the C++17 way might be to invent user-defined-literals and an integer-like object, but that seems like waaa overkill here. Or maybe enum classes? The unit tests will also (IMHO YMMV?) be the only place where literals get fed into this. An ugly alternative is to put the pattern in the name, i.e. "DILocation::getWithAtomRank", so that every time it appears you're reminded what argument order it should be. https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
https://github.com/jmorse approved this pull request. LGTM with the style nits https://github.com/llvm/llvm-project/pull/133478 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. + if (!Buoy || + Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) { +Buoy = &MI; +BuoyAtom = 0; + } + + // Call instructions are handled specially - we always mark them as key + // regardless of atom info. + const auto &TII = + *MI.getParent()->getParent()->getSubtarget().getInstrInfo(); + if (MI.isCall() || TII.isTailCall(MI)) { +assert(MI.getDebugLoc() && "Unexpectedly missing DL"); + +// Calls are always key. +KeyInstructions.insert(Buoy); + +uint64_t Group = MI.getDebugLoc()->getAtomGroup(); +uint8_t Rank = MI.getDebugLoc()->getAtomRank(); +if (Group && Rank) { + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, Group}]; + + // This looks similar to the non-call handling code, except that + // we don't put the call into CandidateInsts so that they can't be + // made un-key. As a result, we also have to take special care not + // to erase the is_stmt from the buoy, and prevent that happening + // in the future. + + if (CandidateRank == Rank) { +// We've seen other instructions in this group of this rank. Discard +// ones we've seen in this block, keep the others. +assert(!CandidateInsts.empty()); +SmallVector Insts; +Insts.reserve(CandidateInsts.size()); +for (auto &PrevInst : CandidateInsts) { + if (PrevInst->getParent() != MI.getParent()) +Insts.push_back(PrevInst); + else if (PrevInst != Buoy) +KeyInstructions.erase(PrevInst); +} + +if (Insts.empty()) { + CandidateInsts.clear(); + CandidateRank = 0; +} else { + CandidateInsts = std::move(Insts); +} + + } else if (CandidateRank > Rank) { +// We've seen other instructions in this group of lower precedence +// (higher rank). Discard them. +for (auto *Supplanted : CandidateInsts) { + // Don't erase the is_stmt we're using for this call. + if (Supplanted != Buoy) +KeyInstructions.erase(Supplanted); +} +CandidateInsts.clear(); +CandidateRank = 0; + } +} + +// Avoid floating any future is_stmts up to the call. +Buoy = nullptr; +continue; + } + + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + uint64_t Group = MI.getDebugLoc()->getAtomGroup(); + uint8_t Rank = MI.getDebugLoc()->getAtomRank(); + if (!Group || !Rank) +continue; + + // Don't let is_stmts float past instructions from different source atoms. + if (BuoyAtom && BuoyAtom != Group) { +Buoy = &MI; +BuoyAtom = MI.getDebugLoc()->getAtomGroup(); + } + + auto &[CandidateRank, CandidateInsts] = GroupCandidates[{Inli
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; jmorse wrote: This will be big due to each dense-map element containing a SmallVector of pointers; IMO we need to do some profiling and set a SmallVector allocation size. https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. + if (!Buoy || + Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) { +Buoy = &MI; +BuoyAtom = 0; + } + + // Call instructions are handled specially - we always mark them as key + // regardless of atom info. + const auto &TII = + *MI.getParent()->getParent()->getSubtarget().getInstrInfo(); + if (MI.isCall() || TII.isTailCall(MI)) { +assert(MI.getDebugLoc() && "Unexpectedly missing DL"); + +// Calls are always key. +KeyInstructions.insert(Buoy); + +uint64_t Group = MI.getDebugLoc()->getAtomGroup(); +uint8_t Rank = MI.getDebugLoc()->getAtomRank(); +if (Group && Rank) { + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + auto &[CandidateRank, CandidateInsts] = GroupCandidates[{InlinedAt, Group}]; + + // This looks similar to the non-call handling code, except that + // we don't put the call into CandidateInsts so that they can't be + // made un-key. As a result, we also have to take special care not + // to erase the is_stmt from the buoy, and prevent that happening + // in the future. + + if (CandidateRank == Rank) { +// We've seen other instructions in this group of this rank. Discard +// ones we've seen in this block, keep the others. +assert(!CandidateInsts.empty()); +SmallVector Insts; +Insts.reserve(CandidateInsts.size()); +for (auto &PrevInst : CandidateInsts) { + if (PrevInst->getParent() != MI.getParent()) +Insts.push_back(PrevInst); + else if (PrevInst != Buoy) +KeyInstructions.erase(PrevInst); +} + +if (Insts.empty()) { + CandidateInsts.clear(); + CandidateRank = 0; +} else { + CandidateInsts = std::move(Insts); +} + + } else if (CandidateRank > Rank) { +// We've seen other instructions in this group of lower precedence +// (higher rank). Discard them. +for (auto *Supplanted : CandidateInsts) { + // Don't erase the is_stmt we're using for this call. + if (Supplanted != Buoy) +KeyInstructions.erase(Supplanted); +} +CandidateInsts.clear(); +CandidateRank = 0; + } +} + +// Avoid floating any future is_stmts up to the call. +Buoy = nullptr; +continue; + } + + auto *InlinedAt = MI.getDebugLoc()->getInlinedAt(); + uint64_t Group = MI.getDebugLoc()->getAtomGroup(); + uint8_t Rank = MI.getDebugLoc()->getAtomRank(); + if (!Group || !Rank) +continue; + + // Don't let is_stmts float past instructions from different source atoms. + if (BuoyAtom && BuoyAtom != Group) { +Buoy = &MI; +BuoyAtom = MI.getDebugLoc()->getAtomGroup(); + } + + auto &[CandidateRank, CandidateInsts] = GroupCandidates[{Inli
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2087,13 +2095,18 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) { // If we have an ongoing unspecified location, nothing to do here. if (!DL) return; -// We have an explicit location, same as the previous location. -// But we might be coming back to it after a line 0 record. -if ((LastAsmLine == 0 && DL.getLine() != 0) || Flags) { - // Reinstate the source location but not marked as a statement. - RecordSourceLine(DL, Flags); + +// Skip this if the instruction is Key, else we might accidentally miss an +// is_stmt. +if (!IsKey) { jmorse wrote: Can we not fold this test, and the comment, into the existing test? "If we have an explicit non-key location". (Avoids some un-necessary indentation). https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. + if (!Buoy || + Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) { +Buoy = &MI; +BuoyAtom = 0; + } + + // Call instructions are handled specially - we always mark them as key + // regardless of atom info. + const auto &TII = + *MI.getParent()->getParent()->getSubtarget().getInstrInfo(); + if (MI.isCall() || TII.isTailCall(MI)) { +assert(MI.getDebugLoc() && "Unexpectedly missing DL"); + +// Calls are always key. +KeyInstructions.insert(Buoy); + +uint64_t Group = MI.getDebugLoc()->getAtomGroup(); +uint8_t Rank = MI.getDebugLoc()->getAtomRank(); +if (Group && Rank) { jmorse wrote: Early-continue instead perhaps? https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
https://github.com/jmorse commented: I get the impression that `GroupCandidates` and `KeyInstructions` are being kept strictly in sync; thus couldn't one instead just load KeyInstructions from GroupCandidates once the scan is complete? This avoids filling up the dense map with tombstones. Am I right in understanding that the buoy means the "least precedence" instruction will get the is_stmt if the highest precedence appears after it in the contiguous blob? (Seems fine, just making sure I understand). On the whole, the computation function feels like it could be simpler, but in some intangible way I'm not immediately sure of. (Still reading the tests). https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. jmorse wrote: ```suggestion // Reset the Buoy to this instruction if it has a different line number. ``` https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][DwarfDebug] Add is_stmt emission support (PR #133495)
@@ -2333,6 +2352,170 @@ DwarfDebug::emitInitialLocDirective(const MachineFunction &MF, unsigned CUID) { return PrologEndLoc; } +void DwarfDebug::findKeyInstructions(const MachineFunction *MF) { + // New function - reset KeyInstructions. + KeyInstructions.clear(); + + // The current candidate is_stmt instructions for each source atom. + // Map {(InlinedAt, Group): (Rank, Instructions)}. + DenseMap, + std::pair>> + GroupCandidates; + + // For each instruction: + // * Skip insts without DebugLoc, AtomGroup or AtomRank, and line zeros. + // * Check if insts in this group have been seen already in GroupCandidates. + // * If this instr rank is equal, add this instruction to KeyInstructions. + // Remove existing instructions from KeyInstructions if they have the + // same parent. + // * If this instr rank is higher (lower precedence), ignore it. + // * If this instr rank is lower (higher precedence), erase existing + // instructions from KeyInstructions. Add this instr to KeyInstructions. + + for (auto &MBB : *MF) { +// Rather than apply is_stmt directly to Key Instructions, we "float" +// is_stmt up to the 1st instruction with the same line number in a +// contiguous block. That instruction is called the "buoy". The +// buoy gets reset if we encouner an instruction with an atom +// group. +const MachineInstr *Buoy = nullptr; +// The atom group number associated with Buoy which may be 0 if we haven't +// encountered an atom group yet in this blob of instructions with the same +// line number. +uint64_t BuoyAtom = 0; + +for (auto &MI : MBB) { + if (MI.isMetaInstruction()) +continue; + + if (!MI.getDebugLoc() || !MI.getDebugLoc().getLine()) +continue; + + // Reset the Buoy to this instruciton if it has a different line number. + if (!Buoy || + Buoy->getDebugLoc().getLine() != MI.getDebugLoc().getLine()) { +Buoy = &MI; +BuoyAtom = 0; + } + + // Call instructions are handled specially - we always mark them as key + // regardless of atom info. + const auto &TII = + *MI.getParent()->getParent()->getSubtarget().getInstrInfo(); + if (MI.isCall() || TII.isTailCall(MI)) { +assert(MI.getDebugLoc() && "Unexpectedly missing DL"); + +// Calls are always key. +KeyInstructions.insert(Buoy); jmorse wrote: ```suggestion KeyInstructions.insert(&MI); ``` Avoids any doubt about what's being inserted. https://github.com/llvm/llvm-project/pull/133495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
@@ -1129,13 +1130,17 @@ static void cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses( Instruction *NewBonusInst = BonusInst.clone(); -if (!isa(BonusInst) && -PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) { - // Unless the instruction has the same !dbg location as the original - // branch, drop it. When we fold the bonus instructions we want to make - // sure we reset their debug locations in order to avoid stepping on - // dead code caused by folding dead branches. - NewBonusInst->setDebugLoc(DebugLoc()); +if (!isa(BonusInst)) { + if (!NewBonusInst->getDebugLoc().isSameSourceLocation( + PTI->getDebugLoc())) { +// Unless the instruction has the same !dbg location as the original +// branch, drop it. When we fold the bonus instructions we want to make +// sure we reset their debug locations in order to avoid stepping on +// dead code caused by folding dead branches. +NewBonusInst->setDebugLoc(DebugLoc()); + } else if (const DebugLoc &DL = NewBonusInst->getDebugLoc()) { +mapAtomInstance(DL, VMap); jmorse wrote: For the first part I think that answers the question (multiple instructions with the same source-location getting is_stmt). Would there also be potential for an intervening non-is_stmt instruction with the same source line between the two? It feels possible from instruction scheduling if not immediately in this pass, and if so I guess it's a general consequence of the technique. It doesn't seem unreasonable, although perhaps we need some (debuggers) test coverage of how to interpret it. After all we're not trying to program the debugger from the compiler, we're communicating that "this source location really does happen to execute twice". > I'm not sure what you mean about PTI's location becoming key? e.g. looking at > the test - the new branches in b and c are only "key" because they're clones > of the cond br from d (which is already "key"). I was wondering whether there can be a case where the instruction at PTI and the bonus instruction are in the same group, but the one being remapped here has the higher rank, and putting it in a different group causes the previously-lower-ranked instruction to become the highest ranked as a result. At face value this isn't a problem because the whole point of this function is we're duplicating a code path; but could there be scenarios where LLVM uses this utility to implement a move (i.e. clone-then-delete-old-path?) Or to put it another way: I believe (but may be wrong) that `cloneInstructionsIntoPredecessorBlockAndUpdateSSAUses` currently copies/moves instructions from one place to another. But, with this change, what was a plain copy/move now comes with changes to the stepping behaviour. I think this boils down to saying "so what?", and the answer to that is "the stepping is still superior to without key instructions". I'm just trying to think through the consequences of how these changes compose together. https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr][SimplifyCFG] Remap atoms when folding br to common succ into pred (PR #133482)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133493 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)
@@ -0,0 +1,52 @@ +; RUN: opt -passes='require,function(codegenprepare)' -S -mtriple=x86_64 < %s \ +; RUN: | FileCheck %s + +;; Check debug locations are propagated onto new PHIs. jmorse wrote: "...and that source locations have their atom groups remapped" https://github.com/llvm/llvm-project/pull/133493 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Remap cloned PHIs in BreakCriticalEdges (PR #133493)
https://github.com/jmorse approved this pull request. LGTM with nit https://github.com/llvm/llvm-project/pull/133493 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Add MIR parser support (PR #133494)
https://github.com/jmorse approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/133494 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [KeyInstr] Merge atoms in DILocation::getMergedLocation (PR #133480)
jmorse wrote: Github has helpfully chucked away my comments; was the unit-test-error that I fixed corrected? If so, LGTM. https://github.com/llvm/llvm-project/pull/133480 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,2348 @@ +//===-- LVIRReader.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This implements the LVIRReader class. +// It supports LLVM text IR and bitcode format. +// +//===--===// + +#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h" +#include "llvm/CodeGen/DebugHandlerBase.h" +#include "llvm/DebugInfo/LogicalView/Core/LVLine.h" +#include "llvm/DebugInfo/LogicalView/Core/LVScope.h" +#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h" +#include "llvm/DebugInfo/LogicalView/Core/LVType.h" +#include "llvm/IR/DebugInfoMetadata.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/Module.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Object/Error.h" +#include "llvm/Object/IRObjectFile.h" +#include "llvm/Support/FormatAdapters.h" +#include "llvm/Support/FormatVariadic.h" +#include "llvm/Support/SourceMgr.h" + +using namespace llvm; +using namespace llvm::object; +using namespace llvm::logicalview; + +#define DEBUG_TYPE "IRReader" + +// Extra debug traces. Default is false +#define DEBUG_ALL + +// These flavours of DINodes are not handled: +// DW_TAG_APPLE_property = 19896 +// DW_TAG_atomic_type = 71 +// DW_TAG_common_block = 26 +// DW_TAG_file_type= 41 +// DW_TAG_friend = 42 +// DW_TAG_generic_subrange = 69 +// DW_TAG_immutable_type = 75 +// DW_TAG_module = 30 + +// Create a logical element and setup the following information: +// - Name, DWARF tag, line +// - Collect any file information +LVElement *LVIRReader::constructElement(const DINode *DN) { + dwarf::Tag Tag = DN->getTag(); + LVElement *Element = createElement(Tag); + if (Element) { +Element->setTag(Tag); +addMD(DN, Element); + +StringRef Name = getMDName(DN); +if (!Name.empty()) + Element->setName(Name); + +// Record any file information. +if (const DIFile *File = getMDFile(DN)) + getOrCreateSourceID(File); + } + + return Element; +} + +void LVIRReader::mapFortranLanguage(unsigned DWLang) { + switch (DWLang) { + case dwarf::DW_LANG_Fortran77: + case dwarf::DW_LANG_Fortran90: + case dwarf::DW_LANG_Fortran95: + case dwarf::DW_LANG_Fortran03: + case dwarf::DW_LANG_Fortran08: + case dwarf::DW_LANG_Fortran18: +LanguageIsFortran = true; +break; + default: +LanguageIsFortran = false; + } +} + +// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only +// difference is setting the 'DICompileUnit::splitDebugFilename' to the +// name of the split filename: "xxx.dwo". +bool LVIRReader::includeMinimalInlineScopes() const { + return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly; +} + +// For the given 'DIFile' generate an index 1-based to indicate the +// source file where the logical element is declared. +// In DWARF v4, the files are 1-indexed. +// In DWARF v5, the files are 0-indexed. +// The IR reader expects the indexes as 1-indexed. +// Each compile unit, keeps track of the last assigned index. +size_t LVIRReader::getOrCreateSourceID(const DIFile *File) { + if (!File) +return 0; + +#ifdef DEBUG_ALL + LLVM_DEBUG({ +dbgs() << "\n[getOrCreateSourceID] DIFile\n"; +File->dump(); + }); +#endif + + addMD(File, CompileUnit); + + LLVM_DEBUG({ +dbgs() << "Directory: '" << File->getDirectory() << "'\n"; +dbgs() << "Filename: '" << File->getFilename() << "'\n"; + }); + size_t FileIndex = 0; + LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File); + if (Iter == CompileUnitFiles.cend()) { +FileIndex = getFileIndex(CompileUnit); +std::string Directory(File->getDirectory()); +if (Directory.empty()) + Directory = std::string(CompileUnit->getCompilationDirectory()); + +std::string FullName; +raw_string_ostream Out(FullName); +Out << Directory << "/" << llvm::sys::path::filename(File->getFilename()); +CompileUnit->addFilename(transformPath(FullName)); +CompileUnitFiles.emplace(File, ++FileIndex); +updateFileIndex(CompileUnit, FileIndex); + } else { +FileIndex = Iter->second; + } + + LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; }); + return FileIndex; +} + +void LVIRReader::addSourceLine(LVElement *Element, unsigned Line, + const DIFile *File) { + if (Line == 0) +return; + + // After the scopes are created, the generic reader traverses the 'Children' + // and perform additional setting tasks (resolve types names, references, + // etc.). One of those tasks is select the correct string pool index based on + // the commmand line o
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
https://github.com/jmorse commented: Some initial comments; I made it to about 1000 lines into LVIRReader.cpp. https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,300 @@ +//===-- LVIRReader.h *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file defines the LVIRReader class, which is used to describe a +// LLVM IR reader. +// +//===--===// + +#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H +#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H + +#include "llvm/DebugInfo/LogicalView/Core/LVReader.h" +#include "llvm/Transforms/Utils/DebugSSAUpdater.h" + +namespace llvm { +class DIFile; +class DINode; +class DILocation; +class DIScope; +class DISubprogram; +class DIVariable; +class BasicBlock; + +namespace object { +class IRObjectFile; +} + +namespace logicalview { + +class LVElement; +class LVLine; +class LVScopeCompileUnit; +class LVSymbol; +class LVType; + +class LVIRReader final : public LVReader { + object::IRObjectFile *BitCodeIR = nullptr; + MemoryBufferRef *TextualIR = nullptr; + + // Symbols with locations for current compile unit. + LVSymbols SymbolsWithLocations; + + LVSectionIndex SectionIndex = 0; + + const DICompileUnit *CUNode = nullptr; + + // The Dwarf Version (from the module flags). + unsigned DwarfVersion; + + // Location index for global variables. + uint64_t PoolAddressIndex = 0; + + // Whether to emit all linkage names, or just abstract subprograms. + bool UseAllLinkageNames = true; + + // Dependencies on external options (llc, etc). + bool includeMinimalInlineScopes() const; + bool useAllLinkageNames() const { return UseAllLinkageNames; } + + bool LanguageIsFortran = false; + void mapFortranLanguage(unsigned DWLang); + bool moduleIsInFortran() const { return LanguageIsFortran; } + + // Generate logical debug line before prologue. + bool GenerateLineBeforePrologue = true; + + // We assume a constante increase between instructions. + const unsigned OffsetIncrease = 4; + void updateLineOffset() { CurrentOffset += OffsetIncrease; } + + // An anonymous type for index type. + LVType *NodeIndexType = nullptr; + + std::unique_ptr DbgValueRanges; + + // Record the last assigned file index for each compile unit. + using LVIndexFiles = std::map; + LVIndexFiles IndexFiles; jmorse wrote: It wasn't immediately obvious to me that this is a per-CU counter of the /next/ FileID to assign to a new file that gets discovered. I think it's worth recording in the comment here that this data-structure is to aid mapping DIFiles onto a DWARF-like file table, to explain what's going on. https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -728,14 +514,16 @@ void LVDWARFReader::createLineAndFileRecords( for (const DWARFDebugLine::FileNameEntry &Entry : Lines->Prologue.FileNames) { std::string Directory; - if (Lines->getDirectoryForEntry(Entry, Directory)) -Directory = transformPath(Directory); + Lines->getDirectoryForEntry(Entry, Directory); if (Directory.empty()) Directory = std::string(CompileUnit->getCompilationDirectory()); - std::string File = transformPath(dwarf::toStringRef(Entry.Name)); + // Take just the filename component, as it may be contain the full + // path that would be added to the already existing path in Directory. jmorse wrote: This is a functional change to the DWARF reader, is it related to adding the IRReader, or can we defer it to some other patch instead? https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,2348 @@ +//===-- LVIRReader.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This implements the LVIRReader class. +// It supports LLVM text IR and bitcode format. +// +//===--===// + +#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h" +#include "llvm/CodeGen/DebugHandlerBase.h" +#include "llvm/DebugInfo/LogicalView/Core/LVLine.h" +#include "llvm/DebugInfo/LogicalView/Core/LVScope.h" +#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h" +#include "llvm/DebugInfo/LogicalView/Core/LVType.h" +#include "llvm/IR/DebugInfoMetadata.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/Module.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Object/Error.h" +#include "llvm/Object/IRObjectFile.h" +#include "llvm/Support/FormatAdapters.h" +#include "llvm/Support/FormatVariadic.h" +#include "llvm/Support/SourceMgr.h" + +using namespace llvm; +using namespace llvm::object; +using namespace llvm::logicalview; + +#define DEBUG_TYPE "IRReader" + +// Extra debug traces. Default is false +#define DEBUG_ALL + +// These flavours of DINodes are not handled: +// DW_TAG_APPLE_property = 19896 +// DW_TAG_atomic_type = 71 +// DW_TAG_common_block = 26 +// DW_TAG_file_type= 41 +// DW_TAG_friend = 42 +// DW_TAG_generic_subrange = 69 +// DW_TAG_immutable_type = 75 +// DW_TAG_module = 30 + +// Create a logical element and setup the following information: +// - Name, DWARF tag, line +// - Collect any file information +LVElement *LVIRReader::constructElement(const DINode *DN) { + dwarf::Tag Tag = DN->getTag(); + LVElement *Element = createElement(Tag); + if (Element) { +Element->setTag(Tag); +addMD(DN, Element); + +StringRef Name = getMDName(DN); +if (!Name.empty()) + Element->setName(Name); + +// Record any file information. +if (const DIFile *File = getMDFile(DN)) + getOrCreateSourceID(File); + } + + return Element; +} + +void LVIRReader::mapFortranLanguage(unsigned DWLang) { + switch (DWLang) { + case dwarf::DW_LANG_Fortran77: + case dwarf::DW_LANG_Fortran90: + case dwarf::DW_LANG_Fortran95: + case dwarf::DW_LANG_Fortran03: + case dwarf::DW_LANG_Fortran08: + case dwarf::DW_LANG_Fortran18: +LanguageIsFortran = true; +break; + default: +LanguageIsFortran = false; + } +} + +// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only +// difference is setting the 'DICompileUnit::splitDebugFilename' to the +// name of the split filename: "xxx.dwo". +bool LVIRReader::includeMinimalInlineScopes() const { + return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly; +} + +// For the given 'DIFile' generate an index 1-based to indicate the +// source file where the logical element is declared. +// In DWARF v4, the files are 1-indexed. +// In DWARF v5, the files are 0-indexed. +// The IR reader expects the indexes as 1-indexed. +// Each compile unit, keeps track of the last assigned index. +size_t LVIRReader::getOrCreateSourceID(const DIFile *File) { + if (!File) +return 0; + +#ifdef DEBUG_ALL + LLVM_DEBUG({ +dbgs() << "\n[getOrCreateSourceID] DIFile\n"; +File->dump(); + }); +#endif + + addMD(File, CompileUnit); + + LLVM_DEBUG({ +dbgs() << "Directory: '" << File->getDirectory() << "'\n"; +dbgs() << "Filename: '" << File->getFilename() << "'\n"; + }); + size_t FileIndex = 0; + LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File); + if (Iter == CompileUnitFiles.cend()) { +FileIndex = getFileIndex(CompileUnit); +std::string Directory(File->getDirectory()); +if (Directory.empty()) + Directory = std::string(CompileUnit->getCompilationDirectory()); + +std::string FullName; +raw_string_ostream Out(FullName); +Out << Directory << "/" << llvm::sys::path::filename(File->getFilename()); +CompileUnit->addFilename(transformPath(FullName)); +CompileUnitFiles.emplace(File, ++FileIndex); +updateFileIndex(CompileUnit, FileIndex); + } else { +FileIndex = Iter->second; + } + + LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; }); + return FileIndex; +} + +void LVIRReader::addSourceLine(LVElement *Element, unsigned Line, + const DIFile *File) { + if (Line == 0) +return; + + // After the scopes are created, the generic reader traverses the 'Children' + // and perform additional setting tasks (resolve types names, references, + // etc.). One of those tasks is select the correct string pool index based on + // the commmand line o
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,300 @@ +//===-- LVIRReader.h *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file defines the LVIRReader class, which is used to describe a +// LLVM IR reader. +// +//===--===// + +#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H +#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H + +#include "llvm/DebugInfo/LogicalView/Core/LVReader.h" +#include "llvm/Transforms/Utils/DebugSSAUpdater.h" + +namespace llvm { +class DIFile; +class DINode; +class DILocation; +class DIScope; +class DISubprogram; +class DIVariable; +class BasicBlock; + +namespace object { +class IRObjectFile; +} + +namespace logicalview { + +class LVElement; +class LVLine; +class LVScopeCompileUnit; +class LVSymbol; +class LVType; + +class LVIRReader final : public LVReader { + object::IRObjectFile *BitCodeIR = nullptr; + MemoryBufferRef *TextualIR = nullptr; + + // Symbols with locations for current compile unit. + LVSymbols SymbolsWithLocations; + + LVSectionIndex SectionIndex = 0; + + const DICompileUnit *CUNode = nullptr; + + // The Dwarf Version (from the module flags). + unsigned DwarfVersion; + + // Location index for global variables. + uint64_t PoolAddressIndex = 0; + + // Whether to emit all linkage names, or just abstract subprograms. + bool UseAllLinkageNames = true; + + // Dependencies on external options (llc, etc). jmorse wrote: It's not clear to me what this means (more elaborated comment please). https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,2348 @@ +//===-- LVIRReader.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This implements the LVIRReader class. +// It supports LLVM text IR and bitcode format. +// +//===--===// + +#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h" +#include "llvm/CodeGen/DebugHandlerBase.h" +#include "llvm/DebugInfo/LogicalView/Core/LVLine.h" +#include "llvm/DebugInfo/LogicalView/Core/LVScope.h" +#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h" +#include "llvm/DebugInfo/LogicalView/Core/LVType.h" +#include "llvm/IR/DebugInfoMetadata.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/Module.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Object/Error.h" +#include "llvm/Object/IRObjectFile.h" +#include "llvm/Support/FormatAdapters.h" +#include "llvm/Support/FormatVariadic.h" +#include "llvm/Support/SourceMgr.h" + +using namespace llvm; +using namespace llvm::object; +using namespace llvm::logicalview; + +#define DEBUG_TYPE "IRReader" + +// Extra debug traces. Default is false +#define DEBUG_ALL + +// These flavours of DINodes are not handled: +// DW_TAG_APPLE_property = 19896 +// DW_TAG_atomic_type = 71 +// DW_TAG_common_block = 26 +// DW_TAG_file_type= 41 +// DW_TAG_friend = 42 +// DW_TAG_generic_subrange = 69 +// DW_TAG_immutable_type = 75 +// DW_TAG_module = 30 + +// Create a logical element and setup the following information: +// - Name, DWARF tag, line +// - Collect any file information +LVElement *LVIRReader::constructElement(const DINode *DN) { + dwarf::Tag Tag = DN->getTag(); + LVElement *Element = createElement(Tag); + if (Element) { +Element->setTag(Tag); +addMD(DN, Element); + +StringRef Name = getMDName(DN); +if (!Name.empty()) + Element->setName(Name); + +// Record any file information. +if (const DIFile *File = getMDFile(DN)) + getOrCreateSourceID(File); + } + + return Element; +} + +void LVIRReader::mapFortranLanguage(unsigned DWLang) { + switch (DWLang) { + case dwarf::DW_LANG_Fortran77: + case dwarf::DW_LANG_Fortran90: + case dwarf::DW_LANG_Fortran95: + case dwarf::DW_LANG_Fortran03: + case dwarf::DW_LANG_Fortran08: + case dwarf::DW_LANG_Fortran18: +LanguageIsFortran = true; +break; + default: +LanguageIsFortran = false; + } +} + +// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only +// difference is setting the 'DICompileUnit::splitDebugFilename' to the +// name of the split filename: "xxx.dwo". +bool LVIRReader::includeMinimalInlineScopes() const { + return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly; +} + +// For the given 'DIFile' generate an index 1-based to indicate the +// source file where the logical element is declared. +// In DWARF v4, the files are 1-indexed. +// In DWARF v5, the files are 0-indexed. +// The IR reader expects the indexes as 1-indexed. +// Each compile unit, keeps track of the last assigned index. +size_t LVIRReader::getOrCreateSourceID(const DIFile *File) { + if (!File) +return 0; + +#ifdef DEBUG_ALL + LLVM_DEBUG({ +dbgs() << "\n[getOrCreateSourceID] DIFile\n"; +File->dump(); + }); +#endif + + addMD(File, CompileUnit); + + LLVM_DEBUG({ +dbgs() << "Directory: '" << File->getDirectory() << "'\n"; +dbgs() << "Filename: '" << File->getFilename() << "'\n"; + }); + size_t FileIndex = 0; + LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File); + if (Iter == CompileUnitFiles.cend()) { +FileIndex = getFileIndex(CompileUnit); +std::string Directory(File->getDirectory()); +if (Directory.empty()) + Directory = std::string(CompileUnit->getCompilationDirectory()); + +std::string FullName; +raw_string_ostream Out(FullName); +Out << Directory << "/" << llvm::sys::path::filename(File->getFilename()); +CompileUnit->addFilename(transformPath(FullName)); +CompileUnitFiles.emplace(File, ++FileIndex); +updateFileIndex(CompileUnit, FileIndex); + } else { +FileIndex = Iter->second; + } + + LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; }); + return FileIndex; +} + +void LVIRReader::addSourceLine(LVElement *Element, unsigned Line, + const DIFile *File) { + if (Line == 0) +return; + + // After the scopes are created, the generic reader traverses the 'Children' + // and perform additional setting tasks (resolve types names, references, + // etc.). One of those tasks is select the correct string pool index based on + // the commmand line o
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -2124,6 +2125,138 @@ layout and given the number of matches. - Total 71 8 +IR (Textual representation and bitcode) SUPPORT +~~~ +The below example is used to show the IR output generated by +:program:`llvm-debuginfo-analyzer`. We compiled the example for a +IR 64-bit target with Clang (-O0 -g --target=x86_64-linux): + +.. code-block:: c++ + + 1 using INTPTR = const int *; + 2 int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) { + 3if (ParamBool) { + 4 typedef int INTEGER; + 5 const INTEGER CONSTANT = 7; + 6 return CONSTANT; + 7} + 8return ParamUnsigned; + 9 } + +PRINT BASIC DETAILS +^^^ +The following command prints basic details for all the logical elements +sorted by the debug information internal offset; it includes its lexical +level and debug info format. + +.. code-block:: none + + llvm-debuginfo-analyzer --attribute=level,format + --output-sort=offset + --print=scopes,symbols,types,lines,instructions + test-clang.ll + +or + +.. code-block:: none + + llvm-debuginfo-analyzer --attribute=level,format + --output-sort=offset + --print=elements + test-clang.ll + +Each row represents an element that is present within the debug +information. The first column represents the scope level, followed by +the associated line number (if any), and finally the description of +the element. + +.. code-block:: none + + Logical View: + [000] {File} 'test-clang.ll' -> Textual IR + + [001] {CompileUnit} 'test.cpp' + [002] 2 {Function} extern not_inlined 'foo' -> 'int' + [003] {Block} + [004] 5 {Variable} 'CONSTANT' -> 'const INTEGER' + [004] 5 {Line} + [004] {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32' + [004] 6 {Line} + [004] {Code} 'store i32 7, ptr %retval, align 4, !dbg !33' + [004] 6 {Line} + [004] {Code} 'br label %return, !dbg !33' + [003] 2 {Parameter} 'ParamPtr' -> 'INTPTR' + [003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int' + [003] 2 {Parameter} 'ParamBool' -> 'bool' + [003] 4 {TypeAlias} 'INTEGER' -> 'int' + [003] 2 {Line} + [003] {Code} '%retval = alloca i32, align 4' + [003] {Code} '%ParamPtr.addr = alloca ptr, align 8' + [003] {Code} '%ParamUnsigned.addr = alloca i32, align 4' + [003] {Code} '%ParamBool.addr = alloca i8, align 1' + [003] {Code} '%CONSTANT = alloca i32, align 4' + [003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8' + [003] {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4' + [003] {Code} '%storedv = zext i1 %ParamBool to i8' + [003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1' + [003] 8 {Line} + [003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34' + [003] 8 {Line} + [003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35' + [003] 8 {Line} + [003] {Code} 'br label %return, !dbg !35' + [003] 9 {Line} + [003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36' + [003] 9 {Line} + [003] {Code} 'ret i32 %2, !dbg !36' + [003] 3 {Line} + [003] 3 {Line} + [003] 3 {Line} + [003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26' + [002] 1 {TypeAlias} 'INTPTR' -> '* const int' + +SELECT LOGICAL ELEMENTS +^^^ +The following prints all *instructions*, *symbols* and *types* that +contain **'block'** or **'.store'** in their names or types, using a tab +layout and given the number of matches. + +.. code-block:: none jmorse wrote: I wonder if there's a more illustrative example to motivate the reader -- is it possible to search for "INTPTR" and discover both the type alias and the parameter? That's the sort of query I think I'd end up making: "I can see this type in the output format, but why is it present? -> Ah, it's a parameter to a function". (The current example is fine, I'm just trying to imagine a better one). https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,300 @@ +//===-- LVIRReader.h *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file defines the LVIRReader class, which is used to describe a +// LLVM IR reader. +// +//===--===// + +#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H +#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H + +#include "llvm/DebugInfo/LogicalView/Core/LVReader.h" +#include "llvm/Transforms/Utils/DebugSSAUpdater.h" + +namespace llvm { +class DIFile; +class DINode; +class DILocation; +class DIScope; +class DISubprogram; +class DIVariable; +class BasicBlock; + +namespace object { +class IRObjectFile; +} + +namespace logicalview { + +class LVElement; +class LVLine; +class LVScopeCompileUnit; +class LVSymbol; +class LVType; + +class LVIRReader final : public LVReader { + object::IRObjectFile *BitCodeIR = nullptr; + MemoryBufferRef *TextualIR = nullptr; + + // Symbols with locations for current compile unit. + LVSymbols SymbolsWithLocations; + + LVSectionIndex SectionIndex = 0; + + const DICompileUnit *CUNode = nullptr; + + // The Dwarf Version (from the module flags). + unsigned DwarfVersion; + + // Location index for global variables. + uint64_t PoolAddressIndex = 0; + + // Whether to emit all linkage names, or just abstract subprograms. + bool UseAllLinkageNames = true; + + // Dependencies on external options (llc, etc). + bool includeMinimalInlineScopes() const; + bool useAllLinkageNames() const { return UseAllLinkageNames; } + + bool LanguageIsFortran = false; + void mapFortranLanguage(unsigned DWLang); + bool moduleIsInFortran() const { return LanguageIsFortran; } + + // Generate logical debug line before prologue. + bool GenerateLineBeforePrologue = true; + + // We assume a constante increase between instructions. + const unsigned OffsetIncrease = 4; + void updateLineOffset() { CurrentOffset += OffsetIncrease; } + + // An anonymous type for index type. + LVType *NodeIndexType = nullptr; + + std::unique_ptr DbgValueRanges; + + // Record the last assigned file index for each compile unit. + using LVIndexFiles = std::map; + LVIndexFiles IndexFiles; + + void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) { +LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit); +if (Iter == IndexFiles.end()) + IndexFiles.emplace(CompileUnit, FileIndex); +else + Iter->second = FileIndex; + } + + // Get the current assigned index file for the given compile unit. + size_t getFileIndex(LVScopeCompileUnit *CompileUnit) { +size_t FileIndex = 0; +LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit); +if (Iter != IndexFiles.end()) + FileIndex = Iter->second; +return FileIndex; + } + + // Collect the compile unit metadata files. + using LVCompileUnitFiles = std::map; + LVCompileUnitFiles CompileUnitFiles; + + size_t getOrCreateSourceID(const DIFile *File); + + // Associate the logical elements to metadata objects. + using LVMDObjects = std::map; + LVMDObjects MDObjects; + + void addMD(const MDNode *MD, LVElement *Element) { +if (MDObjects.find(MD) == MDObjects.end()) + MDObjects.emplace(MD, Element); + } + LVElement *getElementForSeenMD(const MDNode *MD) const { +LVMDObjects::const_iterator Iter = MDObjects.find(MD); +return Iter != MDObjects.end() ? Iter->second : nullptr; + } + LVScope *getScopeForSeenMD(const MDNode *MD) const { +return static_cast(getElementForSeenMD(MD)); + } + LVSymbol *getSymbolForSeenMD(const MDNode *MD) const { +return static_cast(getElementForSeenMD(MD)); + } + LVType *getTypeForSeenMD(const MDNode *MD) const { +return static_cast(getElementForSeenMD(MD)); + } + LVType *getLineForSeenMD(const MDNode *MD) const { +return static_cast(getElementForSeenMD(MD)); + } + + // Inlined concrete scopes with its associated inlined abstract scopes. jmorse wrote: ```suggestion // Inlined concrete scopes mapped to the associated inlined abstract scopes. ``` https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -0,0 +1,2348 @@ +//===-- LVIRReader.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This implements the LVIRReader class. +// It supports LLVM text IR and bitcode format. +// +//===--===// + +#include "llvm/DebugInfo/LogicalView/Readers/LVIRReader.h" +#include "llvm/CodeGen/DebugHandlerBase.h" +#include "llvm/DebugInfo/LogicalView/Core/LVLine.h" +#include "llvm/DebugInfo/LogicalView/Core/LVScope.h" +#include "llvm/DebugInfo/LogicalView/Core/LVSymbol.h" +#include "llvm/DebugInfo/LogicalView/Core/LVType.h" +#include "llvm/IR/DebugInfoMetadata.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/Module.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Object/Error.h" +#include "llvm/Object/IRObjectFile.h" +#include "llvm/Support/FormatAdapters.h" +#include "llvm/Support/FormatVariadic.h" +#include "llvm/Support/SourceMgr.h" + +using namespace llvm; +using namespace llvm::object; +using namespace llvm::logicalview; + +#define DEBUG_TYPE "IRReader" + +// Extra debug traces. Default is false +#define DEBUG_ALL + +// These flavours of DINodes are not handled: +// DW_TAG_APPLE_property = 19896 +// DW_TAG_atomic_type = 71 +// DW_TAG_common_block = 26 +// DW_TAG_file_type= 41 +// DW_TAG_friend = 42 +// DW_TAG_generic_subrange = 69 +// DW_TAG_immutable_type = 75 +// DW_TAG_module = 30 + +// Create a logical element and setup the following information: +// - Name, DWARF tag, line +// - Collect any file information +LVElement *LVIRReader::constructElement(const DINode *DN) { + dwarf::Tag Tag = DN->getTag(); + LVElement *Element = createElement(Tag); + if (Element) { +Element->setTag(Tag); +addMD(DN, Element); + +StringRef Name = getMDName(DN); +if (!Name.empty()) + Element->setName(Name); + +// Record any file information. +if (const DIFile *File = getMDFile(DN)) + getOrCreateSourceID(File); + } + + return Element; +} + +void LVIRReader::mapFortranLanguage(unsigned DWLang) { + switch (DWLang) { + case dwarf::DW_LANG_Fortran77: + case dwarf::DW_LANG_Fortran90: + case dwarf::DW_LANG_Fortran95: + case dwarf::DW_LANG_Fortran03: + case dwarf::DW_LANG_Fortran08: + case dwarf::DW_LANG_Fortran18: +LanguageIsFortran = true; +break; + default: +LanguageIsFortran = false; + } +} + +// Looking at IR generated with the '-gdwarf -gsplit-dwarf=split' the only +// difference is setting the 'DICompileUnit::splitDebugFilename' to the +// name of the split filename: "xxx.dwo". +bool LVIRReader::includeMinimalInlineScopes() const { + return getCUNode()->getEmissionKind() == DICompileUnit::LineTablesOnly; +} + +// For the given 'DIFile' generate an index 1-based to indicate the +// source file where the logical element is declared. +// In DWARF v4, the files are 1-indexed. +// In DWARF v5, the files are 0-indexed. +// The IR reader expects the indexes as 1-indexed. +// Each compile unit, keeps track of the last assigned index. +size_t LVIRReader::getOrCreateSourceID(const DIFile *File) { + if (!File) +return 0; + +#ifdef DEBUG_ALL + LLVM_DEBUG({ +dbgs() << "\n[getOrCreateSourceID] DIFile\n"; +File->dump(); + }); +#endif + + addMD(File, CompileUnit); + + LLVM_DEBUG({ +dbgs() << "Directory: '" << File->getDirectory() << "'\n"; +dbgs() << "Filename: '" << File->getFilename() << "'\n"; + }); + size_t FileIndex = 0; + LVCompileUnitFiles::iterator Iter = CompileUnitFiles.find(File); + if (Iter == CompileUnitFiles.cend()) { +FileIndex = getFileIndex(CompileUnit); +std::string Directory(File->getDirectory()); +if (Directory.empty()) + Directory = std::string(CompileUnit->getCompilationDirectory()); + +std::string FullName; +raw_string_ostream Out(FullName); +Out << Directory << "/" << llvm::sys::path::filename(File->getFilename()); +CompileUnit->addFilename(transformPath(FullName)); +CompileUnitFiles.emplace(File, ++FileIndex); +updateFileIndex(CompileUnit, FileIndex); + } else { +FileIndex = Iter->second; + } + + LLVM_DEBUG({ dbgs() << "FileIndex: " << FileIndex << "\n"; }); + return FileIndex; +} + +void LVIRReader::addSourceLine(LVElement *Element, unsigned Line, + const DIFile *File) { + if (Line == 0) +return; + + // After the scopes are created, the generic reader traverses the 'Children' + // and perform additional setting tasks (resolve types names, references, + // etc.). One of those tasks is select the correct string pool index based on + // the commmand line o
[llvm-branch-commits] [llvm] [llvm-debuginfo-analyzer] Add support for LLVM IR format. (PR #135440)
@@ -56,6 +61,17 @@ Error LVReaderHandler::createReader(StringRef Filename, LVReaders &Readers, return std::make_unique(Filename, FileFormatName, Pdb, W, ExePath); } +if (isa(Input)) { + IRObjectFile *Ir = cast(Input); + return std::make_unique(Filename, FileFormatName, Ir, W); +} +if (isa(Input)) { + // If the filename extension is '.ll' create a IR reader. jmorse wrote: ```suggestion // If the filename extension is '.ll' create an IR reader. ``` https://github.com/llvm/llvm-project/pull/135440 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits