dblaikie added inline comments.
================
Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295
+ EntryIterator Absolute =
+ getAbsoluteLocations(
+ SectionedAddress{BaseAddr, SectionedAddress::UndefSection},
+ LookupPooledAddress)
+ .begin();
----------------
labath wrote:
> dblaikie wrote:
> > labath wrote:
> > > This parallel iteration is not completely nice, but I think it's worth
> > > being able to reuse the absolute range computation code. I'm open to
> > > ideas for improvement though.
> > Ah, I see - this is what you meant about "In particular it makes it
> > possible to reuse this stuff in the dumping code, which would have been
> > pretty hard with callbacks.".
> >
> > I'm wondering if that might be worth revisiting somewhat. A full iterator
> > abstraction for one user here (well, two once you include lldb - but I
> > assume it's likely going to build its own data structure from the iteration
> > anyway, right? (it's not going to keep the iterator around, do anything
> > interesting like partial iterations, re-iterate/etc - such that a callback
> > would suffice))
> >
> > I could imagine two callback APIs for this - one that gets entries and
> > locations and one that only gets locations by filtering on the entry
> > version.
> >
> > eg:
> >
> > // for non-verbose output:
> > LL.forEachEntry([&](const Entry &E, Expected<DWARFLocation> L) {
> > if (Verbose && actually dumping debug_loc)
> > print(E) // print any LLE_*, raw parameters, etc
> > if (L)
> > print(*L) // print the resulting address range, section name (if
> > verbose),
> > else
> > print(error stuff)
> > });
> >
> > One question would be "when/where do we print the DWARF expression" - if
> > there's an error computing the address range, we can still print the
> > expression, so maybe that happens unconditionally at the end of the
> > callback, using the expression in the Entry? (then, arguably, the
> > expression doesn't need to be in the DWARFLocation - and I'd say make the
> > DWARFLocation a sectioned range, exactly the same type as for ranges so
> > that part of the dumping code, etc, can be maximally reused)
> Actually, what lldb currently does is that it does not build any data
> structures at all (except storing the pointer to the right place in the
> debug_loc section. Then, whenever it wants to do something to the loclist, it
> parses it afresh. I don't know why it does this exactly, but I assume it has
> something to do with most locations never being used, or being only a couple
> of times, and the actual parsing being fairly fast. What this means is that
> lldb is not really a single "user", but there are like four or five places
> where it iterates through the list, depending on what does it actually want
> to do with it. It also does partial iteration where it stops as soon as it
> find the entry it was interested in.
> Now, all of that is possible with a callback (though I am generally trying to
> avoid them), but it does resurface the issue of what should be the value of
> the second argument for DW_LLE_base_address entries (the thing which I
> originally used a error type for).
> Maybe this should be actually one callback API, taking two callback
> functions, with one of them being invoked for base_address entries, and one
> for others? However, if we stick to the current approaches in both LLE and
> RLE of making the address pool resolution function a parameter (which I'd
> like to keep, as it makes my job in lldb easier), then this would actually be
> three callbacks, which starts to get unwieldy. Though one of those callbacks
> could be removed with the "DWARFUnit implementing a AddrOffsetResolver
> interface" idea, which I really like. :)
Ah, thanks for the details on LLDB's location parsing logic. That's interesting
indeed!
I can appreciate an iterator-based API if that's the sort of usage we've got,
though I expect it doesn't have any interest in the low-level encoding & just
wants the fully processed address ranges/locations - it doesn't want
base_address or end_of_list entries? & I think the dual-iteration is a fairly
awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid
that if reasonably possible.
Either having an iterator API that gives only the fully processed data/semantic
view & a completely different API if you want to access the low level
primitives (LLE, etc) (this is how ranges works - there's an API that gives a
collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly
motivated by a strong multi-client need for that functionality for symbolizing,
etc - but I think it's a good abstraction/model anyway (& one of the reasons
the inline range list printing doesn't include encoding information, the API it
uses is too high level to even have access to it))
> Now, all of that is possible with a callback (though I am generally trying to
> avoid them), but it does resurface the issue of what should be the value of
> the second argument for DW_LLE_base_address entries (the thing which I
> originally used a error type for).
Sorry, my intent in the above API was for the second argument to be Optional's
"None" state when... oh, I see, I did use Expected there, rather than Optional,
because there are legit error cases.
I know it's sort of awkward, but I might be inclined to use
Optional<Expected<AddressRange>> there. I realize two layers of wrapping is a
bit weird, but I think it'd be nicer than having an error state for what, I
think, isn't erroneous.
> Maybe this should be actually one callback API, taking two callback
> functions, with one of them being invoked for base_address entries, and one
> for others? However, if we stick to the current approaches in both LLE and
> RLE of making the address pool resolution function a parameter (which I'd
> like to keep, as it makes my job in lldb easier), then this would actually be
> three callbacks, which starts to get unwieldy.
Don't mind three callbacks too much.
> Though one of those callbacks could be removed with the "DWARFUnit
> implementing a AddrOffsetResolver interface" idea, which I really like. :)
Sorry, I haven't really looked at where the address resolver callback is
registered and alternative designs being discussed - but yeah, going off just
the one-sentence, it seems reasonable to have the DWARFUnit own an address
resolver/be the thing you consult when you want to resolve an address (just
through a normal function call in DWARFUnit, perhaps - which might, internally,
use a callback registered when it was constructed).
================
Comment at: test/CodeGen/X86/debug-loclists.ll:16
; CHECK-NEXT: 0x00000000:
-; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0
-; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0
-
-; There is no way to use llvm-dwarfdump atm (2018, october) to verify the
DW_LLE_* codes emited,
-; because dumper is not yet implements that. Use asm code to do this check
instead.
-;
-; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck %s
--check-prefix=ASM
-; ASM: .section .debug_loclists,"",@progbits
-; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 #
Length
-; ASM-NEXT: .Ldebug_loclist_table_start0:
-; ASM-NEXT: .short 5 # Version
-; ASM-NEXT: .byte 8 # Address size
-; ASM-NEXT: .byte 0 # Segment selector size
-; ASM-NEXT: .long 0 # Offset entry count
-; ASM-NEXT: .Lloclists_table_base0:
-; ASM-NEXT: .Ldebug_loc0:
-; ASM-NEXT: .byte 4 # DW_LLE_offset_pair
-; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset
-; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset
-; ASM-NEXT: .byte 2 # Loc expr size
-; ASM-NEXT: .byte 117 # DW_OP_breg5
-; ASM-NEXT: .byte 0 # 0
-; ASM-NEXT: .byte 4 # DW_LLE_offset_pair
-; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset
-; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset
-; ASM-NEXT: .byte 2 # Loc expr size
-; ASM-NEXT: .byte 115 # DW_OP_breg3
-; ASM-NEXT: .byte 0 # 0
-; ASM-NEXT: .byte 0 # DW_LLE_end_of_list
-; ASM-NEXT: .Ldebug_loclist_table_end0:
+; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000004
=> [0x0000000000000000, 0x0000000000000004) DW_OP_breg5 RDI+0
+; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000004, 0x0000000000000012
=> [0x0000000000000004, 0x0000000000000012) DW_OP_breg3 RBX+0
----------------
labath wrote:
> dblaikie wrote:
> > labath wrote:
> > > This tries to follow the RLE format as closely as possible, but I think
> > > something like
> > > ```
> > > [DW_LLE_offset_pair, 0x0000000000000000, 0x0000000000000004] =>
> > > [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0
> > > ```
> > > would make more sense (both here and for RLE).
> > Yep, that'd make more sense to me - are you planning to unify the codepaths
> > for this? I think that'd be for the best.
> >
> > If I were picking a printing from scratch, I might go with:
> >
> > DW_LLE_offset_pair(0x0000, 0x0004) => [0x0000, 0x0004): DW_OP_breg5 RDI+0
> >
> > Making it look a bit more like a function call and function arguments.
> > Though the () might be confusing with the range notation.
> >
> > I'm also undecided on the " => " separator. Whether a ':' might be
> > better/fine, etc.
> >
> > Totally open to ideas, but mostly I'd really love these to use loclist and
> > ranges to use the same code as much as possible, so we can get consistency
> > and any readability benefits, etc in both.
> I like the function call format. I hoping to get some code reuse, though it's
> still not fully clear to me how to achieve that..
I've posted my unification of range/loc/v4/v5 emission here:
https://reviews.llvm.org/D68620 - & I'd imagine something similar in the
parsing side.
================
Comment at: test/DebugInfo/X86/dwarfdump-debug-loclists.test:7
# CHECK: DW_AT_location [DW_FORM_sec_offset] (0x0000000c
-# CHECK-NEXT: [0x0000000000000010, 0x0000000000000020): DW_OP_breg5 RDI+0
-# CHECK-NEXT: [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8,
DW_OP_deref
-# CHECK-NEXT: [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0
+# CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000,
0x0000000000000010 => [0x0000000000000010, 0x0000000000000020) DW_OP_breg5 RDI+0
+# CHECK-NEXT: [DW_LLE_base_address ]: 0x0000000000000500
----------------
labath wrote:
> dblaikie wrote:
> > I don't think the inline dumping should print the encoding - I'd borrow a
> > lot from/try to unify with the ranges printing, which doesn't. I think
> > verbose ranges print the same as non-verbose except they also add the
> > section name/number.
> Sure, I can do that, though I think that means there won't be a single place
> where one can see both the raw encodings and their interpretation --
> section-based dumping will not show the interpretation (would you want me to
> show still show them I they happen to be interpretable without the base
> address or the address pool?), and the debug_info dumping will not show the
> encoding. Is that bad? -- I don't know...
Fair - that comes back to the issue I mentioned in a previous comment about
potentially limiting dumping of non-debug_info sections based on the presence
of a CU that references it (& only dumping it that way, rather than trying to
parse it without a CU). DWARF isn't really designed to be parsed without the CU
anyway. (could leave it in as best-effort to parse things without a referencing
CU for debugging, etc).
Mostly I'm interested in unification perhaps more/primarily, than feature
improvements - then we can make feature improvements to both ranges and locs
without having to duplicate things.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68270/new/
https://reviews.llvm.org/D68270
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits