[Lldb-commits] [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry

David Blaikie via Phabricator via lldb-commits Wed, 09 Oct 2019 08:46:34 -0700

dblaikie added a comment.

In D68270#1700108 <https://reviews.llvm.org/D68270#1700108>, @probinson wrote:


> Do we care whether llvm-dwarfdump's output bears any similarities to the 
> output from GNU readelf or objdump?  There has been a push lately to get the 
> LLVM "binutils" to behave more like GNU's, although AFAIK it hasn't gotten to 
> the DWARF dumping part.


Generally I hope not to deal with that until there's a user with a need for it 
who wants to do the work & has a specific use-case that can help motivate which 
similarities are desirable and which ones don't matter (& perhaps if there's 
enough that they start to tradeoff usability - maybe the "compatibility mode" 
is a separate tool or separate flag to the existing tool).

My broader hope is probably that llvm-dwarfdump is more for interactive uses 
than other dumpers, so fewer people might try to build automated things on top 
of it & thus expect specific output (this gives us both the freedom not to 
match the GNU tools, and the freedom not to match previous llvm-dwarfdump 
behavior (which we've done a fair bit in the past - which seems to support the 
theory that people don't seem to be building much on top of this))



================
Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295
+  EntryIterator Absolute =
+      getAbsoluteLocations(
+          SectionedAddress{BaseAddr, SectionedAddress::UndefSection},
+          LookupPooledAddress)
+          .begin();
----------------
labath wrote:
> dblaikie wrote:
> > labath wrote:
> > > dblaikie wrote:
> > > > labath wrote:
> > > > > This parallel iteration is not completely nice, but I think it's 
> > > > > worth being able to reuse the absolute range computation code. I'm 
> > > > > open to ideas for improvement though.
> > > > Ah, I see - this is what you meant about "In particular it makes it 
> > > > possible to reuse this stuff in the dumping code, which would have been 
> > > > pretty hard with callbacks.".
> > > > 
> > > > I'm wondering if that might be worth revisiting somewhat. A full 
> > > > iterator abstraction for one user here (well, two once you include lldb 
> > > > - but I assume it's likely going to build its own data structure from 
> > > > the iteration anyway, right? (it's not going to keep the iterator 
> > > > around, do anything interesting like partial iterations, re-iterate/etc 
> > > > - such that a callback would suffice))
> > > > 
> > > > I could imagine two callback APIs for this - one that gets entries and 
> > > > locations and one that only gets locations by filtering on the entry 
> > > > version.
> > > > 
> > > > eg:
> > > > 
> > > >   // for non-verbose output:
> > > >   LL.forEachEntry([&](const Entry &E, Expected<DWARFLocation> L) {
> > > >     if (Verbose && actually dumping debug_loc)
> > > >       print(E) // print any LLE_*, raw parameters, etc
> > > >     if (L)
> > > >       print(*L) // print the resulting address range, section name (if 
> > > > verbose), 
> > > >     else
> > > >       print(error stuff)
> > > >   });
> > > > 
> > > > One question would be "when/where do we print the DWARF expression" - 
> > > > if there's an error computing the address range, we can still print the 
> > > > expression, so maybe that happens unconditionally at the end of the 
> > > > callback, using the expression in the Entry? (then, arguably, the 
> > > > expression doesn't need to be in the DWARFLocation - and I'd say make 
> > > > the DWARFLocation a sectioned range, exactly the same type as for 
> > > > ranges so that part of the dumping code, etc, can be maximally reused)
> > > Actually, what lldb currently does is that it does not build any data 
> > > structures at all (except storing the pointer to the right place in the 
> > > debug_loc section. Then, whenever it wants to do something to the 
> > > loclist, it parses it afresh. I don't know why it does this exactly, but 
> > > I assume it has something to do with most locations never being used, or 
> > > being only a couple of times, and the actual parsing being fairly fast. 
> > > What this means is that lldb is not really a single "user", but there are 
> > > like four or five places where it iterates through the list, depending on 
> > > what does it actually want to do with it. It also does partial iteration 
> > > where it stops as soon as it find the entry it was interested in.
> > > Now, all of that is possible with a callback (though I am generally 
> > > trying to avoid them), but it does resurface the issue of what should be 
> > > the value of the second argument for DW_LLE_base_address entries (the 
> > > thing which I originally used a error type for).
> > > Maybe this should be actually one callback API, taking two callback 
> > > functions, with one of them being invoked for base_address entries, and 
> > > one for others? However, if we stick to the current approaches in both 
> > > LLE and RLE of making the address pool resolution function a parameter 
> > > (which I'd like to keep, as it makes my job in lldb easier), then this 
> > > would actually be three callbacks, which starts to get unwieldy. Though 
> > > one of those callbacks could be removed with the "DWARFUnit implementing 
> > > a AddrOffsetResolver interface" idea, which I really like. :)
> > Ah, thanks for the details on LLDB's location parsing logic. That's 
> > interesting indeed!
> > 
> > I can appreciate an iterator-based API if that's the sort of usage we've 
> > got, though I expect it doesn't have any interest in the low-level encoding 
> > & just wants the fully processed address ranges/locations - it doesn't want 
> > base_address or end_of_list entries? & I think the dual-iteration is a 
> > fairly awkward API design, trying to iterate them in lock-step, etc. I'd 
> > rather avoid that if reasonably possible.
> > 
> > Either having an iterator API that gives only the fully processed 
> > data/semantic view & a completely different API if you want to access the 
> > low level primitives (LLE, etc) (this is how ranges works - there's an API 
> > that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - 
> > though that's partly motivated by a strong multi-client need for that 
> > functionality for symbolizing, etc - but I think it's a good 
> > abstraction/model anyway (& one of the reasons the inline range list 
> > printing doesn't include encoding information, the API it uses is too high 
> > level to even have access to it))
> > 
> > > Now, all of that is possible with a callback (though I am generally 
> > > trying to avoid them), but it does resurface the issue of what should be 
> > > the value of the second argument for DW_LLE_base_address entries (the 
> > > thing which I originally used a error type for).
> > 
> > Sorry, my intent in the above API was for the second argument to be 
> > Optional's "None" state when... oh, I see, I did use Expected there, rather 
> > than Optional, because there are legit error cases.
> > 
> > I know it's sort of awkward, but I might be inclined to use 
> > Optional<Expected<AddressRange>> there. I realize two layers of wrapping is 
> > a bit weird, but I think it'd be nicer than having an error state for what, 
> > I think, isn't erroneous.
> > 
> > > Maybe this should be actually one callback API, taking two callback 
> > > functions, with one of them being invoked for base_address entries, and 
> > > one for others? However, if we stick to the current approaches in both 
> > > LLE and RLE of making the address pool resolution function a parameter 
> > > (which I'd like to keep, as it makes my job in lldb easier), then this 
> > > would actually be three callbacks, which starts to get unwieldy.
> > 
> > Don't mind three callbacks too much.
> > 
> > > Though one of those callbacks could be removed with the "DWARFUnit 
> > > implementing a AddrOffsetResolver interface" idea, which I really like. :)
> > 
> > Sorry, I haven't really looked at where the address resolver callback is 
> > registered and alternative designs being discussed - but yeah, going off 
> > just the one-sentence, it seems reasonable to have the DWARFUnit own an 
> > address resolver/be the thing you consult when you want to resolve an 
> > address (just through a normal function call in DWARFUnit, perhaps - which 
> > might, internally, use a callback registered when it was constructed).
> > I know it's sort of awkward, but I might be inclined to use 
> > Optional<Expected<AddressRange>> there. I realize two layers of wrapping is 
> > a bit weird, but I think it'd be nicer than having an error state for what, 
> > I think, isn't erroneous.
> Actually, my very first attempt at this patch used an 
> `Expected<Optional<Whatever>>`, but then I scrapped it because I didn't think 
> you'd like it. It's not the friendliest of APIs, but I think we can go with 
> that.
> 
> > Sorry, I haven't really looked at where the address resolver callback is 
> > registered and alternative designs being discussed - but yeah, going off 
> > just the one-sentence, it seems reasonable to have the DWARFUnit own an 
> > address resolver/be the thing you consult when you want to resolve an 
> > address (just through a normal function call in DWARFUnit, perhaps - which 
> > might, internally, use a callback registered when it was constructed).
> 
> I think you got that backwards. I don't want the DWARFUnit to be the source 
> of truth for address pool resolutions, as that would make it hard to use from 
> lldb (it's far from ready to start using the llvm version right now). What I 
> wanted was to replace the lambda/function_ref with a single-method interface. 
> Then both DWARFUnits could implement that interface so that passing a 
> DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as 
> anyone could implement that interface, just like anyone can write a lambda).
As for Expected<Optional<Whatever>> (or Optional<Expected<>>) - yeah, I think 
this is a non-obvious API (both the general problem and this specific 
solution). I think it's probably worth discussing this design a bit more to 
save you time writing/rewriting things a bit. I guess there are a few layers of 
failure here.

There's the possibility that the iteration itself could fail - even for 
debug_loc style lists (if we reached the end of the section before encountering 
a terminating {0,0}). That would suggest a fallible iterator idiom: 
http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges

But then, yes, when looking at the "processed"/semantic view, that could fail 
too in the case of an invalid address index, etc.

The generic/processed/abstracted-over-ranges-and-rnglists API for ranges 
produces a fully computer vector (& then returns Expected<vector> of that 
range) - is that reasonable? (this does mean manifesting a whole location in 
memory, which may not be needed so I could understand avoiding that even 
without fully implementing & demonstrating the vector solution is inadequate).

But I /think/ maybe the we could/should have two APIs - one generic API that 
abstracts over loc/loclists and only provides the fully processed view, and 
another that is type specific for dumping the underlying representation (only 
used in dumping debug_loclists). 


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68270/new/

https://reviews.llvm.org/D68270



_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry

Reply via email to