[Bug debuginfod/25509] Break a cyclic dependency by core packages

2020-06-25 Thread mark at klomp dot org via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=25509

--- Comment #17 from Mark Wielaard  ---
Just to be clear, the current setup is:

--enable-debuginfod or --enable-debuginfod=yes: builds all debuginfod
server/client artifacts (requires libcurl)
--disable-debuginfod or --enable-debuginfod=no: builds none of the debuginfod
server/client artifacts.

The proposed patch introduces:

--enable-libdebuginfod or --enable-libdebuginfod=yes: builds the debuginfod
client artifacts
--disable-libdebuginfod or --enable-libdebuginfod=no: don't build the
debuginfod client artifacts
--enable-libdebuginfod=dummy: builds all debuginfod client artifacts, but
libdebuginfod.so is just a stub (it does not depend on libcurl).
--enable-debuginfod or --enable-debuginfod=yes: builds all debuginfod server
artifacts (requires either --enable-libdebuginfod=yes or
--enable-libdebuginfod=dummy and sqlite3, microhttpd, libarchive)
--disable-debuginfod or --enable-debuginfod=no: builds none of the debuginfod
server artifacts.

I am hoping that helps both the Suse use case, which would like a
bootstrap/dummy libdebuginfod.so (--enable-libdebuginfod=dummy
--disable-debuginfod), and the Arch use case, which is to have only the
client library but not the debuginfod server (--enable-libdebuginfod
--disable-debuginfod).
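
Concretely, the two distro use cases would configure like this (a sketch; the option spellings are the ones proposed above):

```shell
# Suse: bootstrap/dummy client library, no libcurl, no server.
./configure --enable-libdebuginfod=dummy --disable-debuginfod

# Arch: real client library (needs libcurl), but no server.
./configure --enable-libdebuginfod --disable-debuginfod
```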

I admit that having the combination --enable-libdebuginfod=dummy and
--enable-debuginfod is somewhat redundant/non-sensical, but it helps with
(build time) testing. Any other testing matrix would imho be just as
complicated (you'd get extra install flags or need to set up compile-time
or runtime environment variables).

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[Bug debuginfod/25509] Break a cyclic dependency by core packages

2020-06-25 Thread mliska at suse dot cz via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=25509

--- Comment #18 from Martin Liska  ---
> I am hoping that helps both the Suse use case which would like a
> bootstrap/dummy libdebuginfod.so (--enable-libdebuginfod=dummy
> --disable-debuginfod)

Yes, I can confirm the suggested scenario will work for us.
Thanks for working on that!


Re: location list

2020-06-25 Thread Mark Wielaard
Hi Sasha,

On Tue, 2020-06-23 at 16:34 +, Sasha Da Rocha Pinheiro wrote:
> Since we are now using not only executables and .so, but ".o" files
> too, I'm trying to decide if I can use the same functions for all of
> them, like the code you pointed out to deal with ".o". Would that
> work for EXEC, SHARED, and RELOC?

Yes, it would work. The relocation resolving logic only triggers for .o
(ET_REL) files. But everything else works as expected also for ET_EXEC
and ET_DYN files.

> The idea is to not have two code paths to parse modules and DIEs,
> because as you pointed out ".o" files need some relocations to be
> performed, and therefore use dwfl_*, while for executables and .so
> files we only use dwarf_* functions.
> In light of that, do you foresee bigger changes or things we should
> worry about if we were to use only dwfl_* to open all the ELF files
> with dwarf data, and drop the way we used to open them? Because our
> code base has for a long time used only the dwarf_* functions, this
> would be a big change.

The real "value" of the Dwfl interface comes from it trying to lay out
objects as if dynamically loaded. So you can mimic a process even if it
isn't loaded (or a kernel plus modules). This is why some functions
return a "bias" indicating the difference between the Dwfl_Module
"assigned" addresses and any addresses you might read directly from the
Elf or Dwarf. But you can of course ignore that functionality and just
treat each object file independently.

Besides resolving those relocations for ET_REL files, Dwfl also
provides various (default/standard) callbacks to find and associate
separate debuginfo with an Elf file. See Dwfl_Callbacks and the "Standard
callbacks" in the libdwfl.h file. If you do use it, it might
override/change some search paths for where to get the Dwarf data/file
from. Again, you can choose not to use this functionality if you don't
like it.
(Dwfl also works when you provide it the Dwarf data files directly.)

Just look at what you need/want.

Cheers,

Mark


[Bug tools/26043] eu-addr2line debuginfo-path option with relative path doesn't find file:line information

2020-06-25 Thread mark at klomp dot org via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=26043

--- Comment #8 from Mark Wielaard  ---
(In reply to devel.origin from comment #7)
> For that demo the debug files are ending up in a custom directory (the
> build-id-demo/.build-id/) for the package builder to pick up. So they can't
> be found by GDB. When packages are installed the debug info is naturally in
> /usr/lib/debug.
> 
> To make GDB work when launching a developer build from the build
> directory the build script puts a copy of debug info of all libraries into
> one .build-id/ on the top level of the build directory (I haven't added this
> part to the testcase project). In .gdbinit there is an option to look in
> current dir: "set debug-file-directory /usr/lib/debug:.", so GDB finds debug
> files if the current dir has a .build-id/ subdir.

Thanks. I have a clear picture now.

I think it would make sense to change the default search for .build-id based
files from only checking absolute paths to also including relative paths. I am
just pondering how to prevent an "explosion" of extra stats/checks.
With the default setting it would add 2 extra stat calls (which would normally
always fail). Maybe that isn't too bad?


Re: Can dwarf_getscopes{,_die} performance be improved?

2020-06-25 Thread Mark Wielaard
Hi Milian,

On Mon, 2020-06-22 at 10:29 +0200, Milian Wolff wrote:
> On Montag, 15. Juni 2020 18:54:41 CEST Josh Stone wrote:
> > On 6/13/20 10:40 AM, Milian Wolff wrote:
> > > Does anyone have an idea on how to post-process the DWARF data to
> > > optimize the lookup of inlined frames?
> > 
> > SystemTap implements its own cache for repeated lookups -- see
> > dwflpp::get_die_parents().
> 
> Thanks, I've come up with something similar over the weekend before reading 
> your mail. The performance boost is huge (5x and more).
> 
> Looking at your code, I think that I'm not yet handling a few corner cases 
> (such as imported units). That, paired with the fact that at least three 
> users 
> of this API have apparently by now come up with a similar solution clearly 
> makes a case for upstreaming this into a common API.

Yes, I think having an elfutils/libdw API for this would be very
useful. And it would also be useful for eu-addr2line and eu-stack when
looking up inlined functions.

The imported (partial) units are a little tricky because they cross CUs
(and sometimes even Dwarfs, for example when dealing with dwz multi-files).
It also means that a Die inside the partial unit can have
multiple parents, because they might have been imported through
different imports. But I guess that if we associate a parent cache with
one CU, then this is clean (unless the CU imports the same partial unit
multiple times...).

> I believe that there is a lot of data that potentially needs to be cached. 
> Additionally, doing it behind the scenes may raise questions regarding multi 
> threaded usage of the API (see my other mail relating to that).
> 
> Which means: an explicit API to opt-in to this behavior is safest and best I 
> believe. Maybe something as simple as the following would be sufficient?
> 
> ```
> /* Cache parent DIEs chain to speed up repeated dwarf_getscopes calls.
> 
>Returns -1 for errors or 0 if the parent chain was cached already. */
> extern int dwarf_cache_parent_dies(Dwarf_Die *cudie);
> ```
> 
> Alternatively a function that returns the cache could be considered, which 
> would then require new versions of dwarf_getscopes* that take the cache as an 
> argument.

I think an API that makes the "caching" explicit might be best. Maybe
we can call it a DieTree? We would then have a function to create a
DieTree and (new) functions that take a DieTree (and a Dwarf_Die) to
operate on it. The user can then also destroy the DieTree again when
done.

Cheers,

Mark


Re: Questions regarding editing an elf executable

2020-06-25 Thread Mark Wielaard
Hi Anastasios,

On Thu, 2020-06-25 at 02:46 +0100, Anastasios Andronidis via Elfutils-
devel wrote:
> My end goal is to add a DT_NEEDED entry into an arbitrary elf file,
> but before this I should just print the DT_NEEDED entries like this:
> 
> ```C
> 
> // Some code that copies a source_elf_file to a new target_elf_file
> so
> // we don't destroy the original binary.
> 
> int fd = open(target_elf_file, O_RDWR, 0);
> 
> Elf *e = elf_begin(fd, ELF_C_RDWR_MMAP, NULL);

In general I would recommend against using ELF_C_RDWR_MMAP and use
ELF_C_RDWR instead, unless you know all your changes are "in place".
ELF_C_RDWR_MMAP sometimes has problems extending the mmap. It really is
an optimization only useful for in-place replacement of data.

> Elf_Scn *scn = NULL;
> while ((scn = elf_nextscn(e, scn)) != NULL) {
>   GElf_Shdr shdr;
>   gelf_getshdr(scn, &shdr);
> 
>   if (shdr.sh_type == SHT_DYNAMIC) {
> Elf_Data *data = elf_getdata(scn, NULL);
> 
> size_t sh_entsize = gelf_fsize(e, ELF_T_DYN, 1, EV_CURRENT);
> 
> for (size_t i = 0; i < shdr.sh_size / sh_entsize; i++) {
>   GElf_Dyn dynmem;
>   GElf_Dyn *dyn = gelf_getdyn(data, i, &dynmem);
> 
>   if (dyn->d_tag == DT_NEEDED) {
> printf("Lib: %s\n", elf_strptr(e, shdr.sh_link, dyn->d_un.d_val));
>   }
>}
> }
> 
> elf_update(e, ELF_C_WRITE);
> elf_end(e);
> close(fd);
> ```
> 
> 1) Notice that I run `elf_update(e, ELF_C_WRITE);` in the end as I
> had the impression that this command should actually do nothing
> because I modified nothing. Unfortunately this is not what happens.
> The resulting elf executable is corrupted. I expected libelf would
> leave the elf file as it was, or at least produce a valid executable.

libelf is too "smart". When it sees the elf_update() it believes it can
help "optimize" the way the sections are placed together in the ELF
file. Normally that is fine. But you are operating on an ELF file with
both a section header and a program header. libelf knows very little
about the program header. It also doesn't know how (and if) the program
header segments map to the sections.

Wikipedia has a nice graphic showing ELF segments and sections: 
https://en.wikipedia.org/wiki/Executable_and_Linkable_Format#File_layout

By updating the section data offsets it messes up the program header
segment offsets.

If you don't want libelf to be "helpful" then call:
elf_flagelf (e, ELF_C_SET, ELF_F_LAYOUT);

Then elf_update will still write out any changed (dirty) data, but you
are responsible for making sure the section header offsets and sizes
are correct. If you add some extra data and don't update the size and
offset [of the next section], elf_update will happily write over the
data in the next section instead of moving it.

> 2b) I created a new data and copied the old to new. Didn't work.
> 
> ```C
> ...
> if (shdr.sh_type == SHT_DYNAMIC) {
>   ...
>   
>   // Create a new data container for SHT_DYNAMIC and copy all old
> values.
>   Elf_Data *new_data = elf_newdata(scn);
>   *new_data = *data;
> 
>   // We will add 1 more entry.
>   size_t new_data_size = data->d_size + sh_entsize;
>   // Allocate and set the new data buffer.
>   void *new_data_buf = malloc(new_data_size);
>   new_data->d_buf = new_data_buf;
>   new_data->d_size = new_data_size;
> 
>   // Copy old data to the new buffer.
>   memcpy(new_data->d_buf, data->d_buf, data->d_size);
> 
>   // Add our new entry.
>   GElf_Dyn *new_data_dyn = new_data->d_buf;
>   new_data_dyn[new_data_size / sh_entsize-2].d_tag = DT_NEEDED;
>   new_data_dyn[new_data_size / sh_entsize-2].d_un.d_val = 1;
> 
> }
> ```

Your real problem is not setting ELF_F_LAYOUT as described above. But
the above will add the new_data to the section, in addition to the
already existing data. If you aren't changing the size then you can
just replace the data in place through data->d_buf. Then call
elf_flagdata (data, ELF_C_SET, ELF_F_DIRTY) to tell libelf to write it
out. If you want to totally replace the data, you can just replace the
data->d_buf and data->d_size fields (and set ELF_F_DIRTY before calling
elf_update).

Do note that if you do use ELF_F_LAYOUT you are responsible for
updating the sh_size yourself, and possibly moving any section data after
it by updating the next section's sh_offset. Also if any of the data
moved is referenced through the program headers you'll need to update
those too.

Note that there are helpers for dealing with the dynamic section
entries: gelf_getdyn and gelf_update_dyn.

But updating something like the dynamic section is really tricky if you
want to change the size. It is an allocated section and if there is a
program header there is a corresponding PT_DYNAMIC segment that has to
be kept in sync. Normally allocated (SHF_ALLOC) sections are packed
together when the file has a program header. So moving them around is a
little tricky (you might have to move them all and/or update the
segment headers too). Also there might be other entries, like the
_DYNAMIC symb

Re: Range lists, zero-length functions, linker gc

2020-06-25 Thread David Blaikie via Elfutils-devel
On Wed, Jun 24, 2020 at 3:22 PM Mark Wielaard  wrote:
>
> Hi David,
>
> On Fri, 2020-06-19 at 17:46 -0700, David Blaikie via Elfutils-devel wrote:
> > On Fri, Jun 19, 2020 at 5:00 AM Mark Wielaard  wrote:
> > > I think that is kind of the point of Early Debug. Only use DWARF (at
> > > first) for address/range-less data like types and program scope
> > > entries, but don't emit anything (in DWARF format) for things that
> > > might need adjustments during link/LTO phase. The problem with using
> > > DWARF with address (ranges) during early object creation is that the
> > > linker isn't capable of rewriting the DWARF. You'll need a linker plugin
> > > that calls back into the compiler to do the actual LTO and emit the
> > > actual DWARF containing address/ranges (which can then link back to the
> > > already emitted DWARF types/program scope/etc during the Early Debug
> > > phase). I think the issue you are describing is actually that you do
> > > use DWARF to describe function definitions (not just the declarations)
> > > too early. If you aren't sure yet which addresses will be used DWARF
> > > isn't really the appropriate (temporary) debug format.
> >
> > Sorry, I think we keep talking around each other. Not sure if we can
> > reach a good consensus or shared understanding on this topic.
>
> I think the confusion comes from the fact that we seem to cycle through
> a couple of different topics which are related, but not really
> connected directly.
>
> There is the topic of using "tombstones" in place of some pc or range
> attributes/tables in the case of traditional linking separate compile
units/objects. Where we seem to agree that those are better than
silently producing bad data, but where we disagree on whether there are
other ways to solve the issue (using comdat sections for example, where
we might see the overhead/gains differently).
>
> There is the topic of LTO where part of the linker optimization is done
> through a (compiler) plugin. Where it isn't clear (to me at least) if
> the traditional way of handling DWARF in object files makes
> sense.

Oh - perhaps to clarify: I don't know of any implementation that
creates DWARF in intermediate object files in LTO.

> I would argue that GCC shows that for LTO you need something
> like Early Debug, where you only produce parts of the DWARF early that
> don't contain any addresses or ranges, since you don't know yet where
> code/data will end up till after the actual LTO phase, only after which
> it can be produced.

Yeah - I guess that's the point of the name "Early Debug" - it's
earlier than usual, rather than making the rest later than usual.

In LLVM's implementation the faux .o files in LTO contain no DWARF
whatsoever - but a semantic representation something like DWARF
intended to be manipulated by compiler optimizations and designed to
drop unreferenced portions as optimizations make changes. (if you
inline and optimize away a function call, that function may get
dropped - then no DWARF is emitted for it, same as if it were never
called)

Yeah, it'd be theoretically possible to create all the DWARF up-front,
use loclists and rnglists for /everything/ (because you wouldn't know
if a variable would have a single location or multiple until after
optimizations) and then fill in those loclists and rnglists
post-optimization. I don't know of any implementation that does that,
though - it'd make for very verbose DWARF, and I agree with you that
that wouldn't be great - I think the only point of conflict there is:
I don't think that's a concern that's actually manifesting in DWARF
producers today. Certainly not in LLVM & doesn't sound like it is in
GCC.

I think there's enough incentive for compiler performance - not to
produce loads of duplicate DWARF, and to have a fairly
compact/optimizable intermediate representation - there was a lot of
work that went into changing LLVM's representation to be more amenable
to LTO to ensure things got dropped and deduplicated as soon as
possible.

> Then there is the topic of Split Dwarf, where I am not sure it is
> directly relevant to the above two topics. It is just a different
> representation of the DWARF data, with an extra layer of indirections
> used for addresses. Which in the case of the traditional model means
> that you still hit the tombstones, just through an indirection table.
> And for LTO it just makes some things more complicated because you have
> this extra address indirection table, but since you cannot know where
> the addresses end up till after the LTO phase you now have an extra
> layer of indirection to fix up.

I think the point of Split DWARF is, to your first point about you and
I having perhaps different tradeoffs about object size cost (using
comdats to deduplicate/drop DWARF for dead or deduplicated functions)
- in the case of Split DWARF, it's impossible - well, it's impossible
if you're going to use fragmented DWARF (eg: use comdats to stitch
together a single CU out of dro