[PATCH v2 0/4] elfutils: DWARF package (.dwp) file support

2023-12-06 Thread Omar Sandoval
From: Omar Sandoval 

Hi,

This is version 2 of my patch series adding support for DWARF package
files to libdw and the elfutils tools. Version 1 is here [1].

Patches 1-3 add the main implementation and tests for dwp files.

Most of this support is internal to libdw, but patch 1 adds a new public
function, dwarf_cu_dwp_section_info. drgn's dwp branch [2] demonstrates
how that function will be used. Also see [3] for more context on why
drgn needs this.

Patch 4 adds support and tests for an LLVM extension to the dwp format.
The "extension" is ugly because of an oversight in the design of the
format that LLVM had to make the best of, but unfortunately it's
necessary for a lot of our use cases.

With this patch series, drgn's test suite passes against a Linux kernel
build using .dwp.

Changes from v1:

* Rebased on main and dropped patches that were already merged.
* Moved ChangeLog entries to commit messages.
* Updated version in libdw.map to 0.191.
* Moved DW_SECT_TYPES definition to dwarf.h.
* Added copyright years.
* Added error handling for dwarf_cu_dwp_section_info calls in
  str_offsets_base_off, __libdw_cu_ranges_base, and __libdw_cu_locs_base
* Changed memset initialization of index->sections to an explicit
  loop.
* Added comment explaining __libdw_link_skel_split change.

There were a couple of things that were mentioned in review that I
didn't change:

* I kept dwarf_cu_dwp_section_info in patch 1 instead of separating it
  into its own patch so that I could test the dwp index implementation
  in the same commit that I introduced it in.
* I didn't make try_dwp_file return an error since try_split_file that
  it's based on doesn't either.

Thanks!
Omar

1: https://sourceware.org/pipermail/elfutils-devel/2023q3/006410.html
2: https://github.com/osandov/drgn/tree/dwp
3: https://sourceware.org/pipermail/elfutils-devel/2023q4/006630.html

Omar Sandoval (4):
  libdw: Parse DWARF package file index sections
  libdw: Try .dwp file in __libdw_find_split_unit()
  libdw: Apply DWARF package file section offsets where appropriate
  libdw: Handle overflowed DW_SECT_INFO offsets in DWARF package file
    indexes

 libdw/Makefile.am |2 +-
 libdw/dwarf.h |2 +-
 libdw/dwarf_begin_elf.c   |1 +
 libdw/dwarf_cu_dwp_section_info.c |  531 +++
 libdw/dwarf_end.c |   24 +-
 libdw/dwarf_error.c   |1 +
 libdw/dwarf_getlocation.c |6 +
 libdw/dwarf_getmacros.c   |   26 +-
 libdw/libdw.h |   23 +
 libdw/libdw.map   |5 +
 libdw/libdwP.h|  101 +-
 libdw/libdw_find_split_unit.c |   75 +-
 libdw/libdw_findcu.c  |8 +
 tests/.gitignore  |1 +
 tests/Makefile.am |   15 +-
 tests/cu-dwp-section-info.c   |   73 +
 tests/run-all-dwarf-ranges.sh |  114 ++
 tests/run-cu-dwp-section-info.sh  |  168 ++
 tests/run-dwarf-getmacros.sh  | 1412 +
 tests/run-get-units-split.sh  |   18 +
 tests/run-large-elf-file.sh   |  174 ++
 tests/run-varlocs.sh  |  112 ++
 tests/testfile-dwp-4-cu-index-overflow.bz2|  Bin 0 -> 4490 bytes
 .../testfile-dwp-4-cu-index-overflow.dwp.bz2  |  Bin 0 -> 5584 bytes
 tests/testfile-dwp-4-strict.bz2   |  Bin 0 -> 4169 bytes
 tests/testfile-dwp-4-strict.dwp.bz2   |  Bin 0 -> 6871 bytes
 tests/testfile-dwp-4.bz2  |  Bin 0 -> 4194 bytes
 tests/testfile-dwp-4.dwp.bz2  |  Bin 0 -> 10098 bytes
 tests/testfile-dwp-5-cu-index-overflow.bz2|  Bin 0 -> 4544 bytes
 .../testfile-dwp-5-cu-index-overflow.dwp.bz2  |  Bin 0 -> 5790 bytes
 tests/testfile-dwp-5.bz2  |  Bin 0 -> 4223 bytes
 tests/testfile-dwp-5.dwp.bz2  |  Bin 0 -> 10313 bytes
 tests/testfile-dwp-cu-index-overflow.source   |   86 +
 tests/testfile-dwp.source |  102 ++
 34 files changed, 3051 insertions(+), 29 deletions(-)
 create mode 100644 libdw/dwarf_cu_dwp_section_info.c
 create mode 100644 tests/cu-dwp-section-info.c
 create mode 100755 tests/run-cu-dwp-section-info.sh
 create mode 100755 tests/testfile-dwp-4-cu-index-overflow.bz2
 create mode 100644 tests/testfile-dwp-4-cu-index-overflow.dwp.bz2
 create mode 100755 tests/testfile-dwp-4-strict.bz2
 create mode 100644 tests/testfile-dwp-4-strict.dwp.bz2
 create mode 100755 tests/testfile-dwp-4.bz2
 create mode 100644 tests/testfile-dwp-4.dwp.bz2
 create mode 100755 tests/testfile-dwp-5-cu-index-overflow.bz2
 create mode 100644 tests/testfile-dwp-5-cu-index-overflow.dwp.bz2
 create mode 100755 tests/testfile-dwp-5.bz2
 create mode 100644 tests/testfile-dwp-5.dwp.bz2
 creat

[PATCH v2 1/4] libdw: Parse DWARF package file index sections

2023-12-06 Thread Omar Sandoval
From: Omar Sandoval 

The .debug_cu_index and .debug_tu_index sections in DWARF package files
are basically hash tables mapping a unit's 8 byte signature to an offset
and size in each section used by that unit [1].  Add support for parsing
and doing lookups in the index sections.

We look up a unit in the index when we intern it and cache its hash
table row in Dwarf_CU.  Then, a new function, dwarf_cu_dwp_section_info,
can be used to look up the section offsets and sizes for a unit.  This
will mostly be used internally in libdw, but it will also be needed in
static inline functions shared with eu-readelf.  Additionally, making it
public makes dwp support much easier for external tools that do their
own low-level parsing of DWARF information, like drgn [2].
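
For readers unfamiliar with the format, the lookup described in [1] can
be sketched roughly as follows.  This is only an illustration on arrays
that have already been decoded into host byte order, not the code added
by this patch; dwp_lookup_row and its parameters are hypothetical names.
It assumes the slot count is a power of two and that an unused slot has
both its signature and its row number set to zero.

```c
#include <assert.h>
#include <stdint.h>

/* Look up SIGNATURE in a decoded index hash table with SLOT_COUNT slots.
   signatures[i] and rows[i] together describe slot i.  Returns the
   1-based row number, or 0 if the signature is not in the table.  */
uint32_t
dwp_lookup_row (const uint64_t *signatures, const uint32_t *rows,
                uint32_t slot_count, uint64_t signature)
{
  /* The format requires a power-of-two number of slots.  */
  assert (slot_count != 0 && (slot_count & (slot_count - 1)) == 0);

  uint64_t mask = (uint64_t) slot_count - 1;
  uint64_t slot = signature & mask;               /* Primary hash.  */
  uint64_t step = ((signature >> 32) & mask) | 1; /* Secondary hash.  */

  /* Bound the number of probes so that a malformed table without any
     unused slot cannot make us loop forever.  */
  for (uint32_t probes = 0; probes < slot_count; probes++)
    {
      if (signatures[slot] == signature && rows[slot] != 0)
        return rows[slot];
      if (signatures[slot] == 0 && rows[slot] == 0)
        return 0;                                 /* Unused slot.  */
      slot = (slot + step) & mask;
    }
  return 0;
}
```

The returned row number would then select one row of the section offset
and size tables.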

1: https://gcc.gnu.org/wiki/DebugFissionDWP#Format_of_the_CU_and_TU_Index_Sections
2: https://github.com/osandov/drgn

* libdw/dwarf.h: Add DW_SECT_TYPES.
* libdw/libdwP.h (Dwarf): Add cu_index and tu_index.
(Dwarf_CU): Add dwp_row.
(Dwarf_Package_Index): New type.
(__libdw_dwp_find_unit): New declaration.
(dwarf_cu_dwp_section_info): New INTDECL.
Add DWARF_E_UNKNOWN_SECTION.
* libdw/Makefile.am (libdw_a_SOURCES): Add
dwarf_cu_dwp_section_info.c.
* libdw/dwarf_end.c (dwarf_end): Free dwarf->cu_index and
dwarf->tu_index.
* libdw/dwarf_error.c (errmsgs): Add DWARF_E_UNKNOWN_SECTION.
* libdw/libdw.h (dwarf_cu_dwp_section_info): New declaration.
* libdw/libdw.map (ELFUTILS_0.191): Add
dwarf_cu_dwp_section_info.
* libdw/libdw_findcu.c (__libdw_intern_next_unit): Call
__libdw_dwp_find_unit, and use it to adjust abbrev_offset and
assign newp->dwp_row.
* libdw/dwarf_cu_dwp_section_info.c: New file.
* tests/Makefile.am (check_PROGRAMS): Add cu-dwp-section-info.
(TESTS): Add run-cu-dwp-section-info.sh
(EXTRA_DIST): Add run-cu-dwp-section-info.sh and new test files.
(cu_dwp_section_info_LDADD): New variable.
* tests/cu-dwp-section-info.c: New test.
* tests/run-cu-dwp-section-info.sh: New test.
* tests/testfile-dwp-4-strict.bz2: New test file.
* tests/testfile-dwp-4-strict.dwp.bz2: New test file.
* tests/testfile-dwp-4.bz2: New test file.
* tests/testfile-dwp-4.dwp.bz2: New test file.
* tests/testfile-dwp-5.bz2: New test file.
* tests/testfile-dwp-5.dwp.bz2: New test file.
* tests/testfile-dwp.source: New file.

Signed-off-by: Omar Sandoval 
---
 libdw/Makefile.am   |   2 +-
 libdw/dwarf.h   |   2 +-
 libdw/dwarf_cu_dwp_section_info.c   | 371 
 libdw/dwarf_end.c   |   3 +
 libdw/dwarf_error.c |   1 +
 libdw/libdw.h   |  23 ++
 libdw/libdw.map |   5 +
 libdw/libdwP.h  |  33 +++
 libdw/libdw_findcu.c|   8 +
 tests/.gitignore|   1 +
 tests/Makefile.am   |  11 +-
 tests/cu-dwp-section-info.c |  73 ++
 tests/run-cu-dwp-section-info.sh| 168 +
 tests/testfile-dwp-4-strict.bz2 | Bin 0 -> 4169 bytes
 tests/testfile-dwp-4-strict.dwp.bz2 | Bin 0 -> 6871 bytes
 tests/testfile-dwp-4.bz2| Bin 0 -> 4194 bytes
 tests/testfile-dwp-4.dwp.bz2| Bin 0 -> 10098 bytes
 tests/testfile-dwp-5.bz2| Bin 0 -> 4223 bytes
 tests/testfile-dwp-5.dwp.bz2| Bin 0 -> 10313 bytes
 tests/testfile-dwp.source   | 102 
 20 files changed, 798 insertions(+), 5 deletions(-)
 create mode 100644 libdw/dwarf_cu_dwp_section_info.c
 create mode 100644 tests/cu-dwp-section-info.c
 create mode 100755 tests/run-cu-dwp-section-info.sh
 create mode 100755 tests/testfile-dwp-4-strict.bz2
 create mode 100644 tests/testfile-dwp-4-strict.dwp.bz2
 create mode 100755 tests/testfile-dwp-4.bz2
 create mode 100644 tests/testfile-dwp-4.dwp.bz2
 create mode 100755 tests/testfile-dwp-5.bz2
 create mode 100644 tests/testfile-dwp-5.dwp.bz2
 create mode 100644 tests/testfile-dwp.source

diff --git a/libdw/Makefile.am b/libdw/Makefile.am
index e548f38c..5363c02a 100644
--- a/libdw/Makefile.am
+++ b/libdw/Makefile.am
@@ -93,7 +93,7 @@ libdw_a_SOURCES = dwarf_begin.c dwarf_begin_elf.c dwarf_end.c dwarf_getelf.c \
  dwarf_cu_die.c dwarf_peel_type.c dwarf_default_lower_bound.c \
  dwarf_die_addr_die.c dwarf_get_units.c \
  libdw_find_split_unit.c dwarf_cu_info.c \
- dwarf_next_lines.c
+ dwarf_next_lines.c dwarf_cu_dwp_section_info.c
 
 if MAINTAINER_MODE
 BUILT_SOURCES = $(srcdir)/known-dwarf.h
diff --git a/libdw/dwarf.h b/libdw/dwarf.h
index b2e49db2..4be32de5 100644
--- a/libdw/dwarf.h
+++ b/libdw/dwarf.h
@@ -942,7 +942,7 @@ enum
 enum
   {
 DW_SECT_INFO = 1,
-/* Reserved = 2, */
+DW_SECT_TYPES = 2, 

[PATCH v2 3/4] libdw: Apply DWARF package file section offsets where appropriate

2023-12-06 Thread Omar Sandoval
From: Omar Sandoval 

The final piece of DWARF package file support is that offsets have to be
interpreted relative to the section offset from the package index.
.debug_abbrev.dwo is already covered, so sprinkle around calls to
dwarf_cu_dwp_section_info for the remaining sections: .debug_line.dwo,
.debug_loclists.dwo/.debug_loc.dwo, .debug_str_offsets.dwo,
.debug_macro.dwo/.debug_macinfo.dwo, and .debug_rnglists.dwo.  With all
of that in place, we can finally test various libdw functions on dwp
files.
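
As a rough illustration of what "relative to the section offset" means,
here is a minimal sketch.  struct dwp_contribution and
dwp_section_offset are hypothetical names that model the offset/size
pair dwarf_cu_dwp_section_info reports for one unit and one section.
The bounds check against the size is an addition of the sketch; the
call sites in this patch only add the offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one unit's slice of one section, as recorded
   in the package index.  */
struct dwp_contribution
{
  uint64_t offset;  /* Start of the unit's slice within the section.  */
  uint64_t size;    /* Length of that slice in bytes.  */
};

/* Translate UNIT_RELATIVE, an offset as stored in the unit's DWARF, into
   an offset from the start of the whole section in the package file.
   Returns false if the offset falls outside the unit's slice.  */
bool
dwp_section_offset (const struct dwp_contribution *c,
                    uint64_t unit_relative, uint64_t *section_offset)
{
  assert (c != NULL && section_offset != NULL);
  if (unit_relative >= c->size)
    return false;
  *section_offset = c->offset + unit_relative;
  return true;
}
```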

* libdw/dwarf_getmacros.c (get_macinfo_table): Call
dwarf_cu_dwp_section_info and add offset to line_offset.
(get_offset_from): Call dwarf_cu_dwp_section_info and add offset
to *retp.
* libdw/libdwP.h (str_offsets_base_off): Call
dwarf_cu_dwp_section_info and add offset.
(__libdw_cu_ranges_base): Ditto.
(__libdw_cu_locs_base): Ditto.
* libdw/dwarf_getlocation.c (initial_offset): Call
dwarf_cu_dwp_section_info and add offset to start_offset.
* tests/run-varlocs.sh: Check testfile-dwp-5 and testfile-dwp-4.
* tests/run-all-dwarf-ranges.sh: Check testfile-dwp-5 and
testfile-dwp-4.
* tests/run-dwarf-getmacros.sh: Check testfile-dwp-5 and
testfile-dwp-4-strict.
* tests/run-get-units-split.sh: Check testfile-dwp-5,
testfile-dwp-4, and testfile-dwp-4-strict.

Signed-off-by: Omar Sandoval 
---
 libdw/dwarf_getlocation.c |6 +
 libdw/dwarf_getmacros.c   |   26 +-
 libdw/libdwP.h|   42 +-
 tests/run-all-dwarf-ranges.sh |  114 +++
 tests/run-dwarf-getmacros.sh  | 1412 +
 tests/run-get-units-split.sh  |   18 +
 tests/run-varlocs.sh  |  112 +++
 7 files changed, 1715 insertions(+), 15 deletions(-)

diff --git a/libdw/dwarf_getlocation.c b/libdw/dwarf_getlocation.c
index 553fdc98..37b32fc1 100644
--- a/libdw/dwarf_getlocation.c
+++ b/libdw/dwarf_getlocation.c
@@ -812,6 +812,12 @@ initial_offset (Dwarf_Attribute *attr, ptrdiff_t *offset)
: DWARF_E_NO_DEBUG_LOCLISTS),
NULL, &start_offset) == NULL)
return -1;
+
+  Dwarf_Off loc_off;
+  if (INTUSE(dwarf_cu_dwp_section_info) (attr->cu, DW_SECT_LOCLISTS,
+&loc_off, NULL) != 0)
+   return -1;
+  start_offset += loc_off;
 }
 
   *offset = start_offset;
diff --git a/libdw/dwarf_getmacros.c b/libdw/dwarf_getmacros.c
index a3a78884..2667eb45 100644
--- a/libdw/dwarf_getmacros.c
+++ b/libdw/dwarf_getmacros.c
@@ -47,7 +47,15 @@ get_offset_from (Dwarf_Die *die, int name, Dwarf_Word *retp)
 return -1;
 
   /* Offset into the corresponding section.  */
-  return INTUSE(dwarf_formudata) (&attr, retp);
+  if (INTUSE(dwarf_formudata) (&attr, retp) != 0)
+return -1;
+
+  Dwarf_Off offset;
+  if (INTUSE(dwarf_cu_dwp_section_info) (die->cu, DW_SECT_MACRO, &offset, NULL)
+  != 0)
+return -1;
+  *retp += offset;
+  return 0;
 }
 
 static int
@@ -131,6 +139,14 @@ get_macinfo_table (Dwarf *dbg, Dwarf_Word macoff, Dwarf_Die *cudie)
   else if (cudie->cu->unit_type == DW_UT_split_compile
   && dbg->sectiondata[IDX_debug_line] != NULL)
 line_offset = 0;
+  if (line_offset != (Dwarf_Off) -1)
+{
+  Dwarf_Off dwp_offset;
+  if (INTUSE(dwarf_cu_dwp_section_info) (cudie->cu, DW_SECT_LINE,
+&dwp_offset, NULL) != 0)
+   return NULL;
+  line_offset += dwp_offset;
+}
 
   Dwarf_Macro_Op_Table *table = libdw_alloc (dbg, Dwarf_Macro_Op_Table,
 macinfo_data_size, 1);
@@ -188,6 +204,14 @@ get_table_for_offset (Dwarf *dbg, Dwarf_Word macoff,
if (unlikely (INTUSE(dwarf_formudata) (attr, &line_offset) != 0))
  return NULL;
 }
+  if (line_offset != (Dwarf_Off) -1 && cudie != NULL)
+{
+  Dwarf_Off dwp_offset;
+  if (INTUSE(dwarf_cu_dwp_section_info) (cudie->cu, DW_SECT_LINE,
+&dwp_offset, NULL) != 0)
+   return NULL;
+  line_offset += dwp_offset;
+}
 
   uint8_t address_size;
   if (cudie != NULL)
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index 54445886..64d86bbd 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -1108,25 +1108,30 @@ str_offsets_base_off (Dwarf *dbg, Dwarf_CU *cu)
cu = first_cu;
 }
 
+  Dwarf_Off off = 0;
   if (cu != NULL)
 {
   if (cu->str_off_base == (Dwarf_Off) -1)
{
+ Dwarf_Off dwp_offset;
+ if (dwarf_cu_dwp_section_info (cu, DW_SECT_STR_OFFSETS, &dwp_offset,
+NULL) == 0)
+   off = dwp_offset;
  Dwarf_Die cu_die = CUDIE(cu);
  Dwarf_Attribute attr;
  if (dwarf_attr (&cu_die, DW_AT_str_offsets_base, &attr) != NULL)
{
- Dwarf_Word off;
- if (dwarf_formudata (&attr, &off) == 0)
+ Dwarf_

[PATCH v2 4/4] libdw: Handle overflowed DW_SECT_INFO offsets in DWARF package file indexes

2023-12-06 Thread Omar Sandoval
From: Omar Sandoval 

Meta uses DWARF package files for our large, statically-linked C++
applications.  Some of our largest applications have more than 4GB in
.debug_info.dwo, but the section offsets in .debug_cu_index and
.debug_tu_index are 32 bits; see the discussion here [1].  We
implemented a workaround/extension for this in LLVM.  Implement the
equivalent in libdw.
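
The recovery relies on units appearing in the same order in the index
and in .debug_info.dwo.  The sketch below shows only the wrap-around
idea on a plain array, as a simplification: it does not walk unit
headers as the patch does.  dwp_recover_offsets is a hypothetical name,
and the sketch assumes that the offsets are given in ascending section
order and that consecutive units are less than 4GB apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TRUNCATED holds the low 32 bits of the offsets of N units, in the order
   the units appear in the section.  Store the reconstructed 64-bit
   offsets in FULL.  */
void
dwp_recover_offsets (const uint32_t *truncated, size_t n, uint64_t *full)
{
  assert (n == 0 || (truncated != NULL && full != NULL));

  uint64_t high = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Offsets only grow, so a value smaller than the previous one
         means the 32-bit field wrapped around.  */
      if (i > 0 && truncated[i] < truncated[i - 1])
        high += UINT64_C (1) << 32;
      full[i] = high | truncated[i];
    }
}
```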

To test this, we need files with more than 4GB in .debug_info.dwo.  I
created these artificially by editing GCC's assembly output.  They
compress down to 6KB.  I test them from run-large-elf-file.sh to take
advantage of the existing checks for large file support.

1: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902.

* libdw/dwarf_end.c (dwarf_package_index_free): New function.
* tests/testfile-dwp-4-cu-index-overflow.bz2: New test file.
* tests/testfile-dwp-4-cu-index-overflow.dwp.bz2: New test file.
* tests/testfile-dwp-5-cu-index-overflow.bz2: New test file.
* tests/testfile-dwp-5-cu-index-overflow.dwp.bz2: New test file.
* tests/testfile-dwp-cu-index-overflow.source: New file.
* tests/run-large-elf-file.sh: Check
testfile-dwp-5-cu-index-overflow and
testfile-dwp-4-cu-index-overflow.

Signed-off-by: Omar Sandoval 
---
 libdw/dwarf_cu_dwp_section_info.c | 147 ++-
 libdw/dwarf_end.c |  15 +-
 libdw/libdwP.h|   3 +
 tests/Makefile.am |   6 +-
 tests/run-large-elf-file.sh   | 174 ++
 tests/testfile-dwp-4-cu-index-overflow.bz2| Bin 0 -> 4490 bytes
 .../testfile-dwp-4-cu-index-overflow.dwp.bz2  | Bin 0 -> 5584 bytes
 tests/testfile-dwp-5-cu-index-overflow.bz2| Bin 0 -> 4544 bytes
 .../testfile-dwp-5-cu-index-overflow.dwp.bz2  | Bin 0 -> 5790 bytes
 tests/testfile-dwp-cu-index-overflow.source   |  86 +
 10 files changed, 425 insertions(+), 6 deletions(-)
 create mode 100755 tests/testfile-dwp-4-cu-index-overflow.bz2
 create mode 100644 tests/testfile-dwp-4-cu-index-overflow.dwp.bz2
 create mode 100755 tests/testfile-dwp-5-cu-index-overflow.bz2
 create mode 100644 tests/testfile-dwp-5-cu-index-overflow.dwp.bz2
 create mode 100644 tests/testfile-dwp-cu-index-overflow.source

diff --git a/libdw/dwarf_cu_dwp_section_info.c b/libdw/dwarf_cu_dwp_section_info.c
index 298f36f9..3d11c87a 100644
--- a/libdw/dwarf_cu_dwp_section_info.c
+++ b/libdw/dwarf_cu_dwp_section_info.c
@@ -30,6 +30,8 @@
 # include 
 #endif
 
+#include 
+
 #include "libdwP.h"
 
 static Dwarf_Package_Index *
@@ -110,7 +112,9 @@ __libdw_read_package_index (Dwarf *dbg, bool tu)
 
   index->dbg = dbg;
   /* Set absent sections to UINT32_MAX.  */
-  memset (index->sections, 0xff, sizeof (index->sections));
+  for (size_t i = 0;
+   i < sizeof (index->sections) / sizeof (index->sections[0]); i++)
+index->sections[i] = UINT32_MAX;
   for (size_t i = 0; i < section_count; i++)
 {
   uint32_t section = read_4ubyte_unaligned (dbg, sections + i * 4);
@@ -161,6 +165,7 @@ __libdw_read_package_index (Dwarf *dbg, bool tu)
   index->indices = indices;
   index->section_offsets = section_offsets;
   index->section_sizes = section_sizes;
+  index->debug_info_offsets = NULL;
 
   return index;
 }
@@ -177,6 +182,137 @@ __libdw_package_index (Dwarf *dbg, bool tu)
   if (index == NULL)
 return NULL;
 
+  /* Offsets in the section offset table are 32-bit unsigned integers.  In
+ practice, the .debug_info.dwo section for very large executables can be
+ larger than 4GB.  GNU dwp as of binutils 2.41 and llvm-dwp before LLVM 15
+ both accidentally truncate offsets larger than 4GB.
+
+ LLVM 15 detects the overflow and errors out instead; see LLVM commit
+ f8df8114715b ("[DWP][DWARF] Detect and error on debug info offset
+ overflow").  However, lldb in LLVM 16 supports using dwp files with
+ truncated offsets by recovering them directly from the unit headers in the
+ .debug_info.dwo section; see LLVM commit c0db06227721 ("[DWARFLibrary] Add
+ support to re-construct cu-index").  Since LLVM 17, the overflow error can
+ be turned into a warning instead; see LLVM commit 53a483cee801 ("[DWP] add
+ overflow check for llvm-dwp tools if offset overflow").
+
+ LLVM's support for > 4GB offsets is effectively an extension to the DWARF
+ package file format, which we implement here.  The strategy is to walk the
+ unit headers in .debug_info.dwo in lockstep with the DW_SECT_INFO columns
+ in the section offset tables.  As long as they are in the same order
+ (which they are in practice for both GNU dwp and llvm-dwp), we can
+ correlate the truncated offset and produce a corrected array of offsets.
+
+ Note that this will be fixed properly in DWARF 6:
+ https://dwarfstd.org/issues/220708.2.html.  */
+  if (index->sections[DW_SECT_INFO - 1] != UINT32_MAX
+  && dbg->sectiondata[IDX_debug_info

[PATCH v2 2/4] libdw: Try .dwp file in __libdw_find_split_unit()

2023-12-06 Thread Omar Sandoval
From: Omar Sandoval 

Try opening the file in the location suggested by the standard (the
skeleton file name + ".dwp") and looking up the unit in the package
index.  The rest is similar to .dwo files, with slightly different
cleanup since a single Dwarf handle is shared.
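
The file name rule amounts to appending a suffix.  A minimal sketch,
assuming the skeleton file's path is already known; dwp_path_for is a
hypothetical helper and is not the try_dwp_file function added below.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated string holding SKEL_PATH with ".dwp" appended,
   or NULL if allocation fails.  The caller frees the result.  */
char *
dwp_path_for (const char *skel_path)
{
  assert (skel_path != NULL);

  static const char suffix[] = ".dwp";
  size_t len = strlen (skel_path);
  char *path = malloc (len + sizeof suffix);
  if (path == NULL)
    return NULL;

  memcpy (path, skel_path, len);
  /* sizeof suffix includes the terminating NUL.  */
  memcpy (path + len, suffix, sizeof suffix);
  return path;
}
```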

* libdw/libdw_find_split_unit.c (try_dwp_file): New function.
(__libdw_find_split_unit): Call try_dwp_file.
* libdw/libdwP.h (Dwarf): Add dwp_dwarf and dwp_fd.
(__libdw_dwp_findcu_id): New declaration.
(__libdw_link_skel_split): Handle .debug_addr for dwp.
* libdw/dwarf_begin_elf.c (dwarf_begin_elf): Initialize
result->dwp_fd.
* libdw/dwarf_end.c (dwarf_end): Free dwarf->dwp_dwarf and close
dwarf->dwp_fd.
(cu_free): Don't free split dbg if it is dwp_dwarf.

Signed-off-by: Omar Sandoval 
---
 libdw/dwarf_begin_elf.c   |  1 +
 libdw/dwarf_cu_dwp_section_info.c | 19 
 libdw/dwarf_end.c | 10 -
 libdw/libdwP.h| 23 --
 libdw/libdw_find_split_unit.c | 75 ---
 5 files changed, 119 insertions(+), 9 deletions(-)

diff --git a/libdw/dwarf_begin_elf.c b/libdw/dwarf_begin_elf.c
index 323a91d0..ca2b7e2a 100644
--- a/libdw/dwarf_begin_elf.c
+++ b/libdw/dwarf_begin_elf.c
@@ -567,6 +567,7 @@ dwarf_begin_elf (Elf *elf, Dwarf_Cmd cmd, Elf_Scn *scngrp)
 
   result->elf = elf;
   result->alt_fd = -1;
+  result->dwp_fd = -1;
 
   /* Initialize the memory handling.  Initial blocks are allocated on first
  actual allocation.  */
diff --git a/libdw/dwarf_cu_dwp_section_info.c b/libdw/dwarf_cu_dwp_section_info.c
index 4a4eac8c..298f36f9 100644
--- a/libdw/dwarf_cu_dwp_section_info.c
+++ b/libdw/dwarf_cu_dwp_section_info.c
@@ -340,6 +340,25 @@ __libdw_dwp_find_unit (Dwarf *dbg, bool debug_types, Dwarf_Off off,
   abbrev_offsetp, NULL);
 }
 
+Dwarf_CU *
+internal_function
+__libdw_dwp_findcu_id (Dwarf *dbg, uint64_t unit_id8)
+{
+  Dwarf_Package_Index *index = __libdw_package_index (dbg, false);
+  uint32_t unit_row;
+  Dwarf_Off offset;
+  Dwarf_CU *cu;
+  if (__libdw_dwp_unit_row (index, unit_id8, &unit_row) == 0
+  && __libdw_dwp_section_info (index, unit_row, DW_SECT_INFO, &offset,
+  NULL) == 0
+  && (cu = __libdw_findcu (dbg, offset, false)) != NULL
+  && cu->unit_type == DW_UT_split_compile
+  && cu->unit_id8 == unit_id8)
+return cu;
+  else
+return NULL;
+}
+
 int
 dwarf_cu_dwp_section_info (Dwarf_CU *cu, unsigned int section,
   Dwarf_Off *offsetp, Dwarf_Off *sizep)
diff --git a/libdw/dwarf_end.c b/libdw/dwarf_end.c
index b7f817d9..78224ddb 100644
--- a/libdw/dwarf_end.c
+++ b/libdw/dwarf_end.c
@@ -66,7 +66,9 @@ cu_free (void *arg)
  /* The fake_addr_cu might be shared, only release one.  */
  if (p->dbg->fake_addr_cu == p->split->dbg->fake_addr_cu)
p->split->dbg->fake_addr_cu = NULL;
- INTUSE(dwarf_end) (p->split->dbg);
+ /* There is only one DWP file. We free it later.  */
+ if (p->split->dbg != p->dbg->dwp_dwarf)
+   INTUSE(dwarf_end) (p->split->dbg);
}
 }
 }
@@ -147,6 +149,12 @@ dwarf_end (Dwarf *dwarf)
  close (dwarf->alt_fd);
}
 
+  if (dwarf->dwp_fd != -1)
+   {
+ INTUSE(dwarf_end) (dwarf->dwp_dwarf);
+ close (dwarf->dwp_fd);
+   }
+
   /* The cached path and dir we found the Dwarf ELF file in.  */
   free (dwarf->elfpath);
   free (dwarf->debugdir);
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index 7f8d69b5..54445886 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -180,6 +180,9 @@ struct Dwarf
   /* dwz alternate DWARF file.  */
   Dwarf *alt_dwarf;
 
+  /* DWARF package file.  */
+  Dwarf *dwp_dwarf;
+
   /* The section data.  */
   Elf_Data *sectiondata[IDX_last];
 
@@ -197,6 +200,9 @@ struct Dwarf
  close this file descriptor.  */
   int alt_fd;
 
+  /* File descriptor of DWARF package file.  */
+  int dwp_fd;
+
   /* Information for traversing the .debug_pubnames section.  This is
  an array and separately allocated with malloc.  */
   struct pubnames_s
@@ -716,6 +722,10 @@ extern int __libdw_dwp_find_unit (Dwarf *dbg, bool debug_types, Dwarf_Off off,
  Dwarf_Off *abbrev_offsetp)
  __nonnull_attribute__ (1, 7, 8) internal_function;
 
+/* Find the compilation unit in a DWARF package file with the given id.  */
+extern Dwarf_CU *__libdw_dwp_findcu_id (Dwarf *dbg, uint64_t unit_id8)
+ __nonnull_attribute__ (1) internal_function;
+
 /* Get abbreviation with given code.  */
 extern Dwarf_Abbrev *__libdw_findabbrev (struct Dwarf_CU *cu,
 unsigned int code)
@@ -1367,12 +1377,19 @@ __libdw_link_skel_split (Dwarf_CU *skel, Dwarf_CU *split)
 
   /* Get .debug_addr and addr_base greedy.
  We also need it for the fake addr cu.
- There is only

[Bug tools/31114] eu-readelf --debug-dump=info cannot show mega-enum

2023-12-06 Thread mark at klomp dot org
https://sourceware.org/bugzilla/show_bug.cgi?id=31114

Mark Wielaard  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from Mark Wielaard  ---
The problem seems to be that this mega-enum .debug_info starts with a
completely empty (length = 7 bytes) compile unit which doesn't contain any (CU)
DIE. At least __libdw_intern_next_unit seems to depend on being able to fetch
some information from the (non-existing) CU DIE. Some other code also seems to
assume a Compile Unit contains at least one DIE.

We should probably audit all uses of the CUDIE (and SUBDIE) macro from
libdwP.h.
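
A length of 7 is exactly what remains of a 32-bit DWARF 2 to 4 compile
unit header after the unit_length field itself: 2 bytes of version, 4 of
abbrev offset and 1 of address size.  A check along these lines could
tell whether a unit has room for any DIE at all.  This is only a sketch:
cu_has_room_for_die is a hypothetical name, and it assumes 32-bit DWARF
and, for version 5, a plain compile or partial unit header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of header bytes that follow the 4-byte unit_length field.  */
static uint64_t
cu_header_rest (unsigned int version)
{
  /* DWARF 2-4: version (2) + debug_abbrev_offset (4) + address_size (1).
     DWARF 5:   version (2) + unit_type (1) + address_size (1)
                + debug_abbrev_offset (4).  */
  return version >= 5 ? 8 : 7;
}

/* Return true if a unit with the given VERSION and UNIT_LENGTH has at
   least one byte after its header, i.e. could contain a CU DIE.  */
bool
cu_has_room_for_die (unsigned int version, uint64_t unit_length)
{
  assert (version >= 2 && version <= 5);
  return unit_length > cu_header_rest (version);
}
```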


[PATCH 1/2] libdw: Use INTUSE with dwarf_get_units

2023-12-06 Thread Aaron Merey
Add INTDECL for dwarf_get_units and call dwarf_get_units with INTUSE.

Signed-off-by: Aaron Merey 
---
 libdw/dwarf_get_units.c   |  1 +
 libdw/dwarf_next_lines.c  |  8 +--
 libdw/libdwP.h| 91 +--
 libdw/libdw_find_split_unit.c |  4 +-
 4 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/libdw/dwarf_get_units.c b/libdw/dwarf_get_units.c
index 6215bf4b..407ed2ba 100644
--- a/libdw/dwarf_get_units.c
+++ b/libdw/dwarf_get_units.c
@@ -129,3 +129,4 @@ dwarf_get_units (Dwarf *dwarf, Dwarf_CU *cu, Dwarf_CU **next_cu,
 
   return 0;
 }
+INTDEF(dwarf_get_units)
diff --git a/libdw/dwarf_next_lines.c b/libdw/dwarf_next_lines.c
index 9b76b47e..74854ecd 100644
--- a/libdw/dwarf_next_lines.c
+++ b/libdw/dwarf_next_lines.c
@@ -99,8 +99,8 @@ dwarf_next_lines (Dwarf *dbg, Dwarf_Off off,
   Dwarf_CU *given_cu = *cu;
   Dwarf_CU *next_cu = given_cu;
   bool found = false;
-  while (dwarf_get_units (dbg, next_cu, &next_cu, NULL, NULL,
- &cudie, NULL) == 0)
+  while (INTUSE(dwarf_get_units) (dbg, next_cu, &next_cu, NULL, NULL,
+ &cudie, NULL) == 0)
{
  if (dwarf_hasattr (&cudie, DW_AT_stmt_list))
{
@@ -131,8 +131,8 @@ dwarf_next_lines (Dwarf *dbg, Dwarf_Off off,
 tables. Need to do a linear search (but stop at the given
 CU, since we already searched those.  */
  next_cu = NULL;
- while (dwarf_get_units (dbg, next_cu, &next_cu, NULL, NULL,
- &cudie, NULL) == 0
+ while (INTUSE(dwarf_get_units) (dbg, next_cu, &next_cu, NULL, NULL,
+ &cudie, NULL) == 0
 && next_cu != given_cu)
{
  Dwarf_Attribute attr;
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index aef42267..5aca9082 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -414,6 +414,49 @@ struct Dwarf_CU
   void *endp;
 };
 
+/* Aliases to avoid PLTs.  */
+INTDECL (dwarf_aggregate_size)
+INTDECL (dwarf_attr)
+INTDECL (dwarf_attr_integrate)
+INTDECL (dwarf_begin)
+INTDECL (dwarf_begin_elf)
+INTDECL (dwarf_child)
+INTDECL (dwarf_default_lower_bound)
+INTDECL (dwarf_dieoffset)
+INTDECL (dwarf_diename)
+INTDECL (dwarf_end)
+INTDECL (dwarf_entrypc)
+INTDECL (dwarf_errmsg)
+INTDECL (dwarf_formaddr)
+INTDECL (dwarf_formblock)
+INTDECL (dwarf_formref_die)
+INTDECL (dwarf_formsdata)
+INTDECL (dwarf_formstring)
+INTDECL (dwarf_formudata)
+INTDECL (dwarf_getabbrevattr_data)
+INTDECL (dwarf_getalt)
+INTDECL (dwarf_getarange_addr)
+INTDECL (dwarf_getarangeinfo)
+INTDECL (dwarf_getaranges)
+INTDECL (dwarf_getlocation_die)
+INTDECL (dwarf_getsrcfiles)
+INTDECL (dwarf_getsrclines)
+INTDECL (dwarf_get_units)
+INTDECL (dwarf_hasattr)
+INTDECL (dwarf_haschildren)
+INTDECL (dwarf_haspc)
+INTDECL (dwarf_highpc)
+INTDECL (dwarf_lowpc)
+INTDECL (dwarf_nextcu)
+INTDECL (dwarf_next_unit)
+INTDECL (dwarf_offdie)
+INTDECL (dwarf_peel_type)
+INTDECL (dwarf_ranges)
+INTDECL (dwarf_setalt)
+INTDECL (dwarf_siblingof)
+INTDECL (dwarf_srclang)
+INTDECL (dwarf_tag)
+
 #define ISV4TU(cu) ((cu)->version == 4 && (cu)->sec_idx == IDX_debug_types)
 
 /* Compute the offset of a CU's first DIE from the CU offset.
@@ -1061,8 +1104,8 @@ str_offsets_base_off (Dwarf *dbg, Dwarf_CU *cu)
   if (cu == NULL && dbg != NULL)
 {
   Dwarf_CU *first_cu;
-  if (dwarf_get_units (dbg, NULL, &first_cu,
-  NULL, NULL, NULL, NULL) == 0)
+  if (INTUSE(dwarf_get_units) (dbg, NULL, &first_cu,
+  NULL, NULL, NULL, NULL) == 0)
cu = first_cu;
 }
 
@@ -1379,48 +1422,4 @@ void __libdw_set_debugdir (Dwarf *dbg);
 char * __libdw_filepath (const char *debugdir, const char *dir,
 const char *file)
   internal_function;
-
-
-/* Aliases to avoid PLTs.  */
-INTDECL (dwarf_aggregate_size)
-INTDECL (dwarf_attr)
-INTDECL (dwarf_attr_integrate)
-INTDECL (dwarf_begin)
-INTDECL (dwarf_begin_elf)
-INTDECL (dwarf_child)
-INTDECL (dwarf_default_lower_bound)
-INTDECL (dwarf_dieoffset)
-INTDECL (dwarf_diename)
-INTDECL (dwarf_end)
-INTDECL (dwarf_entrypc)
-INTDECL (dwarf_errmsg)
-INTDECL (dwarf_formaddr)
-INTDECL (dwarf_formblock)
-INTDECL (dwarf_formref_die)
-INTDECL (dwarf_formsdata)
-INTDECL (dwarf_formstring)
-INTDECL (dwarf_formudata)
-INTDECL (dwarf_getabbrevattr_data)
-INTDECL (dwarf_getalt)
-INTDECL (dwarf_getarange_addr)
-INTDECL (dwarf_getarangeinfo)
-INTDECL (dwarf_getaranges)
-INTDECL (dwarf_getlocation_die)
-INTDECL (dwarf_getsrcfiles)
-INTDECL (dwarf_getsrclines)
-INTDECL (dwarf_hasattr)
-INTDECL (dwarf_haschildren)
-INTDECL (dwarf_haspc)
-INTDECL (dwarf_highpc)
-INTDECL (dwarf_lowpc)
-INTDECL (dwarf_nextcu)
-INTDECL (dwarf_next_unit)
-INTDECL (dwarf_offdie)
-INTDECL (dwarf_peel_type)
-INTDECL (dwarf_ranges)
-INTDECL (dwarf_setalt)
-INTDECL (dwarf_siblingof)
-INTDECL (dwarf_srclang)
-INTDECL (dwarf_tag)
-
 #

[PATCH 0/2] dwarf_getaranges: Build aranges list from CUs

2023-12-06 Thread Aaron Merey
Patch 1/2 is a preparatory patch that modifies dwarf_get_units calls
with INTUSE.

Aaron Merey (2):
  libdw: Use INTUSE with dwarf_get_units
  dwarf_getaranges: Build aranges list from CUs instead of
    .debug_aranges

 libdw/dwarf_get_units.c   |   1 +
 libdw/dwarf_getaranges.c  | 215 --
 libdw/dwarf_next_lines.c  |   8 +-
 libdw/libdwP.h|  91 +++---
 libdw/libdw_find_split_unit.c |   4 +-
 5 files changed, 104 insertions(+), 215 deletions(-)

-- 
2.43.0



[PATCH 2/2] dwarf_getaranges: Build aranges list from CUs instead of .debug_aranges

2023-12-06 Thread Aaron Merey
No longer use .debug_aranges to build the aranges list since it could be
absent or incomplete.

Instead build the aranges list by iterating over each CU and recording
each address range.
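
The idea can be sketched on a simplified model.  The types and the
build_aranges function below are hypothetical and stand in for libdw's
own structures; the patch itself iterates over units with
dwarf_get_units.  The sketch assumes each CU's address ranges are
already available as half-open [start, end) pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct addr_range { uint64_t start, end; };   /* Half-open [start, end).  */
struct cu_model { const struct addr_range *ranges; size_t nranges; };
struct arange { uint64_t start, end; size_t cu_index; };

static int
compare_aranges (const void *a, const void *b)
{
  const struct arange *x = a, *y = b;
  return (x->start > y->start) - (x->start < y->start);
}

/* Collect every non-empty range of every CU into one array sorted by
   start address.  Returns 0 on success and -1 on allocation failure.
   The caller frees *OUT.  */
int
build_aranges (const struct cu_model *cus, size_t ncus,
               struct arange **out, size_t *nout)
{
  assert (out != NULL && nout != NULL);

  size_t total = 0;
  for (size_t i = 0; i < ncus; i++)
    total += cus[i].nranges;

  struct arange *list = malloc ((total != 0 ? total : 1) * sizeof *list);
  if (list == NULL)
    return -1;

  size_t n = 0;
  for (size_t i = 0; i < ncus; i++)
    for (size_t j = 0; j < cus[i].nranges; j++)
      {
        const struct addr_range *r = &cus[i].ranges[j];
        if (r->start >= r->end)
          continue;                           /* Skip empty ranges.  */
        list[n++] = (struct arange) { r->start, r->end, i };
      }

  qsort (list, n, sizeof *list, compare_aranges);
  *out = list;
  *nout = n;
  return 0;
}
```

Sorting by start address is what would allow a later lookup by address
to use a binary search.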

https://sourceware.org/bugzilla/show_bug.cgi?id=22288
https://sourceware.org/bugzilla/show_bug.cgi?id=30948

Signed-off-by: Aaron Merey 

---

This patch's method of building the aranges list is slower than simply
reading .debug_aranges.  On my machine, running eu-stack on a 2.9G
firefox core file takes about 8.7 seconds with this patch applied,
compared to about 3.3 seconds without this patch.

Ideally we could assume that .debug_aranges is complete if it is present
and build the aranges list via CU iteration only when .debug_aranges
is absent.  This would let us save time on gcc-compiled binaries, which
include complete .debug_aranges by default.

However the DWARF spec appears to permit partially complete
.debug_aranges [1].  We could improve performance by starting with a
potentially incomplete list built from .debug_aranges.  If a lookup
fails then search the CUs for missing aranges and add to the list
when found.

This approach would complicate the dwarf_getaranges interface.  The
list it initially provides could no longer be assumed to be complete.
The number of elements in the list could change during calls to
dwarf_getarange{info, _addr}.  This would invalidate the naranges value
set by dwarf_getaranges.  The current API doesn't include a way to
communicate to the caller when naranges changes and by how much.

Due to these complications I think it's better to simply ignore
.debug_aranges altogether and build the aranges table via CU iteration,
as is done in this patch.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=22288#c5

 libdw/dwarf_getaranges.c | 215 ++-
 1 file changed, 52 insertions(+), 163 deletions(-)

diff --git a/libdw/dwarf_getaranges.c b/libdw/dwarf_getaranges.c
index 27439d37..8676f93b 100644
--- a/libdw/dwarf_getaranges.c
+++ b/libdw/dwarf_getaranges.c
@@ -33,7 +33,6 @@
 #endif
 
 #include 
-#include 
 #include "libdwP.h"
 #include 
 
@@ -68,174 +67,51 @@ dwarf_getaranges (Dwarf *dbg, Dwarf_Aranges **aranges, size_t *naranges)
   return 0;
 }
 
-  if (dbg->sectiondata[IDX_debug_aranges] == NULL)
-{
-  /* No such section.  */
-  *aranges = NULL;
-  if (naranges != NULL)
-   *naranges = 0;
-  return 0;
-}
-
-  if (dbg->sectiondata[IDX_debug_aranges]->d_buf == NULL)
-return -1;
-
   struct arangelist *arangelist = NULL;
   unsigned int narangelist = 0;
 
-  const unsigned char *readp = dbg->sectiondata[IDX_debug_aranges]->d_buf;
-  const unsigned char *readendp
-= readp + dbg->sectiondata[IDX_debug_aranges]->d_size;
-
-  while (readp < readendp)
+  Dwarf_CU *cu = NULL;
+  while (INTUSE(dwarf_get_units) (dbg, cu, &cu, NULL, NULL, NULL, NULL) == 0)
 {
-  const unsigned char *hdrstart = readp;
-
-  /* Each entry starts with a header:
-
-1. A 4-byte or 12-byte length containing the length of the
-set of entries for this compilation unit, not including the
-length field itself. [...]
-
-2. A 2-byte version identifier containing the value 2 for
-DWARF Version 2.1.
-
-3. A 4-byte or 8-byte offset into the .debug_info section. [...]
-
-4. A 1-byte unsigned integer containing the size in bytes of
-an address (or the offset portion of an address for segmented
-addressing) on the target system.
-
-5. A 1-byte unsigned integer containing the size in bytes of
-a segment descriptor on the target system.  */
-  if (unlikely (readp + 4 > readendp))
-   goto invalid;
-
-  Dwarf_Word length = read_4ubyte_unaligned_inc (dbg, readp);
-  unsigned int length_bytes = 4;
-  if (length == DWARF3_LENGTH_64_BIT)
-   {
- if (unlikely (readp + 8 > readendp))
-   goto invalid;
-
- length = read_8ubyte_unaligned_inc (dbg, readp);
- length_bytes = 8;
-   }
-  else if (unlikely (length >= DWARF3_LENGTH_MIN_ESCAPE_CODE
-&& length <= DWARF3_LENGTH_MAX_ESCAPE_CODE))
-   goto invalid;
-
-  const unsigned char *endp = readp + length;
-  if (unlikely (endp > readendp))
-   goto invalid;
-
-  if (unlikely (readp + 2 > readendp))
-   goto invalid;
-
-  unsigned int version = read_2ubyte_unaligned_inc (dbg, readp);
-  if (version != 2)
-   {
-   invalid:
- __libdw_seterrno (DWARF_E_INVALID_DWARF);
-   fail:
- while (arangelist != NULL)
-   {
- struct arangelist *next = arangelist->next;
- free (arangelist);
- arangelist = next;
-   }
- return -1;
-   }
-
-  Dwarf_Word offset = 0;
-  if (__libdw_read_offset_inc (dbg,
-  IDX_debug_aranges, &readp,
-  length_bytes, &offset, IDX_debu