On demand static/shared libs and binary linkage

2018-06-07 Thread Laurent Stacul
Hello,

I am quite new on elfutils library. I know their were discussions in
the past about the question I will ask but let me introduce my use
case.

I am working on a project which aims at building some kind of
"standalone" toolchain (binutils, gcc, clang + tools like gdb,
valgrind). This toolchain could be shipped easily on any linux
machines by copying files on the targeted machine.

To ease this, the tools binaries are statically linked to avoid any
interferences with user local LD_LIBRARY_PATH environment variable.

I am planning to add elfutils to this toolchain. The provided binaries
are useful and the delivered libs are a prerequisite for libabigail
(which I plan to also add).

I can see there is no flag to chose between static and shared.

>From my investigations in the code and the build system:

- the delivered libs cannot be 100% static archives. Depending on the
machine the libelf is used, some backend code are dlopen'd. So at
least, the backends code must be delivered as shared lib (in the by
default EBL directory)

- the delivered binaries are by default dynamically linked except if
the gcov or gprof support is enabled.

I was wondering if it would be a good idea (or even possible) to have
two options in the configure script, for instance, --enable-static and
--enable-shared, which would be set to "yes" by default.

In the case --enable-shared=no, no dso is generated (except the
backends) and the binaries are statically linked.
In the case --enable-static=no, only the dso are generated and
binaries are dynamically linked.
The combination --enable-shared=no and --enable-dynamic=no is impossible.

Finally if the gcov or gprof support is required, we force the current
behaviour (ie. --enable-shared=yes and --enable-static=yes).

Can you give me your feed back on such a proposal ? If it is worth
working on this, I can try to provide a patch.

Regarding the project I am working on, I have a workaround which does
not satisfy me:
- build zlib shared lib
- build elfutils with gcov support

Thanks in advance,
Laurent Stacul


Re: On demand static/shared libs and binary linkage

2018-06-07 Thread Laurent Stacul
Understood.

In this case, there are 2 solutions:

- we completely remove the call to dlopen
- we keep the call to dlopen from the shared lib version of libdw and
embed all the backends code in the archive

I don't know if the second solution is worth the additional
work/complexity. What do you think ?

Laurent


[COMMITTED] readelf: Lookup "no" translation for no_str, not "yes".

2018-06-07 Thread Mark Wielaard
On irc Tom pointed out that no was yes... oops.
Committed as obvious.

Also use yes_str and no_str in print_debug_abbrev_section
and print_form_data.

Signed-off-by: Mark Wielaard 
---
 src/ChangeLog | 6 ++
 src/readelf.c | 6 +++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 37e24714..d6fc919a 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2018-06-07  Mark Wielaard  
+
+   * readelf.c (main): Lookup "no" for no_str.
+   (print_debug_abbrev_section): Use yes_str and no_str.
+   (print_form_data): Likewise.
+
 2018-06-04  Mark Wielaard  
 
* readelf (format_result): New static char pointer.
diff --git a/src/readelf.c b/src/readelf.c
index 8f37f17b..6ac45111 100644
--- a/src/readelf.c
+++ b/src/readelf.c
@@ -327,7 +327,7 @@ main (int argc, char *argv[])
 
   /* Look up once.  */
   yes_str = gettext ("yes");
-  no_str = gettext ("yes");
+  no_str = gettext ("no");
 
   /* Parse and process arguments.  */
   int remaining;
@@ -5062,7 +5062,7 @@ print_debug_abbrev_section (Dwfl_Module *dwflmod 
__attribute__ ((unused)),
  printf (gettext (" [%5u] offset: %" PRId64
   ", children: %s, tag: %s\n"),
  code, (int64_t) offset,
- has_children ? gettext ("yes") : gettext ("no"),
+ has_children ? yes_str : no_str,
  dwarf_tag_name (tag));
 
  size_t cnt = 0;
@@ -7955,7 +7955,7 @@ print_form_data (Dwarf *dbg, int form, const unsigned 
char *readp,
   if (readendp - readp < 1)
goto invalid_data;
   val = *readp++;
-  printf ("%s", val != 0 ? gettext ("yes") : gettext ("no"));
+  printf ("%s", val != 0 ? yes_str : no_str);
   break;
 
 case DW_FORM_string:
-- 
2.17.0



[PATCH] libdw: Check DIE address fall inside the CU before reading abbrev code.

2018-06-07 Thread Mark Wielaard
The afl fuzzer found a case where we tried reading an uleb for the DIE
abbrev code without properly checking the DIE address is inside the CU.

Signed-off-by: Mark Wielaard 
---
 libdw/ChangeLog | 4 
 libdw/libdwP.h  | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index b000492..b569393 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,5 +1,9 @@
 2018-06-06  Mark Wielaard  
 
+   * libdwP.h (__libdw_dieabbrev): Check DIE addr falls in cu.
+
+2018-06-06  Mark Wielaard  
+
* dwarf_getlocation_die.c (dwarf_getlocation_die): Check offset
falls inside cu data.
 
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index 1c8dd0d..3d8e145 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -653,7 +653,7 @@ __libdw_dieabbrev (Dwarf_Die *die, const unsigned char 
**readp)
   /* Get the abbreviation code.  */
   unsigned int code;
   const unsigned char *addr = die->addr;
-  if (die->cu == NULL)
+  if (die->cu == NULL || addr >= (const unsigned char *) die->cu->endp)
return DWARF_END_ABBREV;
   get_uleb128 (code, addr, die->cu->endp);
   if (readp != NULL)
-- 
1.8.3.1



[PATCH 1/2] libdw: Make sure that address_size and offset_size are 4 or 8 bytes.

2018-06-07 Thread Mark Wielaard
When interning a CU make sure that address_size and offset_size are
either 4 or 8 bytes. We really don't (want to) handle any other size.

Signed-off-by: Mark Wielaard 
---
 libdw/ChangeLog  |  6 ++
 libdw/libdw_findcu.c | 13 +++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index b569393..9d0b484 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,9 @@
+2018-06-07  Mark Wielaard  
+
+   * libdw_findcu.c (__libdw_intern_next_unit): Report DWARF_E_VERSION,
+   not DWARF_E_INVALID_DWARF on unknown version. Set address_size and
+   offset_size to 8 when unknown.
+
 2018-06-06  Mark Wielaard  
 
* libdwP.h (__libdw_dieabbrev): Check DIE addr falls in cu.
diff --git a/libdw/libdw_findcu.c b/libdw/libdw_findcu.c
index 2f5c6c4..ed74423 100644
--- a/libdw/libdw_findcu.c
+++ b/libdw/libdw_findcu.c
@@ -120,14 +120,23 @@ __libdw_intern_next_unit (Dwarf *dbg, bool debug_types)
 return NULL;
 
   /* We only know how to handle the DWARF version 2 through 5 formats.
- For v4 debug types we only handle version 4. */
+ For v4 debug types we only handle version 4.  */
   if (unlikely (version < 2) || unlikely (version > 5)
   || (debug_types && unlikely (version != 4)))
 {
-  __libdw_seterrno (DWARF_E_INVALID_DWARF);
+  __libdw_seterrno (DWARF_E_VERSION);
   return NULL;
 }
 
+  /* We only handle 32 or 64 bit (4 or 8 byte) addresses and offsets.
+ Just assume we are dealing with 64bit in case the size is "unknown".
+ Too much code assumes if it isn't 4 then it is 8 (or the other way
+ around).  */
+  if (unlikely (address_size != 4 && address_size != 8))
+address_size = 8;
+  if (unlikely (offset_size != 4 && offset_size != 8))
+offset_size = 8;
+
   /* Invalid or truncated debug section data?  */
   size_t sec_idx = debug_types ? IDX_debug_types : IDX_debug_info;
   Elf_Data *data = dbg->sectiondata[sec_idx];
-- 
1.8.3.1



Re: [PATCH] readelf: Don't allocate string with asprintf, but reuse buffer with sprintf.

2018-06-07 Thread Mark Wielaard
On Mon, Jun 04, 2018 at 07:05:16PM +0200, Mark Wielaard wrote:
> Since we are single threaded we can just use a static result buffer for
> format_dwarf_addr as long as we make sure to print the result before
> calling format_dwarf_addr again. This removes lots of malloc/free calls.

Almost as soon as I checked this in the afl fuzzer detected that we
assumed addresses are max 8 bytes (64bits). So it presented us with
a CU that has an address size of 136 bytes... We dutifully try to
print that large an address into a buffer that has room for just 8
and crash...

First, we should just make sure to always use 32 or 64 bit addresses
(and offsets). There is too much code that really relies on them being
either 4 bytes or 8 bytes.

[PATCH 1/2] libdw: Make sure that address_size and offset_size are 4

Second, it is not really necessary to create a buffer, sprintf into it,
then use that buffer to printf to stdio. Just do it directly.

[PATCH 2/2] readelf: Turn format_print_dwarf into print_dwarf_addr.

Cheers,

Mark


[PATCH 2/2] readelf: Turn format_print_dwarf into print_dwarf_addr.

2018-06-07 Thread Mark Wielaard
We don't really need to setup a buffer, print into it and then print it
out to stdout. Simplify the code by directly printing the address (and
symbol name).

Signed-off-by: Mark Wielaard 
---
 src/ChangeLog|  20 +++
 src/readelf.c| 423 +--
 tests/run-readelf-loc.sh |  52 +++---
 3 files changed, 238 insertions(+), 257 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index cdc4720..778238e 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,23 @@
+2018-06-07  Mark Wielaard  
+
+   * readelf.c (format_result): Removed.
+   (format_result_size): Removed.
+   (format_dwarf_addr): Renamed to...
+   (print_dwarf_addr): ...this. Simply call printf, don't setup buffer,
+   don't call sprintf.
+   (print_ops): Use print_dwarf_addr instead of format_dwarf_addr.
+   (print_debug_addr_section): Likewise.
+   (print_debug_aranges_section): Likewise.
+   (print_debug_rnglists_section): Likewise.
+   (print_debug_ranges_section): Likewise.
+   (print_debug_frame_section): Likewise.
+   (attr_callback): Likewise.
+   (print_decoded_line_section): Likewise.
+   (print_debug_line_section): Likewise.
+   (print_debug_loclists_section): Likewise.
+   (print_debug_loc_section): Likewise.
+   (print_gdb_index_section): Likewsie.
+
 2018-06-05  Mark Wielaard  
 
* readelf.c (print_debug_addr_section): Set unit_length always to
diff --git a/src/readelf.c b/src/readelf.c
index eac5eac..f9514a1 100644
--- a/src/readelf.c
+++ b/src/readelf.c
@@ -3705,12 +3705,9 @@ print_attributes (Ebl *ebl, const GElf_Ehdr *ehdr)
 }
 
 
-static char *format_result = NULL;
-static size_t format_result_size = 0;
-
-static char *
-format_dwarf_addr (Dwfl_Module *dwflmod,
-  int address_size, Dwarf_Addr address, Dwarf_Addr raw)
+void
+print_dwarf_addr (Dwfl_Module *dwflmod,
+ int address_size, Dwarf_Addr address, Dwarf_Addr raw)
 {
   /* See if there is a name we can give for this address.  */
   GElf_Sym sym;
@@ -3736,76 +3733,41 @@ format_dwarf_addr (Dwfl_Module *dwflmod,
 : dwfl_module_relocation_info (dwflmod, i, NULL));
 }
 
-  char *result;
-  size_t max_size = ((name != NULL ? strlen (name) : 0)
-+ (scn != NULL ? strlen (scn) : 0)
-+ (2 + 8 * 2) * 2 + 6);
-  if (max_size > format_result_size)
-{
-  max_size *= 2; /* A bit more, so we don't immediately realloc again.  */
-  result = realloc (format_result, max_size);
-  if (result == NULL)
-   error (EXIT_FAILURE, 0, _("memory exhausted"));
-  format_result_size = max_size;
-  format_result = result;
-}
-  else
-result = format_result;
-
   if ((name != NULL
? (off != 0
  ? (scn != NULL
 ? (address_size == 0
-   ? sprintf (result,
-  "%s+%#" PRIx64 " <%s+%#" PRIx64 ">",
-  scn, address, name, off)
-   : sprintf (result,
-  "%s+%#0*" PRIx64 " <%s+%#" PRIx64 ">",
-  scn, 2 + address_size * 2, address,
-  name, off))
+   ? printf ("%s+%#" PRIx64 " <%s+%#" PRIx64 ">",
+ scn, address, name, off)
+   : printf ("%s+%#0*" PRIx64 " <%s+%#" PRIx64 ">",
+ scn, 2 + address_size * 2, address,
+ name, off))
 : (address_size == 0
-   ? sprintf (result,
-  "%#" PRIx64 " <%s+%#" PRIx64 ">",
-  address, name, off)
-   : sprintf (result,
-  "%#0*" PRIx64 " <%s+%#" PRIx64 ">",
-  2 + address_size * 2, address,
-  name, off)))
+   ? printf ("%#" PRIx64 " <%s+%#" PRIx64 ">",
+ address, name, off)
+   : printf ("%#0*" PRIx64 " <%s+%#" PRIx64 ">",
+ 2 + address_size * 2, address,
+ name, off)))
  : (scn != NULL
 ? (address_size == 0
-   ? sprintf (result,
-  "%s+%#" PRIx64 " <%s>",
-  scn, address, name)
-   : sprintf (result,
-  "%s+%#0*" PRIx64 " <%s>",
+   ? printf ("%s+%#" PRIx64 " <%s>", scn, address, name)
+   : printf ("%s+%#0*" PRIx64 " <%s>",
   scn, 2 + address_size * 2, address, name))
 : (address_size == 0
-   ? sprintf (result,
-  "%#" PRIx64 " <%s>",
-  address, name)
-   : sprintf (result,
-  "%#0*" PRIx64 " <%s>",
-  2 + address_size * 2, address, name
+   ? printf ("%#" PRIx64 " <%s>", address, name)
+ 

Re: On demand static/shared libs and binary linkage

2018-06-07 Thread Mark Wielaard
On Thu, Jun 07, 2018 at 03:59:22PM +0200, Laurent Stacul wrote:
> Understood.
> 
> In this case, there are 2 solutions:
> 
> - we completely remove the call to dlopen
> - we keep the call to dlopen from the shared lib version of libdw and
> embed all the backends code in the archive
> 
> I don't know if the second solution is worth the additional
> work/complexity. What do you think ?

I kind of did the first on the mjw/RH-DTS branch. Which just has
the backends for the architectures that RHEL officially supports.

But I don't really like it. Ideally I would like to have a configure
option to say which backends need to be linked in directly and keep
the rest dynamic using dlopen. We could then default to include the
native backend (and maybe related sub-archies, like x86_64 plus i386).
But still build and ship the rest as shared libraries that can be
dlopened on demand.

One extra issue is that dlopen code (and libebl in general) has pretty
bad error handling. If we clean up this code it would be nice to get
a better error mechanism for when a backend cannot be found.

Cheers,

Mark