On demand static/shared libs and binary linkage
Hello, I am quite new on elfutils library. I know their were discussions in the past about the question I will ask but let me introduce my use case. I am working on a project which aims at building some kind of "standalone" toolchain (binutils, gcc, clang + tools like gdb, valgrind). This toolchain could be shipped easily on any linux machines by copying files on the targeted machine. To ease this, the tools binaries are statically linked to avoid any interferences with user local LD_LIBRARY_PATH environment variable. I am planning to add elfutils to this toolchain. The provided binaries are useful and the delivered libs are a prerequisite for libabigail (which I plan to also add). I can see there is no flag to chose between static and shared. >From my investigations in the code and the build system: - the delivered libs cannot be 100% static archives. Depending on the machine the libelf is used, some backend code are dlopen'd. So at least, the backends code must be delivered as shared lib (in the by default EBL directory) - the delivered binaries are by default dynamically linked except if the gcov or gprof support is enabled. I was wondering if it would be a good idea (or even possible) to have two options in the configure script, for instance, --enable-static and --enable-shared, which would be set to "yes" by default. In the case --enable-shared=no, no dso is generated (except the backends) and the binaries are statically linked. In the case --enable-static=no, only the dso are generated and binaries are dynamically linked. The combination --enable-shared=no and --enable-dynamic=no is impossible. Finally if the gcov or gprof support is required, we force the current behaviour (ie. --enable-shared=yes and --enable-static=yes). Can you give me your feed back on such a proposal ? If it is worth working on this, I can try to provide a patch. Regarding the project I am working on, I have a workaround which does not satisfy me: - build zlib shared lib - build elfutils with gcov support Thanks in advance, Laurent Stacul
Re: On demand static/shared libs and binary linkage
Understood. In this case, there are 2 solutions: - we completely remove the call to dlopen - we keep the call to dlopen from the shared lib version of libdw and embed all the backends code in the archive I don't know if the second solution is worth the additional work/complexity. What do you think ? Laurent
[COMMITTED] readelf: Lookup "no" translation for no_str, not "yes".
On irc Tom pointed out that no was yes... oops. Committed as obvious. Also use yes_str and no_str in print_debug_abbrev_section and print_form_data. Signed-off-by: Mark Wielaard --- src/ChangeLog | 6 ++ src/readelf.c | 6 +++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/src/ChangeLog b/src/ChangeLog index 37e24714..d6fc919a 100644 --- a/src/ChangeLog +++ b/src/ChangeLog @@ -1,3 +1,9 @@ +2018-06-07 Mark Wielaard + + * readelf.c (main): Lookup "no" for no_str. + (print_debug_abbrev_section): Use yes_str and no_str. + (print_form_data): Likewise. + 2018-06-04 Mark Wielaard * readelf (format_result): New static char pointer. diff --git a/src/readelf.c b/src/readelf.c index 8f37f17b..6ac45111 100644 --- a/src/readelf.c +++ b/src/readelf.c @@ -327,7 +327,7 @@ main (int argc, char *argv[]) /* Look up once. */ yes_str = gettext ("yes"); - no_str = gettext ("yes"); + no_str = gettext ("no"); /* Parse and process arguments. */ int remaining; @@ -5062,7 +5062,7 @@ print_debug_abbrev_section (Dwfl_Module *dwflmod __attribute__ ((unused)), printf (gettext (" [%5u] offset: %" PRId64 ", children: %s, tag: %s\n"), code, (int64_t) offset, - has_children ? gettext ("yes") : gettext ("no"), + has_children ? yes_str : no_str, dwarf_tag_name (tag)); size_t cnt = 0; @@ -7955,7 +7955,7 @@ print_form_data (Dwarf *dbg, int form, const unsigned char *readp, if (readendp - readp < 1) goto invalid_data; val = *readp++; - printf ("%s", val != 0 ? gettext ("yes") : gettext ("no")); + printf ("%s", val != 0 ? yes_str : no_str); break; case DW_FORM_string: -- 2.17.0
[PATCH] libdw: Check DIE address fall inside the CU before reading abbrev code.
The afl fuzzer found a case where we tried reading an uleb for the DIE abbrev code without properly checking the DIE address is inside the CU. Signed-off-by: Mark Wielaard --- libdw/ChangeLog | 4 libdw/libdwP.h | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/libdw/ChangeLog b/libdw/ChangeLog index b000492..b569393 100644 --- a/libdw/ChangeLog +++ b/libdw/ChangeLog @@ -1,5 +1,9 @@ 2018-06-06 Mark Wielaard + * libdwP.h (__libdw_dieabbrev): Check DIE addr falls in cu. + +2018-06-06 Mark Wielaard + * dwarf_getlocation_die.c (dwarf_getlocation_die): Check offset falls inside cu data. diff --git a/libdw/libdwP.h b/libdw/libdwP.h index 1c8dd0d..3d8e145 100644 --- a/libdw/libdwP.h +++ b/libdw/libdwP.h @@ -653,7 +653,7 @@ __libdw_dieabbrev (Dwarf_Die *die, const unsigned char **readp) /* Get the abbreviation code. */ unsigned int code; const unsigned char *addr = die->addr; - if (die->cu == NULL) + if (die->cu == NULL || addr >= (const unsigned char *) die->cu->endp) return DWARF_END_ABBREV; get_uleb128 (code, addr, die->cu->endp); if (readp != NULL) -- 1.8.3.1
[PATCH 1/2] libdw: Make sure that address_size and offset_size are 4 or 8 bytes.
When interning a CU make sure that address_size and offset_size are either 4 or 8 bytes. We really don't (want to) handle any other size. Signed-off-by: Mark Wielaard --- libdw/ChangeLog | 6 ++ libdw/libdw_findcu.c | 13 +++-- 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/libdw/ChangeLog b/libdw/ChangeLog index b569393..9d0b484 100644 --- a/libdw/ChangeLog +++ b/libdw/ChangeLog @@ -1,3 +1,9 @@ +2018-06-07 Mark Wielaard + + * libdw_findcu.c (__libdw_intern_next_unit): Report DWARF_E_VERSION, + not DWARF_E_INVALID_DWARF on unknown version. Set address_size and + offset_size to 8 when unknown. + 2018-06-06 Mark Wielaard * libdwP.h (__libdw_dieabbrev): Check DIE addr falls in cu. diff --git a/libdw/libdw_findcu.c b/libdw/libdw_findcu.c index 2f5c6c4..ed74423 100644 --- a/libdw/libdw_findcu.c +++ b/libdw/libdw_findcu.c @@ -120,14 +120,23 @@ __libdw_intern_next_unit (Dwarf *dbg, bool debug_types) return NULL; /* We only know how to handle the DWARF version 2 through 5 formats. - For v4 debug types we only handle version 4. */ + For v4 debug types we only handle version 4. */ if (unlikely (version < 2) || unlikely (version > 5) || (debug_types && unlikely (version != 4))) { - __libdw_seterrno (DWARF_E_INVALID_DWARF); + __libdw_seterrno (DWARF_E_VERSION); return NULL; } + /* We only handle 32 or 64 bit (4 or 8 byte) addresses and offsets. + Just assume we are dealing with 64bit in case the size is "unknown". + Too much code assumes if it isn't 4 then it is 8 (or the other way + around). */ + if (unlikely (address_size != 4 && address_size != 8)) +address_size = 8; + if (unlikely (offset_size != 4 && offset_size != 8)) +offset_size = 8; + /* Invalid or truncated debug section data? */ size_t sec_idx = debug_types ? IDX_debug_types : IDX_debug_info; Elf_Data *data = dbg->sectiondata[sec_idx]; -- 1.8.3.1
Re: [PATCH] readelf: Don't allocate string with asprintf, but reuse buffer with sprintf.
On Mon, Jun 04, 2018 at 07:05:16PM +0200, Mark Wielaard wrote: > Since we are single threaded we can just use a static result buffer for > format_dwarf_addr as long as we make sure to print the result before > calling format_dwarf_addr again. This removes lots of malloc/free calls. Almost as soon as I checked this in the afl fuzzer detected that we assumed addresses are max 8 bytes (64bits). So it presented us with a CU that has an address size of 136 bytes... We dutifully try to print that large an address into a buffer that has room for just 8 and crash... First, we should just make sure to always use 32 or 64 bit addresses (and offsets). There is too much code that really relies on them being either 4 bytes or 8 bytes. [PATCH 1/2] libdw: Make sure that address_size and offset_size are 4 Second, it is not really necessary to create a buffer, sprintf into it, then use that buffer to printf to stdio. Just do it directly. [PATCH 2/2] readelf: Turn format_print_dwarf into print_dwarf_addr. Cheers, Mark
[PATCH 2/2] readelf: Turn format_print_dwarf into print_dwarf_addr.
We don't really need to setup a buffer, print into it and then print it out to stdout. Simplify the code by directly printing the address (and symbol name). Signed-off-by: Mark Wielaard --- src/ChangeLog| 20 +++ src/readelf.c| 423 +-- tests/run-readelf-loc.sh | 52 +++--- 3 files changed, 238 insertions(+), 257 deletions(-) diff --git a/src/ChangeLog b/src/ChangeLog index cdc4720..778238e 100644 --- a/src/ChangeLog +++ b/src/ChangeLog @@ -1,3 +1,23 @@ +2018-06-07 Mark Wielaard + + * readelf.c (format_result): Removed. + (format_result_size): Removed. + (format_dwarf_addr): Renamed to... + (print_dwarf_addr): ...this. Simply call printf, don't setup buffer, + don't call sprintf. + (print_ops): Use print_dwarf_addr instead of format_dwarf_addr. + (print_debug_addr_section): Likewise. + (print_debug_aranges_section): Likewise. + (print_debug_rnglists_section): Likewise. + (print_debug_ranges_section): Likewise. + (print_debug_frame_section): Likewise. + (attr_callback): Likewise. + (print_decoded_line_section): Likewise. + (print_debug_line_section): Likewise. + (print_debug_loclists_section): Likewise. + (print_debug_loc_section): Likewise. + (print_gdb_index_section): Likewsie. + 2018-06-05 Mark Wielaard * readelf.c (print_debug_addr_section): Set unit_length always to diff --git a/src/readelf.c b/src/readelf.c index eac5eac..f9514a1 100644 --- a/src/readelf.c +++ b/src/readelf.c @@ -3705,12 +3705,9 @@ print_attributes (Ebl *ebl, const GElf_Ehdr *ehdr) } -static char *format_result = NULL; -static size_t format_result_size = 0; - -static char * -format_dwarf_addr (Dwfl_Module *dwflmod, - int address_size, Dwarf_Addr address, Dwarf_Addr raw) +void +print_dwarf_addr (Dwfl_Module *dwflmod, + int address_size, Dwarf_Addr address, Dwarf_Addr raw) { /* See if there is a name we can give for this address. */ GElf_Sym sym; @@ -3736,76 +3733,41 @@ format_dwarf_addr (Dwfl_Module *dwflmod, : dwfl_module_relocation_info (dwflmod, i, NULL)); } - char *result; - size_t max_size = ((name != NULL ? strlen (name) : 0) -+ (scn != NULL ? strlen (scn) : 0) -+ (2 + 8 * 2) * 2 + 6); - if (max_size > format_result_size) -{ - max_size *= 2; /* A bit more, so we don't immediately realloc again. */ - result = realloc (format_result, max_size); - if (result == NULL) - error (EXIT_FAILURE, 0, _("memory exhausted")); - format_result_size = max_size; - format_result = result; -} - else -result = format_result; - if ((name != NULL ? (off != 0 ? (scn != NULL ? (address_size == 0 - ? sprintf (result, - "%s+%#" PRIx64 " <%s+%#" PRIx64 ">", - scn, address, name, off) - : sprintf (result, - "%s+%#0*" PRIx64 " <%s+%#" PRIx64 ">", - scn, 2 + address_size * 2, address, - name, off)) + ? printf ("%s+%#" PRIx64 " <%s+%#" PRIx64 ">", + scn, address, name, off) + : printf ("%s+%#0*" PRIx64 " <%s+%#" PRIx64 ">", + scn, 2 + address_size * 2, address, + name, off)) : (address_size == 0 - ? sprintf (result, - "%#" PRIx64 " <%s+%#" PRIx64 ">", - address, name, off) - : sprintf (result, - "%#0*" PRIx64 " <%s+%#" PRIx64 ">", - 2 + address_size * 2, address, - name, off))) + ? printf ("%#" PRIx64 " <%s+%#" PRIx64 ">", + address, name, off) + : printf ("%#0*" PRIx64 " <%s+%#" PRIx64 ">", + 2 + address_size * 2, address, + name, off))) : (scn != NULL ? (address_size == 0 - ? sprintf (result, - "%s+%#" PRIx64 " <%s>", - scn, address, name) - : sprintf (result, - "%s+%#0*" PRIx64 " <%s>", + ? printf ("%s+%#" PRIx64 " <%s>", scn, address, name) + : printf ("%s+%#0*" PRIx64 " <%s>", scn, 2 + address_size * 2, address, name)) : (address_size == 0 - ? sprintf (result, - "%#" PRIx64 " <%s>", - address, name) - : sprintf (result, - "%#0*" PRIx64 " <%s>", - 2 + address_size * 2, address, name + ? printf ("%#" PRIx64 " <%s>", address, name) +
Re: On demand static/shared libs and binary linkage
On Thu, Jun 07, 2018 at 03:59:22PM +0200, Laurent Stacul wrote: > Understood. > > In this case, there are 2 solutions: > > - we completely remove the call to dlopen > - we keep the call to dlopen from the shared lib version of libdw and > embed all the backends code in the archive > > I don't know if the second solution is worth the additional > work/complexity. What do you think ? I kind of did the first on the mjw/RH-DTS branch. Which just has the backends for the architectures that RHEL officially supports. But I don't really like it. Ideally I would like to have a configure option to say which backends need to be linked in directly and keep the rest dynamic using dlopen. We could then default to include the native backend (and maybe related sub-archies, like x86_64 plus i386). But still build and ship the rest as shared libraries that can be dlopened on demand. One extra issue is that dlopen code (and libebl in general) has pretty bad error handling. If we clean up this code it would be nice to get a better error mechanism for when a backend cannot be found. Cheers, Mark