[PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand
With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and -D_FORTIFY_SOURCE=3 we get the following error message: In file included from /usr/include/ar.h:22, from ../libelf/libelfP.h:33, from core-file.c:31: In function ‘pread’, inlined from ‘pread_retry’ at ../lib/system.h:188:21, inlined from ‘elf_begin_rand’ at core-file.c:86:16, inlined from ‘core_file_read_eagerly’ at core-file.c:205:15: /usr/include/bits/unistd.h:74:10: error: ‘__pread_alias’ writing 58 or more bytes into a region of size 10 overflows the destination [-Werror=stringop-overflow=] 74 | return __glibc_fortify (pread, __nbytes, sizeof (char), | ^~~ /usr/include/ar.h: In function ‘core_file_read_eagerly’: /usr/include/ar.h:41:10: note: destination object ‘ar_size’ of size 10 41 | char ar_size[10]; /* File size, in ASCII decimal. */ | ^~~ /usr/include/bits/unistd.h:50:16: note: in a call to function ‘__pread_alias’ declared with attribute ‘access (write_only, 2, 3)’ 50 | extern ssize_t __REDIRECT (__pread_alias, |^~ cc1: all warnings being treated as errors The warning disappears when dropping either -fsanitize=undefined or when using -D_FORTIFY_SOURCE=2. It looks like a false positive. But I haven't figured out how/why it happens. The code is a little tricky to proof correct though. The ar_size field is a not-zero terminated string ASCII decimal, right-paddedr with spaces. Which is then converted with strtoll. Relying on the fact that the struct ar_hdr is zero initialized, so there will be a zero byte after the ar_size field. Rewrite the code to just use a zero byte terminated char array. Which is much easier to reason about. As a bonus the error disappears. Signed-off-by: Mark Wielaard --- libdwfl/ChangeLog | 5 + libdwfl/core-file.c | 26 -- 2 files changed, 21 insertions(+), 10 deletions(-) diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog index 75c53948..acdaa013 100644 --- a/libdwfl/ChangeLog +++ b/libdwfl/ChangeLog @@ -1,3 +1,8 @@ +2022-07-28 Mark Wielaard + + * core-file.c (elf_begin_rand): Replace struct ar_hdr h with + a char ar_size[AR_SIZE_CHARS + 1] array to read size. + 2022-07-18 Shahab Vahedi * debuginfod-client.c (dwfl_get_debuginfod_client stub): diff --git a/libdwfl/core-file.c b/libdwfl/core-file.c index cefc3db0..4418ef33 100644 --- a/libdwfl/core-file.c +++ b/libdwfl/core-file.c @@ -75,26 +75,32 @@ elf_begin_rand (Elf *parent, off_t offset, off_t size, off_t *next) from the archive header to override SIZE. */ if (parent->kind == ELF_K_AR) { - struct ar_hdr h = { .ar_size = "" }; - - if (unlikely (parent->maximum_size - offset < sizeof h)) + /* File size, in ASCII decimal, right-padded with ASCII spaces. + Max 10 characters. Not zero terminated. So make this ar_size + array one larger and explicitly zero terminate it. As needed + for strtoll. */ + #define AR_SIZE_CHARS 10 + char ar_size[AR_SIZE_CHARS + 1]; + ar_size[AR_SIZE_CHARS] = '\0'; + + if (unlikely (parent->maximum_size - offset < sizeof (struct ar_hdr))) return fail (ELF_E_RANGE); if (parent->map_address != NULL) - memcpy (h.ar_size, parent->map_address + parent->start_offset + offset, - sizeof h.ar_size); + memcpy (ar_size, parent->map_address + parent->start_offset + offset, + AR_SIZE_CHARS); else if (unlikely (pread_retry (parent->fildes, - h.ar_size, sizeof (h.ar_size), + ar_size, AR_SIZE_CHARS, parent->start_offset + offset + offsetof (struct ar_hdr, ar_size)) -!= sizeof (h.ar_size))) +!= AR_SIZE_CHARS)) return fail (ELF_E_READ_ERROR); - offset += sizeof h; + offset += sizeof (struct ar_hdr); char *endp; - size = strtoll (h.ar_size, &endp, 10); - if (unlikely (endp == h.ar_size) + size = strtoll (ar_size, &endp, 10); + if (unlikely (endp == ar_size) || unlikely ((off_t) parent->maximum_size - offset < size)) return fail (ELF_E_INVALID_ARCHIVE); } -- 2.18.4
Re: [PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand
On Thu, 2022-07-28 at 15:48 +0200, Mark Wielaard wrote: > With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and > -D_FORTIFY_SOURCE=3 we get the following error message: Sorry for the typo, it is glibc 2.35. Basically an up to date Fedora 36 system (replicated on x86_64, ppc64le and s390x).
buildbot users try branch builders for elfutils
Hi, I setup git users try branches for elfutils. If you have commit access to elfutils.git you can now push to a users//try-xxx branch and the buildbot will do builds and sent you (the commit author) email about the results. The builds are also visible at: https://builder.sourceware.org/buildbot/#/builders?tags=elfutils-try My workflow to use this looks like: - git checkout -b ar_size - hack, hack, hack... OK, looks good to submit - git commit -a -m "Now who got an ar_size?" - git push origin ar_size:users/mark/try-ar_size - ... wait for the emails to come in or watch buildbot logs ... - Send in patches and mention what the try bot reported (OK, I sent in the patch first, but you can reply to your original patch submission or in bugzilla what the try results were) After your patches have been accepted you can delete the branch again: - git push origin :users/mark/try-ar_size Please use this only for patches you intend to submit and think are ready but want to double check. Use the GCC Compile Farm machines if you need to hack edit/compile/debug on a specific architecture. https://cfarm.tetaneutral.net/ But if you really need access to one of the buildbot machines for investigating an issue, please contact build...@sourceware.org. The try builders also upload logs to the bunsendb. And so are visible here https://builder.sourceware.org/testruns/ See also the bunsen upload step in the try-build to get an URL to the bunsen result for that specific build. Please sent feedback (bad or good) to build...@sourceware.org Cheers, Mark
Re: debuginfod Credential Helper RFC
Hi Daniel, On Tue, 2022-07-26 at 15:50 -0700, Daniel Thornburgh via Elfutils-devel wrote: > I'm working on a use case for debuginfod (in LLVM) that needs a > solution > for authentication and authorization of users when accessing source and > debug information. I've put together a short RFC for how this might work, > based on how git and Docker CLIs handle credentials. It should be fairly > straightforward to implement and to generalize to new credential types. > > Please take a look; it'd be good to have a consensus on how this should > work across interested debuginfod implementations before moving forward > towards implementation. I think this could work for a standalone program like debuginfod-find, but not for a library like libdebuginfod. I would rather not have to fork and exec from libdebuginfod. We don't really know in what state the program is and forking a big process is not cheap. The process might be watching its own children (like when libdebuginfod is used in a debugger or profiler) and suddenly get unexpected sigchilds or pids from wait. Can't this be handled through e.g. the underlying libcurl library by setting a proxy environment variable so the requests goes through a local proxy that is setup to do some kind of authentication transparently? Or by simply defining the base DEBUGINFOD_URL with https://user:p...@debuginfod.example.com/ ? Thanks, Mark
Re: runtime validation of DT_SYMTAB lookups - why is there no DT_SYMSZ?
Hi Milian, On Wed, 2022-07-27 at 13:38 +0200, Milian Wolff wrote: > Thanks for confirming that this isn't available currently. Would it > be > possible to add this? What's the process for standardization here? I guess it > would take a very long time, yet this seems to me as if it would be > beneficial > in the long term. Standardization of the ELF gabi takes place on (sorry google groups, I know, sigh): https://groups.google.com/g/generic-abi you should be able to subscribe with generic-abi+subscr...@googlegroups.com so you don't have to go through the webgui mess. There is also https://sourceware.org/gnu-gabi/ but that is more for GNU extensions and I think you want something generic. > > Di Chen recently > > (or actually not that recently, I just still haven't reviewed, > > sorry!) > > posted a patch for > > https://sourceware.org/bugzilla/show_bug.cgi?id=28873 to print out > > the > > symbols from the dynamic segment > > https://sourceware.org/pipermail/elfutils-devel/2022q2/005086.html > > Interesting. But from what I can tell, this patch has access to the > full Elf > object and thus can access segments which are not normally loaded at > runtime? Yes it could, but it doesn't use anything that isn't referenced from the phdrs or dynamic segment, so it only uses those parts that are normally loaded at runtime. If you go through the dynamic segment then everything it references (.dynsym in this case) is from a loaded segment. So going through phdrs to check where it is loaded and the length is fine. > Try with eu-elflint --gnu which suppresses some known issues. > > Indeed, with `--gnu` the tool reports `No errors`. > > > Also could you show those symbol values (1272, 3684, 25720, 27227) > > they > > might have a special type, so their st_value isn't really an > > address? > > ``` > $ eu-readelf -s libQt5Qml.so.5.12.0 | grep -E > "^\s*(1272|3684|25720|27227):" > 1272: 003f9974 0 NOTYPE GLOBAL DEFAULT 25 > __bss_start__@@Qt_5 > 3684: 003f9974 0 NOTYPE GLOBAL DEFAULT 25 > __bss_start@@Qt_5 > 1272: 003ccc4c 0 NOTYPE LOCAL DEFAULT 17 $d > 3684: 003cbfec 0 NOTYPE LOCAL DEFAULT 17 $d > 25720: 003f9974 0 NOTYPE GLOBAL DEFAULT 25 __bss_start > 27227: 003f9974 0 NOTYPE GLOBAL DEFAULT 25 __bss_start__ > ``` > > The first two matches come from the `.dynsym`, the last four come > from > `.symtab`. > > Can anyone tell me how `eu-readelf` resolves these symbol names? Currently through the section tables, which point to the string table section used. But Di Chen's patch would change that by going through the dynamic segment and phdrs to find the strtab for the dynsym segment (but will of course still need to go through the sections for the .symtab symbols since those aren't accessible through the phdrs). Cheers, Mark
Re: [PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand
On 2022-07-28 09:48, Mark Wielaard wrote: With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and -D_FORTIFY_SOURCE=3 we get the following error message: In file included from /usr/include/ar.h:22, from ../libelf/libelfP.h:33, from core-file.c:31: In function ‘pread’, inlined from ‘pread_retry’ at ../lib/system.h:188:21, inlined from ‘elf_begin_rand’ at core-file.c:86:16, inlined from ‘core_file_read_eagerly’ at core-file.c:205:15: /usr/include/bits/unistd.h:74:10: error: ‘__pread_alias’ writing 58 or more bytes into a region of size 10 overflows the destination [-Werror=stringop-overflow=] 74 | return __glibc_fortify (pread, __nbytes, sizeof (char), | ^~~ /usr/include/ar.h: In function ‘core_file_read_eagerly’: /usr/include/ar.h:41:10: note: destination object ‘ar_size’ of size 10 41 | char ar_size[10]; /* File size, in ASCII decimal. */ | ^~~ /usr/include/bits/unistd.h:50:16: note: in a call to function ‘__pread_alias’ declared with attribute ‘access (write_only, 2, 3)’ 50 | extern ssize_t __REDIRECT (__pread_alias, |^~ cc1: all warnings being treated as errors The warning disappears when dropping either -fsanitize=undefined or when using -D_FORTIFY_SOURCE=2. It looks like a false positive. But I haven't figured out how/why it happens. Interesting, I'll take a closer look at this from the gcc context. I obviously don't have any strong opinions about the elfutils patch :) Thanks, Sid The code is a little tricky to proof correct though. The ar_size field is a not-zero terminated string ASCII decimal, right-paddedr with spaces. Which is then converted with strtoll. Relying on the fact that the struct ar_hdr is zero initialized, so there will be a zero byte after the ar_size field. Rewrite the code to just use a zero byte terminated char array. Which is much easier to reason about. As a bonus the error disappears. Signed-off-by: Mark Wielaard --- libdwfl/ChangeLog | 5 + libdwfl/core-file.c | 26 -- 2 files changed, 21 insertions(+), 10 deletions(-) diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog index 75c53948..acdaa013 100644 --- a/libdwfl/ChangeLog +++ b/libdwfl/ChangeLog @@ -1,3 +1,8 @@ +2022-07-28 Mark Wielaard + + * core-file.c (elf_begin_rand): Replace struct ar_hdr h with + a char ar_size[AR_SIZE_CHARS + 1] array to read size. + 2022-07-18 Shahab Vahedi * debuginfod-client.c (dwfl_get_debuginfod_client stub): diff --git a/libdwfl/core-file.c b/libdwfl/core-file.c index cefc3db0..4418ef33 100644 --- a/libdwfl/core-file.c +++ b/libdwfl/core-file.c @@ -75,26 +75,32 @@ elf_begin_rand (Elf *parent, off_t offset, off_t size, off_t *next) from the archive header to override SIZE. */ if (parent->kind == ELF_K_AR) { - struct ar_hdr h = { .ar_size = "" }; - - if (unlikely (parent->maximum_size - offset < sizeof h)) + /* File size, in ASCII decimal, right-padded with ASCII spaces. + Max 10 characters. Not zero terminated. So make this ar_size + array one larger and explicitly zero terminate it. As needed + for strtoll. */ + #define AR_SIZE_CHARS 10 + char ar_size[AR_SIZE_CHARS + 1]; + ar_size[AR_SIZE_CHARS] = '\0'; + + if (unlikely (parent->maximum_size - offset < sizeof (struct ar_hdr))) return fail (ELF_E_RANGE); if (parent->map_address != NULL) - memcpy (h.ar_size, parent->map_address + parent->start_offset + offset, - sizeof h.ar_size); + memcpy (ar_size, parent->map_address + parent->start_offset + offset, + AR_SIZE_CHARS); else if (unlikely (pread_retry (parent->fildes, - h.ar_size, sizeof (h.ar_size), + ar_size, AR_SIZE_CHARS, parent->start_offset + offset + offsetof (struct ar_hdr, ar_size)) -!= sizeof (h.ar_size))) +!= AR_SIZE_CHARS)) return fail (ELF_E_READ_ERROR); - offset += sizeof h; + offset += sizeof (struct ar_hdr); char *endp; - size = strtoll (h.ar_size, &endp, 10); - if (unlikely (endp == h.ar_size) + size = strtoll (ar_size, &endp, 10); + if (unlikely (endp == ar_size) || unlikely ((off_t) parent->maximum_size - offset < size)) return fail (ELF_E_INVALID_ARCHIVE); }
Re: debuginfod Credential Helper RFC
> > I think this could work for a standalone program like debuginfod-find, > but not for a library like libdebuginfod. I would rather not have to > fork and exec from libdebuginfod. > Could this functionality be made optional? Something a client could call to fork out to a credential helper, but with a notice on the tin? Or just a way to pass through credentials to libcurl, and libraries for parsing/producing the helper format? > Can't this be handled through e.g. the underlying libcurl library by > setting a proxy environment variable so the requests goes through a > local proxy that is setup to do some kind of authentication > transparently? It would be at least somewhat undesirable for any process capable of making loopback requests to gain access equivalent to any user using debuginfod on that system. A credential helper would only have the ambient authority available to the user running the debuginfod client. That being said, I'm not opposed to this idea of an authenticating proxy, so long as there's a way of scoping access to it appropriately. I'm pretty far from a UNIX/Windows networking wizard, so if you know of a reasonable way to do this, please let me know. > Or by simply defining the base DEBUGINFOD_URL with > https://user:p...@debuginfod.example.com/ ? > The specific use case we had in mind uses OAuth2 bearer tokens, not username/password pairs. This is increasingly common for the sorts of cloud hosting one might like to use for things like debug binaries. -- Daniel Thornburgh | dth...@google.com
[Bug libdw/29430] New: `dwarf_getscopes` fails after a8493c1
https://sourceware.org/bugzilla/show_bug.cgi?id=29430 Bug ID: 29430 Summary: `dwarf_getscopes` fails after a8493c1 Product: elfutils Version: unspecified Status: UNCONFIRMED Severity: normal Priority: P2 Component: libdw Assignee: unassigned at sourceware dot org Reporter: godlygeek at gmail dot com CC: elfutils-devel at sourceware dot org Target Milestone: --- Apologies, but I haven't yet succeeded in creating a self-contained reproducer for this issue. When calling `dwarf_getscopes` on a (PGO and LTO) binary (a Python interpreter built with GCC 9.3.1 against glibc 2.12, which is a relatively old glibc version), I'm seeing failures with elfutils 0.187 that I didn't see with elfutils 0.179. We were able to bisect the problem down to commit a8493c1, and we see that reverting that commit causes `dwarf_getscopes` to succeed even with elfutils 0.187 That commit is: libdw: Skip imported compiler_units in libdw_visit_scopes walking DIE tree Some gcc -flto versions imported other top-level compile units, skip those. Otherwise we'll visit various DIE trees multiple times. Note in the testcase that with newer GCC versions function foo is fully inlined and does appear only once (as declared, but not as separate subprogram). Signed-off-by: Mark Wielaard Any idea why this might have broken PC resolution for us? -- You are receiving this mail because: You are on the CC list for the bug.