[PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand

2022-07-28 Thread Mark Wielaard
With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and
-D_FORTIFY_SOURCE=3 we get the following error message:

In file included from /usr/include/ar.h:22,
 from ../libelf/libelfP.h:33,
 from core-file.c:31:
In function ‘pread’,
inlined from ‘pread_retry’ at ../lib/system.h:188:21,
inlined from ‘elf_begin_rand’ at core-file.c:86:16,
inlined from ‘core_file_read_eagerly’ at core-file.c:205:15:
/usr/include/bits/unistd.h:74:10: error: ‘__pread_alias’ writing 58 or more 
bytes into a region of size 10 overflows the destination 
[-Werror=stringop-overflow=]
   74 |   return __glibc_fortify (pread, __nbytes, sizeof (char),
  |  ^~~
/usr/include/ar.h: In function ‘core_file_read_eagerly’:
/usr/include/ar.h:41:10: note: destination object ‘ar_size’ of size 10
   41 | char ar_size[10];   /* File size, in ASCII decimal.  */
  |  ^~~
/usr/include/bits/unistd.h:50:16: note: in a call to function ‘__pread_alias’ 
declared with attribute ‘access (write_only, 2, 3)’
   50 | extern ssize_t __REDIRECT (__pread_alias,
  |^~
cc1: all warnings being treated as errors

The warning disappears when dropping either -fsanitize=undefined
or when using -D_FORTIFY_SOURCE=2. It looks like a false positive.
But I haven't figured out how/why it happens.

The code is a little tricky to proof correct though. The ar_size
field is a not-zero terminated string ASCII decimal, right-paddedr
with spaces. Which is then converted with strtoll. Relying on the
fact that the struct ar_hdr is zero initialized, so there will be
a zero byte after the ar_size field.

Rewrite the code to just use a zero byte terminated char array.
Which is much easier to reason about. As a bonus the error disappears.

Signed-off-by: Mark Wielaard 
---
 libdwfl/ChangeLog   |  5 +
 libdwfl/core-file.c | 26 --
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog
index 75c53948..acdaa013 100644
--- a/libdwfl/ChangeLog
+++ b/libdwfl/ChangeLog
@@ -1,3 +1,8 @@
+2022-07-28  Mark Wielaard  
+
+   * core-file.c (elf_begin_rand): Replace struct ar_hdr h with
+   a char ar_size[AR_SIZE_CHARS + 1] array to read size.
+
 2022-07-18  Shahab Vahedi  
 
* debuginfod-client.c (dwfl_get_debuginfod_client stub):
diff --git a/libdwfl/core-file.c b/libdwfl/core-file.c
index cefc3db0..4418ef33 100644
--- a/libdwfl/core-file.c
+++ b/libdwfl/core-file.c
@@ -75,26 +75,32 @@ elf_begin_rand (Elf *parent, off_t offset, off_t size, 
off_t *next)
  from the archive header to override SIZE.  */
   if (parent->kind == ELF_K_AR)
 {
-  struct ar_hdr h = { .ar_size = "" };
-
-  if (unlikely (parent->maximum_size - offset < sizeof h))
+  /* File size, in ASCII decimal, right-padded with ASCII spaces.
+ Max 10 characters. Not zero terminated. So make this ar_size
+ array one larger and explicitly zero terminate it.  As needed
+ for strtoll.  */
+  #define AR_SIZE_CHARS 10
+  char ar_size[AR_SIZE_CHARS + 1];
+  ar_size[AR_SIZE_CHARS] = '\0';
+
+  if (unlikely (parent->maximum_size - offset < sizeof (struct ar_hdr)))
return fail (ELF_E_RANGE);
 
   if (parent->map_address != NULL)
-   memcpy (h.ar_size, parent->map_address + parent->start_offset + offset,
-   sizeof h.ar_size);
+   memcpy (ar_size, parent->map_address + parent->start_offset + offset,
+   AR_SIZE_CHARS);
   else if (unlikely (pread_retry (parent->fildes,
- h.ar_size, sizeof (h.ar_size),
+ ar_size, AR_SIZE_CHARS,
  parent->start_offset + offset
  + offsetof (struct ar_hdr, ar_size))
-!= sizeof (h.ar_size)))
+!= AR_SIZE_CHARS))
return fail (ELF_E_READ_ERROR);
 
-  offset += sizeof h;
+  offset += sizeof (struct ar_hdr);
 
   char *endp;
-  size = strtoll (h.ar_size, &endp, 10);
-  if (unlikely (endp == h.ar_size)
+  size = strtoll (ar_size, &endp, 10);
+  if (unlikely (endp == ar_size)
  || unlikely ((off_t) parent->maximum_size - offset < size))
return fail (ELF_E_INVALID_ARCHIVE);
 }
-- 
2.18.4



Re: [PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand

2022-07-28 Thread Mark Wielaard
On Thu, 2022-07-28 at 15:48 +0200, Mark Wielaard wrote:
> With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and
> -D_FORTIFY_SOURCE=3 we get the following error message:

Sorry for the typo, it is glibc 2.35. Basically an up to date Fedora 36
system (replicated on x86_64, ppc64le and s390x).


buildbot users try branch builders for elfutils

2022-07-28 Thread Mark Wielaard
Hi,

I setup git users try branches for elfutils. If you have commit
access to elfutils.git you can now push to a users//try-xxx
branch and the buildbot will do builds and sent you (the commit
author) email about the results. The builds are also visible at:
https://builder.sourceware.org/buildbot/#/builders?tags=elfutils-try

My workflow to use this looks like:

- git checkout -b ar_size
- hack, hack, hack... OK, looks good to submit
- git commit -a -m "Now who got an ar_size?"
- git push origin ar_size:users/mark/try-ar_size
- ... wait for the emails to come in or watch buildbot logs ...
- Send in patches and mention what the try bot reported
  (OK, I sent in the patch first, but you can reply to your
   original patch submission or in bugzilla what the try
   results were)

After your patches have been accepted you can delete the branch again:
- git push origin :users/mark/try-ar_size

Please use this only for patches you intend to submit and think are
ready but want to double check. Use the GCC Compile Farm machines if
you need to hack edit/compile/debug on a specific architecture.
https://cfarm.tetaneutral.net/

But if you really need access to one of the buildbot machines for
investigating an issue, please contact build...@sourceware.org.

The try builders also upload logs to the bunsendb. And so are
visible here https://builder.sourceware.org/testruns/

See also the bunsen upload step in the try-build to get an URL
to the bunsen result for that specific build.

Please sent feedback (bad or good) to build...@sourceware.org

Cheers,

Mark


Re: debuginfod Credential Helper RFC

2022-07-28 Thread Mark Wielaard
Hi Daniel,

On Tue, 2022-07-26 at 15:50 -0700, Daniel Thornburgh via Elfutils-devel 
wrote:
> I'm working on a use case for debuginfod (in LLVM) that needs a
> solution
> for authentication and authorization of users when accessing source and
> debug information. I've put together a short RFC for how this might work,
> based on how git and Docker CLIs handle credentials. It should be fairly
> straightforward to implement and to generalize to new credential types.
> 
> Please take a look; it'd be good to have a consensus on how this should
> work across interested debuginfod implementations before moving forward
> towards implementation.

I think this could work for a standalone program like debuginfod-find,
but not for a library like libdebuginfod. I would rather not have to
fork and exec from libdebuginfod. We don't really know in what state
the program is and forking a big process is not cheap. The process
might be watching its own children (like when libdebuginfod is used in
a debugger or profiler) and suddenly get unexpected sigchilds or pids
from wait.

Can't this be handled through e.g. the underlying libcurl library by
setting a proxy environment variable so the requests goes through a
local proxy that is setup to do some kind of authentication
transparently? Or by simply defining the base DEBUGINFOD_URL with 
https://user:p...@debuginfod.example.com/ ?

Thanks,

Mark



Re: runtime validation of DT_SYMTAB lookups - why is there no DT_SYMSZ?

2022-07-28 Thread Mark Wielaard
Hi Milian,

On Wed, 2022-07-27 at 13:38 +0200, Milian Wolff wrote:
> Thanks for confirming that this isn't available currently. Would it
> be 
> possible to add this? What's the process for standardization here? I guess it 
> would take a very long time, yet this seems to me as if it would be 
> beneficial 
> in the long term.

Standardization of the ELF gabi takes place on (sorry google groups, I
know, sigh): https://groups.google.com/g/generic-abi you should be able
to subscribe with generic-abi+subscr...@googlegroups.com so you don't
have to go through the webgui mess.

There is also https://sourceware.org/gnu-gabi/ but that is more for GNU
extensions and I think you want something generic.

> > Di Chen recently
> > (or actually not that recently, I just still haven't reviewed,
> > sorry!)
> > posted a patch for
> > https://sourceware.org/bugzilla/show_bug.cgi?id=28873 to print out
> > the
> > symbols from the dynamic segment
> > https://sourceware.org/pipermail/elfutils-devel/2022q2/005086.html
> 
> Interesting. But from what I can tell, this patch has access to the
> full Elf 
> object and thus can access segments which are not normally loaded at
> runtime?

Yes it could, but it doesn't use anything that isn't referenced from
the phdrs or dynamic segment, so it only uses those parts that are
normally loaded at runtime. If you go through the dynamic segment then
everything it references (.dynsym in this case) is from a loaded
segment. So going through phdrs to check where it is loaded and the
length is fine.

> Try with eu-elflint --gnu which suppresses some known issues.
> 
> Indeed, with `--gnu` the tool reports `No errors`.
> 
> > Also could you show those symbol values (1272, 3684, 25720, 27227)
> > they
> > might have a special type, so their st_value isn't really an
> > address?
> 
> ```
> $ eu-readelf -s libQt5Qml.so.5.12.0 | grep -E
> "^\s*(1272|3684|25720|27227):"
>  1272: 003f9974  0 NOTYPE  GLOBAL DEFAULT   25
> __bss_start__@@Qt_5
>  3684: 003f9974  0 NOTYPE  GLOBAL DEFAULT   25
> __bss_start@@Qt_5
>  1272: 003ccc4c  0 NOTYPE  LOCAL  DEFAULT   17 $d
>  3684: 003cbfec  0 NOTYPE  LOCAL  DEFAULT   17 $d
> 25720: 003f9974  0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
> 27227: 003f9974  0 NOTYPE  GLOBAL DEFAULT   25 __bss_start__
> ```
> 
> The first two matches come from the `.dynsym`, the last four come
> from 
> `.symtab`.
> 
> Can anyone tell me how `eu-readelf` resolves these symbol names?

Currently through the section tables, which point to the string table
section used. But Di Chen's patch would change that by going through
the dynamic segment and phdrs to find the strtab for the dynsym segment
(but will of course still need to go through the sections for the
.symtab symbols since those aren't accessible through the phdrs).

Cheers,

Mark


Re: [PATCH] libdwfl: Rewrite reading of ar_size in elf_begin_rand

2022-07-28 Thread Siddhesh Poyarekar

On 2022-07-28 09:48, Mark Wielaard wrote:

With GCC 12.1.1, glibc 2.3a, -fsanitize=undefined and
-D_FORTIFY_SOURCE=3 we get the following error message:

In file included from /usr/include/ar.h:22,
  from ../libelf/libelfP.h:33,
  from core-file.c:31:
In function ‘pread’,
 inlined from ‘pread_retry’ at ../lib/system.h:188:21,
 inlined from ‘elf_begin_rand’ at core-file.c:86:16,
 inlined from ‘core_file_read_eagerly’ at core-file.c:205:15:
/usr/include/bits/unistd.h:74:10: error: ‘__pread_alias’ writing 58 or more 
bytes into a region of size 10 overflows the destination 
[-Werror=stringop-overflow=]
74 |   return __glibc_fortify (pread, __nbytes, sizeof (char),
   |  ^~~
/usr/include/ar.h: In function ‘core_file_read_eagerly’:
/usr/include/ar.h:41:10: note: destination object ‘ar_size’ of size 10
41 | char ar_size[10];   /* File size, in ASCII decimal.  */
   |  ^~~
/usr/include/bits/unistd.h:50:16: note: in a call to function ‘__pread_alias’ 
declared with attribute ‘access (write_only, 2, 3)’
50 | extern ssize_t __REDIRECT (__pread_alias,
   |^~
cc1: all warnings being treated as errors

The warning disappears when dropping either -fsanitize=undefined
or when using -D_FORTIFY_SOURCE=2. It looks like a false positive.
But I haven't figured out how/why it happens.


Interesting, I'll take a closer look at this from the gcc context.  I 
obviously don't have any strong opinions about the elfutils patch :)


Thanks,
Sid


The code is a little tricky to proof correct though. The ar_size
field is a not-zero terminated string ASCII decimal, right-paddedr
with spaces. Which is then converted with strtoll. Relying on the
fact that the struct ar_hdr is zero initialized, so there will be
a zero byte after the ar_size field.

Rewrite the code to just use a zero byte terminated char array.
Which is much easier to reason about. As a bonus the error disappears.

Signed-off-by: Mark Wielaard 
---
  libdwfl/ChangeLog   |  5 +
  libdwfl/core-file.c | 26 --
  2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog
index 75c53948..acdaa013 100644
--- a/libdwfl/ChangeLog
+++ b/libdwfl/ChangeLog
@@ -1,3 +1,8 @@
+2022-07-28  Mark Wielaard  
+
+   * core-file.c (elf_begin_rand): Replace struct ar_hdr h with
+   a char ar_size[AR_SIZE_CHARS + 1] array to read size.
+
  2022-07-18  Shahab Vahedi  
  
  	* debuginfod-client.c (dwfl_get_debuginfod_client stub):

diff --git a/libdwfl/core-file.c b/libdwfl/core-file.c
index cefc3db0..4418ef33 100644
--- a/libdwfl/core-file.c
+++ b/libdwfl/core-file.c
@@ -75,26 +75,32 @@ elf_begin_rand (Elf *parent, off_t offset, off_t size, 
off_t *next)
   from the archive header to override SIZE.  */
if (parent->kind == ELF_K_AR)
  {
-  struct ar_hdr h = { .ar_size = "" };
-
-  if (unlikely (parent->maximum_size - offset < sizeof h))
+  /* File size, in ASCII decimal, right-padded with ASCII spaces.
+ Max 10 characters. Not zero terminated. So make this ar_size
+ array one larger and explicitly zero terminate it.  As needed
+ for strtoll.  */
+  #define AR_SIZE_CHARS 10
+  char ar_size[AR_SIZE_CHARS + 1];
+  ar_size[AR_SIZE_CHARS] = '\0';
+
+  if (unlikely (parent->maximum_size - offset < sizeof (struct ar_hdr)))
return fail (ELF_E_RANGE);
  
if (parent->map_address != NULL)

-   memcpy (h.ar_size, parent->map_address + parent->start_offset + offset,
-   sizeof h.ar_size);
+   memcpy (ar_size, parent->map_address + parent->start_offset + offset,
+   AR_SIZE_CHARS);
else if (unlikely (pread_retry (parent->fildes,
- h.ar_size, sizeof (h.ar_size),
+ ar_size, AR_SIZE_CHARS,
  parent->start_offset + offset
  + offsetof (struct ar_hdr, ar_size))
-!= sizeof (h.ar_size)))
+!= AR_SIZE_CHARS))
return fail (ELF_E_READ_ERROR);
  
-  offset += sizeof h;

+  offset += sizeof (struct ar_hdr);
  
char *endp;

-  size = strtoll (h.ar_size, &endp, 10);
-  if (unlikely (endp == h.ar_size)
+  size = strtoll (ar_size, &endp, 10);
+  if (unlikely (endp == ar_size)
  || unlikely ((off_t) parent->maximum_size - offset < size))
return fail (ELF_E_INVALID_ARCHIVE);
  }


Re: debuginfod Credential Helper RFC

2022-07-28 Thread Daniel Thornburgh via Elfutils-devel
>
> I think this could work for a standalone program like debuginfod-find,
> but not for a library like libdebuginfod. I would rather not have to
> fork and exec from libdebuginfod.
>
Could this functionality be made optional? Something a client could call to
fork out to a credential helper, but with a notice on the tin? Or just a
way to pass through credentials to libcurl, and libraries for
parsing/producing the helper format?

>

Can't this be handled through e.g. the underlying libcurl library by
> setting a proxy environment variable so the requests goes through a
> local proxy that is setup to do some kind of authentication
> transparently?

It would be at least somewhat undesirable for any process capable of making
loopback requests to gain access equivalent to any user using debuginfod on
that system. A credential helper would only have the ambient authority
available to the user running the debuginfod client.

That being said, I'm not opposed to this idea of an authenticating proxy,
so long as there's a way of scoping access to it appropriately. I'm pretty
far from a UNIX/Windows networking wizard, so if you know of a reasonable
way to do this, please let me know.


> Or by simply defining the base DEBUGINFOD_URL with
> https://user:p...@debuginfod.example.com/ ?
>
The specific use case we had in mind uses OAuth2 bearer tokens, not
username/password pairs. This is increasingly common for the sorts of cloud
hosting one might like to use for things like debug binaries.

-- 

Daniel Thornburgh | dth...@google.com


[Bug libdw/29430] New: `dwarf_getscopes` fails after a8493c1

2022-07-28 Thread godlygeek at gmail dot com via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29430

Bug ID: 29430
   Summary: `dwarf_getscopes` fails after a8493c1
   Product: elfutils
   Version: unspecified
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: libdw
  Assignee: unassigned at sourceware dot org
  Reporter: godlygeek at gmail dot com
CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

Apologies, but I haven't yet succeeded in creating a self-contained reproducer
for this issue.

When calling `dwarf_getscopes` on a (PGO and LTO) binary (a Python interpreter
built with GCC 9.3.1 against glibc 2.12, which is a relatively old glibc
version), I'm seeing failures with elfutils 0.187 that I didn't see with
elfutils 0.179. We were able to bisect the problem down to commit a8493c1, and
we see that reverting that commit causes `dwarf_getscopes` to succeed even with
elfutils 0.187

That commit is:

libdw: Skip imported compiler_units in libdw_visit_scopes walking DIE tree

Some gcc -flto versions imported other top-level compile units,
skip those. Otherwise we'll visit various DIE trees multiple times.

Note in the testcase that with newer GCC versions function foo is
fully inlined and does appear only once (as declared, but not as
separate subprogram).

Signed-off-by: Mark Wielaard 

Any idea why this might have broken PC resolution for us?

-- 
You are receiving this mail because:
You are on the CC list for the bug.