Re: Writing core files to contain buildids

2020-10-28 Thread Mark Wielaard
Hi Paul,

On Wed, 2020-10-28 at 00:40 -0400, Paul Smith wrote:
> On Tue, 2020-10-27 at 23:23 +0100, Mark Wielaard wrote:
> > Do you have the generated core files somehwere so others can look
> > at
> > them? How exactly are you testing the build-id notes are there?
> 
> I don't have one to publish but I can create one.  That's a good idea
> anyway as it will be simpler to debug than the actual system which is
> much more complex.
> 
> I'm using eu-unstrip -n --core core.xxx to check for build IDs.

OK, so say you have a core file (this one produced by the kernel
crashing a local sleep process). Then eu-unstrip -n --core would show:

0x40+0x208000 ec28c24670b6b52fb99b43338e47fe2d4b6fb4ea@0x400284 - - 
/usr/bin/sleep
0x7ff037345000+0x3cd000 dd9a6ba0d81c91f5ca7dbb4a1ac58319cc26dd5a@0x7ff037345280 
- - /usr/lib64/libc-2.17.so
0x7ff037712000+0x224000 a527fe72908703c5972ae384e78d1850d1881ee7@0x7ff0377121d8 
- - /usr/lib64/ld-2.17.so
0x7ffed73bf000+0x1000 63023e5151c4d5bbdb7f6f85a6f95a6b95dc780e@0x7ffed73bf328 . 
- linux-vdso.so.1

What this means is that it found the build-id for libc-2.17.so at
address 0x7ff037345280 in the segment spanning 0x7ff037345000+0x3cd000.

So that address is very near the beginning of that segment.

eu-readelf -l on the core file will show a segment that corresponds to
that:

  LOAD   0x025000 0x7ff037345000 0x 0x001000 
0x1c2000 R E 0x1000

Note that the filesize is just the first page (0x001000) even though
the file itself is much bigger (as shown by the memsize of 0x1c2000)

You'll have to make sure your coredumper program produces a similar
segment in the core file covering the same area of libc.so to get the
build-id in there.

Hope that helps,

Mark


Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-28 Thread Mark Wielaard
Hi Rich,

On Tue, 2020-10-27 at 18:25 -0400, Rich Felker wrote:
> On Tue, Oct 27, 2020 at 11:19:11PM +0100, Mark Wielaard wrote:
> > On Tue, Oct 27, 2020 at 01:08:17PM -0400, Rich Felker wrote:
> > > They do because they're also in space, unless you want
> > > exponential-time which is huge even on small inputs, and greater than
> > > O(1) space requirement means the interface can't satisfy its contract
> > > to return a conclusive result for valid inputs.
> > 
> > But that isn't the contract if fnmatch. fnmatch returns 0 for a match
> > and non-zero for either a non-match or some error. So if your
> > algorithm hits some error case, like out of memory, returning a
> > non-zero result is fine.
> > 
> > I believe the extended wildcard pattern are widely supported and
> > useful. If you don't want to implement them because they aren't in any
> > standardized enough yet we can ask the Austin Group to add them to
> > fnmatch. They have adopted other GNU flags for fnmatch in the past.
> 
> And I can ask them not to. Your hostility is really unwelcome here.

No hostility intended at all. Please assume postive intend. I was just
pointing out what I believe are technical facts. That extended wildcard
patterns are well defined and supported in various context, how Posix
defines the fnmatch contract (which explicitly allows for error
handling) and that the Austin group has been willing to document and
specify GNU extensions to various standard functions.

I am really just trying to help some people who would like musl add
support for functionality elfutils relies on or find workarounds for
missing functionality. I realize extended wildcard support through
fnmatch is just one small part of that. There are certainly larger
issues to overcome. As far as I can see musl doesn't support argp,
obstack, fts, symbol versions and various on-stack string functions. So
there is certainly some work to do. But hopefully we can do that
without taking away any useful features from either project. I don't
believe anybody is trying to be hostile by trying to make these
projects work together. I do think it is useful to see if we can
standardize some of these glibc extensions projects are relying on.

Cheers,

Mark


Re: [PATCH] Don't use __BEGIN_DECLS macros from glibc.

2020-10-28 Thread Mark Wielaard
Hi Érico,

On Mon, 2020-10-26 at 23:24 +0100, Mark Wielaard wrote:
> Please just suggest a patch upstream to libc-al...@sourceware.org.
> We'll pick it up when they decide to accept it. Note that it is also a
> public glibc header installed as /usr/include/elf.h.

I saw your patch was accepted upstream, so I synced it and pushed the
attached.

Thanks,

Mark
From 56f64c94651f4840e890c1963f9d6f6a4123abde Mon Sep 17 00:00:00 2001
From: Mark Wielaard 
Date: Wed, 28 Oct 2020 12:36:57 +0100
Subject: [PATCH] libelf: Sync elf.h from glibc.

Makes elf.h standalone and removes __BEGIN_DECLS/__END_DECLS macros.

Signed-off-by: Mark Wielaard 
---
 libelf/ChangeLog | 4 
 libelf/elf.h | 6 --
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/libelf/ChangeLog b/libelf/ChangeLog
index a3f15883..b15508f2 100644
--- a/libelf/ChangeLog
+++ b/libelf/ChangeLog
@@ -1,3 +1,7 @@
+2020-10-28  Mark Wielaard  
+
+	* elf.h: Update from glibc.
+
 2020-08-28  Mark Wielaard  
 
 	* elf.h: Update from glibc.
diff --git a/libelf/elf.h b/libelf/elf.h
index ff9f1dad..6439c1a4 100644
--- a/libelf/elf.h
+++ b/libelf/elf.h
@@ -19,10 +19,6 @@
 #ifndef _ELF_H
 #define	_ELF_H 1
 
-#include 
-
-__BEGIN_DECLS
-
 /* Standard ELF types.  */
 
 #include 
@@ -4105,6 +4101,4 @@ enum
 #define R_ARC_TLS_LE_S9		0x4a
 #define R_ARC_TLS_LE_32		0x4b
 
-__END_DECLS
-
 #endif	/* elf.h */
-- 
2.18.4



Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-28 Thread Rich Felker
On Wed, Oct 28, 2020 at 11:05:58AM +0100, Mark Wielaard wrote:
> Hi Rich,
> 
> On Tue, 2020-10-27 at 18:25 -0400, Rich Felker wrote:
> > On Tue, Oct 27, 2020 at 11:19:11PM +0100, Mark Wielaard wrote:
> > > On Tue, Oct 27, 2020 at 01:08:17PM -0400, Rich Felker wrote:
> > > > They do because they're also in space, unless you want
> > > > exponential-time which is huge even on small inputs, and greater than
> > > > O(1) space requirement means the interface can't satisfy its contract
> > > > to return a conclusive result for valid inputs.
> > > 
> > > But that isn't the contract if fnmatch. fnmatch returns 0 for a match
> > > and non-zero for either a non-match or some error. So if your
> > > algorithm hits some error case, like out of memory, returning a
> > > non-zero result is fine.
> > > 
> > > I believe the extended wildcard pattern are widely supported and
> > > useful. If you don't want to implement them because they aren't in any
> > > standardized enough yet we can ask the Austin Group to add them to
> > > fnmatch. They have adopted other GNU flags for fnmatch in the past.
> > 
> > And I can ask them not to. Your hostility is really unwelcome here.
> 
> No hostility intended at all. Please assume postive intend. I was just

"Assume positive intent" is a hostile policy and citing it does not
make you sound better.

> pointing out what I believe are technical facts. That extended wildcard
> patterns are well defined and supported in various context, how Posix

They are not supported by anything near a majority of implementations.
I could not find them anywhere but glibc. The BSDs do not have them
and are probably not interested in them.

> defines the fnmatch contract (which explicitly allows for error
> handling) and that the Austin group has been willing to document and
> specify GNU extensions to various standard functions.

>From a QoI standpoint even if it's not a hard requirement, error
handling is for erroneous expression forms, not "the implementation
failed to do what it was asked to do".

> I am really just trying to help some people who would like musl add
> support for functionality elfutils relies on or find workarounds for
> missing functionality. I realize extended wildcard support through

GNU projects already have an official way to do this: gnulib. If you
want to use glibc extensions of standard functions, you include gnulib
in your project and have the configure script enable replacements for
whichever functions don't have the glibc extension you want on the
host version of the function.

> fnmatch is just one small part of that. There are certainly larger
> issues to overcome. As far as I can see musl doesn't support argp,
> obstack, fts, symbol versions and various on-stack string functions. So

Also all gnulib. The underlying problem here seems to be that elfutils
wants to operate as a GNU project using GNU interfaces, but doesn't
want to ship gnulib to make that portable.

> there is certainly some work to do. But hopefully we can do that
> without taking away any useful features from either project. I don't
> believe anybody is trying to be hostile by trying to make these
> projects work together. I do think it is useful to see if we can
> standardize some of these glibc extensions projects are relying on.

It is not useful to attempt to standardize functions used by a tiny
minority of software, pretty much all from a single "vendor" (GNU).
For the past 5+ years there have been multiple musl-based
distributions shipping thousands if not tens of thousands of packages.
An explicit goal of musl is and always has been making software more
portable by drawing attention to use of non-standard, GNU-specific
functionality.

We have (poorly documented, at the moment, but documented in the
mailing list a long time ago and widely understood) criteria for
inclusion or exclusion of nonstandard interfaces. In order to be
appropriate for inclusion, they need to be either in widespread use or
use in at least some nontrivial amount of important software that
can't be modified to achieve the same thing portably (e.g. with a
drop-in replacement). They also should not have complex interactions
with existing standard functionality that add new implementation
constraints, or be unjustifiably resource-costly. Ideally there should
be precedent on multiple historical systems; if not, they at least
should not have conflicting historical definitions on different
systems.

My personal position on advocating for inclusion of new functionality
in the standard, and I believe this is shared by many in our
community, is that it should follow similar guidelines.
Consensus-based standards processes are about finding common ground in
existing implementations, possibly making minor changes or leaving
minor details unspecified in the process, for the sake of establishing
what programmers can reasonably rely on and maintaining participation
and faith in the process by implementors. "We'll just try to use the
standard

[PATCH v2] Fix leb128 reading

2020-10-28 Thread Tom Tromey
PR 26773 points out that some sleb128 values are decoded incorrectly.

This version of the fix only examines the sleb128 conversion.
Overlong encodings are not handled, and the uleb128 decoders are not
touched.  The approach taken here is to do the work in an unsigned
type, and then rely on an implementation-defined cast to convert to
signed.

Signed-off-by: Tom Tromey 
---
 .gitignore|   1 +
 ChangeLog |   4 ++
 libdw/ChangeLog   |  10 +++
 libdw/dwarf_getlocation.c |   5 +-
 libdw/memory-access.h |  42 ++--
 tests/ChangeLog   |   7 ++
 tests/Makefile.am |   6 +-
 tests/leb128.c| 140 ++
 8 files changed, 205 insertions(+), 10 deletions(-)
 create mode 100644 tests/leb128.c

diff --git a/.gitignore b/.gitignore
index c9790941..d737b14d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -151,6 +151,7 @@ Makefile.in
 /tests/get-units-invalid
 /tests/get-units-split
 /tests/hash
+/tests/leb128
 /tests/line2addr
 /tests/low_high_pc
 /tests/msg_tst
diff --git a/ChangeLog b/ChangeLog
index 72e8397c..4c8699f8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2020-10-28  Tom Tromey  
+
+   * .gitignore: Add /tests/leb128.
+
 2020-10-01  Frank Ch. Eigler  
 
PR25461
diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 731c7e79..289bb4c9 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,13 @@
+2020-10-28  Tom Tromey  
+
+   PR26773
+   * dwarf_getlocation.c (store_implicit_value): Use
+   __libdw_get_uleb128_unchecked.
+   * memory-access.h (get_sleb128_step): Assume unsigned type for
+   'var'.
+   (__libdw_get_sleb128, __libdw_get_sleb128_unchecked): Work in
+   unsigned type.  Handle final byte.
+
 2020-10-19  Mark Wielaard  
 
* dwarf_frame_register.c (dwarf_frame_register): Declare ops_mem
diff --git a/libdw/dwarf_getlocation.c b/libdw/dwarf_getlocation.c
index 4617f9e9..f2bad5a9 100644
--- a/libdw/dwarf_getlocation.c
+++ b/libdw/dwarf_getlocation.c
@@ -130,9 +130,8 @@ store_implicit_value (Dwarf *dbg, void **cache, Dwarf_Op 
*op)
   struct loc_block_s *block = libdw_alloc (dbg, struct loc_block_s,
   sizeof (struct loc_block_s), 1);
   const unsigned char *data = (const unsigned char *) (uintptr_t) op->number2;
-  uint64_t len = __libdw_get_uleb128 (&data, data + len_leb128 (Dwarf_Word));
-  if (unlikely (len != op->number))
-return -1;
+  /* Skip the block length.  */
+  __libdw_get_uleb128_unchecked (&data);
   block->addr = op;
   block->data = (unsigned char *) data;
   block->length = op->number;
diff --git a/libdw/memory-access.h b/libdw/memory-access.h
index 14436a71..8b2386ee 100644
--- a/libdw/memory-access.h
+++ b/libdw/memory-access.h
@@ -113,19 +113,22 @@ __libdw_get_uleb128_unchecked (const unsigned char 
**addrp)
 #define get_sleb128_step(var, addr, nth) \
   do {   \
 unsigned char __b = *(addr)++;   \
+(var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7); \
 if (likely ((__b & 0x80) == 0))  \
   {
  \
-   struct { signed int i:7; } __s = { .i = __b };\
-   (var) |= (typeof (var)) __s.i * ((typeof (var)) 1 << ((nth) * 7));\
+   if ((__b & 0x40) != 0)\
+ (var) |= - ((typeof (var)) 1 << (((nth) + 1) * 7)); \
return (var); \
   }
  \
-(var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7); \
   } while (0)
 
 static inline int64_t
 __libdw_get_sleb128 (const unsigned char **addrp, const unsigned char *end)
 {
-  int64_t acc = 0;
+  /* Do the work in an unsigned type, but use implementation-defined
+ behavior to cast to signed on return.  This avoids some undefined
+ behavior when shifting.  */
+  uint64_t acc = 0;
 
   /* Unroll the first step to help the compiler optimize
  for the common single-byte case.  */
@@ -134,6 +137,20 @@ __libdw_get_sleb128 (const unsigned char **addrp, const 
unsigned char *end)
   const size_t max = __libdw_max_len_sleb128 (*addrp - 1, end);
   for (size_t i = 1; i < max; ++i)
 get_sleb128_step (acc, *addrp, i);
+  if (*addrp == end)
+return INT64_MAX;
+
+  /* There might be one extra byte.  */
+  unsigned char b = **addrp;
+  ++*addrp;
+  if (likely ((b & 0x80) == 0))
+{
+  /* We only need the low bit of the final byte, and as it is the
+sign bit, we don't need to do anything else here.  */
+  acc |= ((typeof (acc)) b) << 7 * max;
+  return acc;
+