Re: [PATCH] Add replacement endian.h and byteswap.h to libgnu

2017-11-16 Thread Florian Weimer
* Ulf Hermann:

> Some systems don't provide endian.h and byteswap.h. The required
> functions are trivial to define using sys/param.h and gcc builtins,
> though.
>
> Also, include endian.h in dwelf_scn_gnu_compressed_size.c as that uses
> be64toh().

This is still an issue with non-glibc, non-BSD compilation.  The patch
is not ideal, as it depends on a GCC extension, but it's an
improvement for those who use GCC on a platform which does not provide
these functions as part of the C library.


Re: [PATCH] Add replacement endian.h and byteswap.h to libgnu

2017-11-16 Thread Florian Weimer
* Ulf Hermann:

>>> Some systems don't provide endian.h and byteswap.h. The required
>>> functions are trivial to define using sys/param.h and gcc builtins,
>>> though.
>>>
>>> Also, include endian.h in dwelf_scn_gnu_compressed_size.c as that uses
>>> be64toh().
>> 
>> This is still an issue with non-glibc, non-BSD compilation.  The patch
>> is not ideal, as it depends on a GCC extension, but it's an
>> improvement for those who use GCC on a platform which does not provide
>> these functions as part of the C library.
>
> Well, there are a lot of other issues to be fixed if you want to build
> elfutils on anything non-gcc. We could add a further check for those
> builtins and sys/param.h and then add a somewhat less trivial version
> of endian.h and byteswap.h if they are missing. But that would likely
> involve other compiler extensions or OS-specific headers.

Ah, fair enough.  I mainly wanted to increase awareness of the patch
because it has not been merged yet.


Re: Elfutils 0.176 and riscv

2019-02-17 Thread Florian Weimer
* Mark Wielaard:

> On Sat, Feb 16, 2019 at 06:09:06PM +0100, Kurt Roeckx wrote:
>> On Debian's riscv64 port, this is a log of the build:
>> https://buildd.debian.org/status/fetch.php?pkg=elfutils&arch=riscv64&ver=0.176-1&stamp=1550335976&raw=0
>
> You are missing .eh_frame CFI. Without that backtraces
> will be unreliable. Make sure that gcc for riscv64 has
> -fasynchronous-unwind-tables on by default and these
> failures should disappear. It is the default on Fedora
> riscv64 and there we see zero-fail.

And keep in mind that you have to patch GCC for
-fasynchronous-unwind-tables on other architectures, too.  Unwinding
information is not optional on GNU.

Thanks,
Florian


[PATCH] elfclassify tool

2019-04-12 Thread Florian Weimer
This patch adds an elfclassify tool, mainly for the benefit of RPM's
find-debuginfo.sh.

I still need to implement an --unstripped option and fix the iteration
over the dynamic section.

Suggestions for improving the argp/help output are welcome as well.  I'm
not familiar with argp at all.

I'm keeping a branch with these changes here:

  

Thanks,
Florian

diff --git a/src/Makefile.am b/src/Makefile.am
index 2b1c0dcb..966d1da7 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -26,7 +26,8 @@ AM_CPPFLAGS += -I$(srcdir)/../libelf -I$(srcdir)/../libebl \
 AM_LDFLAGS = -Wl,-rpath-link,../libelf:../libdw
 
 bin_PROGRAMS = readelf nm size strip elflint findtextrel addr2line \
-  elfcmp objdump ranlib strings ar unstrip stack elfcompress
+  elfcmp objdump ranlib strings ar unstrip stack elfcompress \
+  elfclassify
 
 noinst_LIBRARIES = libar.a
 
@@ -83,6 +84,7 @@ ar_LDADD = libar.a $(libelf) $(libeu) $(argp_LDADD)
 unstrip_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD) -ldl
 stack_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD) -ldl 
$(demanglelib)
 elfcompress_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD)
+elfclassify_LDADD = $(libelf) $(libeu) $(argp_LDADD)
 
 installcheck-binPROGRAMS: $(bin_PROGRAMS)
bad=0; pid=; list="$(bin_PROGRAMS)"; for p in $$list; do \
diff --git a/src/elfclassify.c b/src/elfclassify.c
new file mode 100644
index ..ead3260b
--- /dev/null
+++ b/src/elfclassify.c
@@ -0,0 +1,387 @@
+/* Classification of ELF files.
+   Copyright (C) 2019 Red Hat, Inc.
+   This file is part of elfutils.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see .  */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include ELFUTILS_HEADER(elf)
+#include "printversion.h"
+
+/* Name and version of program.  */
+ARGP_PROGRAM_VERSION_HOOK_DEF = print_version;
+
+/* Bug report address.  */
+ARGP_PROGRAM_BUG_ADDRESS_DEF = PACKAGE_BUGREPORT;
+
+enum classify_command
+{
+  classify_file = 1000,
+  classify_elf,
+  classify_executable,
+  classify_shared,
+  classify_loadable
+};
+
+/* Set by parse_opt.  */
+static enum classify_command command;
+static const char *command_path;
+static int verbose;
+
+/* Set by map_file.  */
+static int file_fd = -1;
+
+static void
+open_file (void)
+{
+  if (verbose > 1)
+fprintf (stderr, "debug: processing file: %s\n", command_path);
+
+  file_fd = open (command_path, O_RDONLY);
+  if (file_fd < 0)
+{
+  if (errno == ENOENT)
+exit (1);
+  else
+error (2, errno, N_("opening %s"), command_path);
+}
+  struct stat st;
+  if (fstat (file_fd, &st) != 0)
+error (2, errno, N_("reading %s\n"), command_path);
+  if (!S_ISREG (st.st_mode))
+exit (1);
+}
+
+/* Set by open_elf.  */
+static Elf *elf;
+
+static void
+open_elf (void)
+{
+  open_file ();
+  elf = elf_begin (file_fd, ELF_C_READ, NULL);
+  if (elf == NULL)
+error (2, 0, "%s: %s", command_path, elf_errmsg (-1));
+  if (elf_kind (elf) != ELF_K_ELF && elf_kind (elf) != ELF_K_AR)
+exit (1);
+}
+
+static int elf_type;
+static bool has_program_interpreter;
+static bool has_dynamic;
+static bool has_soname;
+static bool has_pie_flag;
+static bool has_dt_debug;
+
+static void
+run_classify (void)
+{
+  if (elf_kind (elf) != ELF_K_ELF)
+return;
+
+  GElf_Ehdr ehdr_storage;
+  GElf_Ehdr *ehdr = gelf_getehdr (elf, &ehdr_storage);
+  if (ehdr == NULL)
+exit (1);
+  elf_type = ehdr->e_type;
+
+  /* Examine program headers.  */
+  {
+size_t nphdrs;
+if (elf_getphdrnum (elf, &nphdrs) != 0)
+  error (2, 0, "%s: program header: %s", command_path, elf_errmsg (-1));
+if (nphdrs > INT_MAX)
+  error (2, 0, "%s: number of program headers is too large: %zu",
+ command_path, nphdrs);
+for (size_t phdr_idx = 0; phdr_idx < nphdrs; ++phdr_idx)
+  {
+GElf_Phdr phdr_storage;
+GElf_Phdr *phdr = gelf_getphdr (elf, phdr_idx, &phdr_storage);
+if (phdr == NULL)
+  error (2, 0, "%s: %s", command_path, elf_errmsg (-1));
+if (phdr->p_type == PT_DYNAMIC)
+  has_dynamic = true;
+if (phdr->p_type == PT_INTERP)
+  has_program_interpreter = true;
+  }
+  }
+
+  /* Examine the dynamic section.  */
+  if

Re: [PATCH] elfclassify tool

2019-04-16 Thread Florian Weimer
* Mark Wielaard:

>> I still need to implement an --unstripped option and fix the
>> iteration over the dynamic section.
>
> We did already discuss some of this off-list.

Thanks for summarizing the previous discussion.

> --elf PATH return 0 whenever the file can be opened and a minimal ELF
> header can be read (it might not be a completely valid ELF file). Do we
> want or need to do any more verification (e.g. try to get the full ELF
> header, walk through all phdrs and shdrs)?

If we ever need that, I think we should add it as separate options,
detecting both separate debuginfo and regular ELF files.

> --unstripped (not yet implemented) would be a classification that
> indicates whether the ELF file can be stripped (further), that is has a
> .symtab (symbol table), .debug_* sections (and possibly any non-
> loadable sections -- "file" only detects the first two).

Some non-allocated sections are expected in stripped binaries:
.gnu_debuglink, .shstrtab, .gnu.build.attributes look relevant in this
context.  I'm not sure if we should flag any other non-allocated section
in this way.

> I am not sure --file=PATH is a useful option.

It's useful for determining if the file exists and can be mapped.

> But maybe we need some way to indicate whether a file is a real file or
> a symlink? But the current implementation returns 0 even for symlinks.
> As do every other option (if the file is a symlink to an ELF file of
> the requested classification). Is this what we want? I would suggest
> that we return 1 for anything that is not a regular file. But that
> would mean that for example eu-elfclassify --executable=/proc/$$/exe
> would also return 1 (currently it returns 0, which might be helpful in
> some cases).

I don't know what RPM needs in this context.  I expect that it can
easily filter out non-regular files.  My problem with symbolic link
detection is that it is so inconsistent—it generally applies to the
final pathname component, and that does not look useful to me.

> --loadable basically checks whether the given ELF file is not an object
> (ET_REL) file, so it will return 0 for either an executable, a shared
> object or core file, but not check whether any other attribute (like
> whether it has program headers and/or loadable segments). Personally I
> would like it if this at least included a check for a PT_LOAD segment.

Is a PT_LOAD segment required to make the PT_DYNAMIC segment visible?
It is possible to have mostly empty objects, after all.

> This does not classify kernel modules as loadable objects.
> rpm does contain a check for that, it might make sense to include that
> as a separate classification in elfclassify --kernel-module.
>
> Kernel modules are also special because they can be compressed ELF
> files. Do we want to support that? (It is easy with gelf_elf_begin).
> That could for example be an flag/option like --compressed which can be
> combined with any other classification option?

How relevant are kernel modules to eu-elfclassify?

Is path-based detection feasible for kernel modules?

> I think another useful classification would be --debugfile which
> succeeds if the primary function of the given ELF file is being a
> separete debug file (basically .debug, .dwo or dwz .multi file) which
> cannot be linked and loaded on its own

That is very difficult to detect reliably, unfortunately, and would best
be implemented in lib(g)elf itself because it would be generally useful,
for all kinds of tools which expect to process real ELF files only.

> BTW. Florian, the extra options are certainly not required for you to
> implement to get eu-elfclassify accepted. They are just suggestions,
> which we might decide not to do/add. Or they can be added by others if
> they think they are useful.

Understood.  I would rather fix the command line syntax as a priority,
implement --unstripped, and add a test suite.

>> Suggestions for improving the argp/help output are welcome as
>> well.  I'm not familiar with argp at all.
>
> You usage of argp seems fine.

Trust me, it's been a struggle so far.

> But I think you don't want to use
> ARGP_NO_EXIT. That causes standard options like --version and --help to
> not exit (with success). Which is generally what we want.
> We do want to want --version and --help to not return an error
> indicator (this is actually checked by make distcheck).

I want to exit with status 2 on usage error.  I couldn't make that
happen without ARGP_NO_EXIT.  I'm open to different suggestions.

> I think we might want to avoid specific ELF concepts in the
> classification descriptors though. For example people might have a
> different concept of DSO.
>
>> I'm keeping a branch with these changes here:
>> 
>>   
>>
>
>> +/* Name and version of program.  */
>> +ARGP_PROGRAM_VERSION_HOOK_DEF = print_version;
>> +
>> +/* Bug report address.  */
>> +ARGP_PROGRAM_BUG_ADDRESS_DEF = PACKAGE_BUGREPORT;
>> +
>> +enum classify_command
>>

Re: [PATCH] elfclassify tool

2019-04-18 Thread Florian Weimer
* Florian Weimer:

>> BTW. Florian, the extra options are certainly not required for you to
>> implement to get eu-elfclassify accepted. They are just suggestions,
>> which we might decide not to do/add. Or they can be added by others if
>> they think they are useful.
>
> Understood.  I would rather fix the command line syntax as a priority,
> implement --unstripped, and add a test suite.

The patch below, also available here:

  <https://pagure.io/fweimer/elfutils/commits/elfclassify>

reworks the command line parser, implements filtering of file lists, and
adds the --unstripped option.

I assume the next step is to write tests.

Thanks,
Florian

diff --git a/src/Makefile.am b/src/Makefile.am
index 2b1c0dcb..966d1da7 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -26,7 +26,8 @@ AM_CPPFLAGS += -I$(srcdir)/../libelf -I$(srcdir)/../libebl \
 AM_LDFLAGS = -Wl,-rpath-link,../libelf:../libdw
 
 bin_PROGRAMS = readelf nm size strip elflint findtextrel addr2line \
-  elfcmp objdump ranlib strings ar unstrip stack elfcompress
+  elfcmp objdump ranlib strings ar unstrip stack elfcompress \
+  elfclassify
 
 noinst_LIBRARIES = libar.a
 
@@ -83,6 +84,7 @@ ar_LDADD = libar.a $(libelf) $(libeu) $(argp_LDADD)
 unstrip_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD) -ldl
 stack_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD) -ldl 
$(demanglelib)
 elfcompress_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD)
+elfclassify_LDADD = $(libelf) $(libeu) $(argp_LDADD)
 
 installcheck-binPROGRAMS: $(bin_PROGRAMS)
bad=0; pid=; list="$(bin_PROGRAMS)"; for p in $$list; do \
diff --git a/src/elfclassify.c b/src/elfclassify.c
new file mode 100644
index ..d4b46b64
--- /dev/null
+++ b/src/elfclassify.c
@@ -0,0 +1,654 @@
+/* Classification of ELF files.
+   Copyright (C) 2019 Red Hat, Inc.
+   This file is part of elfutils.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include ELFUTILS_HEADER(elf)
+#include "printversion.h"
+
+/* Name and version of program.  */
+ARGP_PROGRAM_VERSION_HOOK_DEF = print_version;
+
+/* Bug report address.  */
+ARGP_PROGRAM_BUG_ADDRESS_DEF = PACKAGE_BUGREPORT;
+
+/* Set by parse_opt.  */
+static int verbose;
+
+/* Set by the main function.  */
+static const char *current_path;
+
+/* Set by open_file.  */
+static int file_fd = -1;
+
+static bool
+open_file (void)
+{
+  if (file_fd >= 0)
+{
+  close (file_fd);
+  file_fd = -1;
+}
+
+  if (verbose > 1)
+fprintf (stderr, "debug: processing file: %s\n", current_path);
+
+  file_fd = open (current_path, O_RDONLY);
+  if (file_fd < 0)
+{
+  if (errno == ENOENT)
+{
+  if (verbose > 0)
+fprintf (stderr, N_("warning: %s: file does not exist\n"),
+ current_path);
+  return false;
+}
+  else
+error (2, errno, N_("opening %s"), current_path);
+}
+  struct stat st;
+  if (fstat (file_fd, &st) != 0)
+error (2, errno, N_("reading %s\n"), current_path);
+
+  /* Reject directories here because processing those as ELF fails
+ would fail.  */
+  if (!S_ISREG (st.st_mode))
+{
+  if (verbose > 0)
+fprintf (stderr, N_("warning: %s: not a regular file\n"),
+ current_path);
+  return false;
+}
+  return true;
+}
+
+/* Set by open_elf.  */
+static Elf *elf;
+
+static bool
+open_elf (void)
+{
+  if (elf != NULL)
+{
+  elf_end (elf);
+  elf = NULL;
+}
+
+  if (!open_file ())
+return false;
+
+  elf = elf_begin (file_fd, ELF_C_READ, NULL);
+  if (elf == NULL)
+error (2, 0, "%s: %s", current_path, elf_errmsg (-1));
+  if (elf_kind (elf) != ELF_K_ELF && elf_kind (elf) != ELF_K_AR)
+{
+  if (verbose > 0)
+fprintf (stderr, N_("warning: %s: not an ELF file\n"),
+ current_path);
+  return false;
+}
+
+  return true;
+}
+
+static int elf_type;
+static bool has_program_interpreter;
+static bool has_dynamic;
+static bool has_soname;
+static bool has_pie_flag;
+static bool 

Re: [RFC PATCH 0/2] elfutils: don't use dlopen() for libebl modules

2019-07-08 Thread Florian Weimer
* Omar Sandoval:

> This makes sense. One thing I noted in the patch to export the libebl
> symbols [1] is that exporting them wouldn't necessarily mean supporting
> them as an official API. However, I can see why you'd be concerned with
> developers using them anyways.

You could ship a link-only libc as the .so which does not include those
symbols.  Anyone who uses the standard development setup would not be
able to link against them as a result.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-19 Thread Florian Weimer
* Dmitry V. Levin:

>> So, I don't think the code is wrong. We might want to tweak the comment
>> a bit though, to make it less definitive?
>
> What I'm saying is that has_soname is just a hint which is probably even
> less reliable than has_program_interpreter.

If I recall correctly, I added the soname check to classify
/lib64/libc.so.6 as a library, not an executable.  So it didn't come
completely out of nowhere.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-22 Thread Florian Weimer
* Mark Wielaard:

> On Thu, 2019-04-18 at 13:17 +0200, Florian Weimer wrote:
>> * Florian Weimer:
>> 
>> > > BTW. Florian, the extra options are certainly not required for you to
>> > > implement to get eu-elfclassify accepted. They are just suggestions,
>> > > which we might decide not to do/add. Or they can be added by others if
>> > > they think they are useful.
>> > 
>> > Understood.  I would rather fix the command line syntax as a priority,
>> > implement --unstripped, and add a test suite.
>> 
>> The patch below, also available here:
>> 
>>   <https://pagure.io/fweimer/elfutils/commits/elfclassify>
>> 
>> reworks the command line parser, implements filtering of file lists, and
>> adds the --unstripped option.
>
> That looks really good. I went ahead an fixed a couple of nits and
> added some of my suggestions. I'll respond to your other email
> explaining some of my reasoning. The changes I made are:

Wow, this is much more than I expected.  Thanks!

>   elfclassify: Fix bre -> be typo in "unstripped" option help text.
>   elfclassify: When reading stdin make sure paths don't include newline.
>   elfclassify: Allow inspecting compressed or (kernel) image files with -z.
>   elfclassify: Always clean up ELF file and descriptor if one is still open.
>   elfclassify: Don't treat errors in elf_open or run_classify as fatal.
>   elfclassify: Add --quiet/-q to suppress error messages.
>   elfclassify: Add \n to fputs debug output.
>   elfclassify: Add --file/-f for testing just regular files.
>   elfclassify: Require --elf by default. Add more classifications.
>   elfclassify: Add elf_kind and elf_type strings for verbose output.
>   elfclassify: Require PT_LOAD for loadable classification.
>   elfclassify: Add --program classification.
>   elfclassify: Don't use ARGP_NO_EXIT and document exit code expectation.
>   elfclassify: Add --linux-kernel-module classification.
>   elfclassify: Add --debug-only classification.

I went through these patches, albeit in a somewhat cursory fashion, and
they look okay to me.

Do you think this is enough to port over RPM's find-debuginfo.sh?

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-29 Thread Florian Weimer
* Mark Wielaard:

> +  if (elf == NULL)
> +{
> +  /* This likely means it just isn't an ELF file, probably not a
> +  real issue, but warn if verbose reporting.  */
> +  if (verbose > 0)
> + fprintf (stderr, "warning: %s: %s\n", current_path, elf_errmsg (-1));
> +  return false;
> +}

Is it possible to distinguish the error from a memory allocation error?
It would be wrong to mis-classify a file just because the system is low
on memory.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-29 Thread Florian Weimer
* Mark Wielaard:

> +/* Called to process standard input if flag_stdin is not no_stdin.  */
> +static void
> +process_stdin (int *status)
> +{
> +  char delim;
> +  if (flag_stdin == do_stdin0)
> +delim = '\0';
> +  else
> +delim = '\n';
> +
> +  char *buffer = NULL;
> +  size_t buffer_size = 0;
> +  while (true)
> +{
> +  ssize_t ret = getdelim (&buffer, &buffer_size, delim, stdin);
> +  if (ferror (stdin))
> + {
> +   current_path = NULL;
> +   issue (errno, N_("reading from standard input"));
> +   break;
> + }
> +  if (feof (stdin))
> +break;
> +  if (ret < 0)
> +abort ();   /* Cannot happen due to error checks above.  */
> +  if (delim != '\0' && ret > 0)
> +buffer[ret - 1] = '\0';

I think this can overwrite the last character of the last line if the
file does not end with '\n'.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-29 Thread Florian Weimer
* Mark Wielaard:

> Signed-off-by: Mark Wielaard 

Does elfutils use DCO?  Then yoy have my signoff as well:

Signed-off-by: Florian Weimer 

You should you list yourself as an author somewhere in the commit
message.  This is so much more than what I wrote.

Regarding the test case, I think if the build target is ELF, it makes
sense to check that the elfutils binaries themselves are classified as
expected, with the current build flags.  This will detect changes
required due to the evolution of the toolchain.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-29 Thread Florian Weimer
* Mark Wielaard:

> On Mon, Jul 29, 2019 at 11:16:31AM +0200, Florian Weimer wrote:
>> * Mark Wielaard:
>> 
>> > +/* Called to process standard input if flag_stdin is not no_stdin.  */
>> > +static void
>> > +process_stdin (int *status)
>> > +{
>> > +  char delim;
>> > +  if (flag_stdin == do_stdin0)
>> > +delim = '\0';
>> > +  else
>> > +delim = '\n';
>> > +
>> > +  char *buffer = NULL;
>> > +  size_t buffer_size = 0;
>> > +  while (true)
>> > +{
>> > +  ssize_t ret = getdelim (&buffer, &buffer_size, delim, stdin);
>> > +  if (ferror (stdin))
>> > +  {
>> > +current_path = NULL;
>> > +issue (errno, N_("reading from standard input"));
>> > +break;
>> > +  }
>> > +  if (feof (stdin))
>> > +break;
>> > +  if (ret < 0)
>> > +abort ();   /* Cannot happen due to error checks above.  
>> > */
>> > +  if (delim != '\0' && ret > 0)
>> > +buffer[ret - 1] = '\0';
>> 
>> I think this can overwrite the last character of the last line if the
>> file does not end with '\n'.
>
> I see.  "The buffer is null-terminated and includes the newline
> character, if one was found."
>
> So the test should be:
>
> diff --git a/src/elfclassify.c b/src/elfclassify.c
> index ebd42c1d5..b17d14d45 100644
> --- a/src/elfclassify.c
> +++ b/src/elfclassify.c
> @@ -862,7 +862,7 @@ process_stdin (int *status)
>  break;
>if (ret < 0)
>  abort ();   /* Cannot happen due to error checks above.  */
> -  if (delim != '\0' && ret > 0)
> +  if (delim != '\0' && ret > 0 && buffer[ret - 1] == '\n')
>  buffer[ret - 1] = '\0';
>current_path = buffer;
>process_current_path (status);

Right.  But now I wonder why ret == 0 can ever happen.  Maybe on
OpenVMS, which doesn't use in-band signaling for line terminators?

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-07-29 Thread Florian Weimer
* Mark Wielaard:

>> Regarding the test case, I think if the build target is ELF, it makes
>> sense to check that the elfutils binaries themselves are classified as
>> expected, with the current build flags.  This will detect changes
>> required due to the evolution of the toolchain.
>
> That is what the run-elfclassify-self.sh testcase does I believe.

Ah, I had missed it.  Great, I think this one is covered.

Thanks,
Florian


Re: [PATCH] elfclassify tool

2019-08-12 Thread Florian Weimer
* Mark Wielaard:

> What do you think about this change to dwelf_elf_begin?
> The change would make it possible to detect real errors in the
> elfclassify code, whether elf_begin or dwelf_elf_begin was used. So we
> would not misclassify files (but return an error status of 2).

I'm not really familiar with how these functions are used, sorry,
Viewed in isolation, the changes appear reasonable to me.

Thanks,
Florian


Re: [PATCH] readelf: Add --dyn-sym option.

2019-09-03 Thread Florian Weimer
* Mark Wielaard:

> It is already possible to select the symbol table to print by name,
> using --symbols=SECTION. This allows printing the dynamic symbol table
> with --symbols=.dynsym. binutils readelf allows printing just the
> dynamic symbol table by type using --dyn-sym. Add the same option
> and document it. Also add a testcase to show --symbols=.dynsym and
> --dyn-sym produce the same output.

Note that this behavior is not the same as the binutils behavior.
There, --dyn-sym does not use the section header table to locate the
dynamic symbol table, but the dynamic segment.

Thanks,
Florian


Re: [PATCH] readelf: Add --dyn-sym option.

2019-09-03 Thread Florian Weimer
* Mark Wielaard:

> On Tue, 2019-09-03 at 09:44 +0200, Florian Weimer wrote:
>> * Mark Wielaard:
>> 
>> > It is already possible to select the symbol table to print by name,
>> > using --symbols=SECTION. This allows printing the dynamic symbol table
>> > with --symbols=.dynsym. binutils readelf allows printing just the
>> > dynamic symbol table by type using --dyn-sym. Add the same option
>> > and document it. Also add a testcase to show --symbols=.dynsym and
>> > --dyn-sym produce the same output.
>> 
>> Note that this behavior is not the same as the binutils behavior.
>> There, --dyn-sym does not use the section header table to locate the
>> dynamic symbol table, but the dynamic segment.
>
> When I experimented with this, binutils seems to be display the symbols
> through the dynamic segment when using --symbols --use-dynamic.
> Using --dyn-sym explicitly seems to only display the symbols from the
> SHT_DYNSYM section, not by getting it through the dynamic segment, even
> when combined with --use-dynamic (*).

Ahh.  I patched only the section name for my testing.  It didn't occur
to me to check if readelf looked at the section type instead.

Thanks,
Florian


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-10-26 Thread Florian Weimer
* Jonathon Anderson:

> Just finished some modifications to the patch series, git request-pull 
> output below. This rebases onto the latest master and does a little 
> diff cleaning, the major change is that I swapped out the memory 
> management to use the pthread_key_* alternative mentioned before.

This use of pthread_key_* is rather unusual.  In particular, there are
only 1024 keys supported for the entire process, so it limits the
number of Dwarf * objects that can be created by one process, even if
it does not use any threads.


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-10-26 Thread Florian Weimer
* Mark Wielaard:

> I'll see if I can create a case where that is a problem. Then we can
> see how to adjust things to use less pthread_keys. Is there a different
> pattern we can use?

It's unclear what purpose thread-local storage serves in this context.
You already have a Dwarf *.  I would consider adding some sort of
clone function which creates a shallow Dwarf * with its own embedded
allocator or something like that.  This assumes that memory allocation
is actually a performance problem, otherwise you could let malloc
handle the details.


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-10-26 Thread Florian Weimer
* Jonathon Anderson:

>> This assumes that memory allocation
>> is actually a performance problem, otherwise you could let malloc
>> handle the details.
>
> In our (Dyninst + HPCToolkit) work, we have found that malloc tends to 
> be slow in the multithreaded case, in particular with many small 
> allocations. The glibc implementation (which most of our clients use) 
> uses a full mutex to provide thread-safety. While we could do a lot 
> better in our own projects with regards to memory management, the fact 
> remains that malloc alone is a notable facet to the performance of 
> libdw.

Current glibc versions have a thread-local fast path, which should
address some of these concerns.  It's still not a bump-pointer
allocator, but at least there are no atomics on that path.

I'm not sure if it is prudent to code around imperfections in older
allocators.


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-10-27 Thread Florian Weimer
* Mark Wielaard:

>> Current glibc versions have a thread-local fast path, which should
>> address some of these concerns.  It's still not a bump-pointer
>> allocator, but at least there are no atomics on that path.
>
> Since which version of glibc is there a thread-local fast path?

It was added in:

commit d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc
Author: DJ Delorie 
Date:   Thu Jul 6 13:37:30 2017 -0400

Add per-thread cache to malloc

So glibc 2.26.  But it is a build-time option, enabled by default, but
it can be switched off by distributions.


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-10-27 Thread Florian Weimer
* Jonathon Anderson:

> On Sun, Oct 27, 2019 at 09:59, Florian Weimer  wrote:
>> * Mark Wielaard:
>> 
>>>>  Current glibc versions have a thread-local fast path, which should
>>>>  address some of these concerns.  It's still not a bump-pointer
>>>>  allocator, but at least there are no atomics on that path.
>>> 
>>>  Since which version of glibc is there a thread-local fast path?
>> 
>> It was added in:
>> 
>> commit d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc
>> Author: DJ Delorie mailto:d...@delorie.com>>
>> Date:   Thu Jul 6 13:37:30 2017 -0400
>> 
>> Add per-thread cache to malloc
>> 
>> So glibc 2.26.  But it is a build-time option, enabled by default, but
>> it can be switched off by distributions.
>
> I doubt any non-mobile distros would disable it, the cost seems fairly 
> small.

It increases fragmentation.  Vmware's Photon distribution disables it.

> My main concern is that it seems like chunks will only enter the 
> thread-local cache in the presence of free()s (since they have to enter 
> the "smallbins" or "fastbins" first, and those two at a glance seem to 
> be filled very lazily or on free()); since the free()s are all on 
> dwarf_end this would pose an issue. I could also be entirely mistaken, 
> glibc is by no means a simple piece of code.

No, there is a prefill step if the cache is empty, where the cache is
populated with one arena allocation which is then split up.

> According to the comments, there might also be a 16 byte overhead per 
> allocation, which would explode the small allocations considerably.

Available allocatable sizes in bytes are congruent 8 modulo (16), and
the smallest allocatable size is 24.  In general, the overhead is
8 bytes.  (All numbers are for 64-bit architectures.)


Re: RFC: deb/ddeb support for debuginfod

2019-12-01 Thread Florian Weimer
* Frank Ch. Eigler:

> +inline bool
> +string_endswith(const string& haystack, const string& needle)
> +{
> +  return (haystack.size() >= needle.size() &&
> +   haystack.substr(haystack.size()-needle.size()) == needle);
> +}

I think you should use std::equal because substr allocations.  So
something loke htos:

  return (haystack.size() >= needle.size()
  && std::equal(haystack.end() - needle.size(), haystack.end(),
needle.begin(), needle.end()));



Re: RFC: deb/ddeb support for debuginfod

2019-12-01 Thread Florian Weimer
* Florian Weimer:

> * Frank Ch. Eigler:
>
>> +inline bool
>> +string_endswith(const string& haystack, const string& needle)
>> +{
>> +  return (haystack.size() >= needle.size() &&
>> +  haystack.substr(haystack.size()-needle.size()) == needle);
>> +}
>
> I think you should use std::equal because substr allocations.  So
> something loke htos:
>
>   return (haystack.size() >= needle.size()
>   && std::equal(haystack.end() - needle.size(), haystack.end(),
> needle.begin(), needle.end()));

Eh, you have to drop the needle.end() for C++98 compatibility.



Re: [PATCH] tests: Add break to avoid implicit-fallthrough warning

2019-12-06 Thread Florian Weimer
* Mark Wielaard:

> For some reason gcc might fail to recognize the assert (0) will never
> return and emit an implicit-fallthrough warning. Just add a break to
> silence it.

Is this with -DNDEBUG?  assert (0) expands to basically nothing in that
case.  I'm not sure if we should change that.  We cannot realistically
emit __builtin_unreachable in that case, I think, because it would make
writing incorrect asserts *really* risky with -DNDEBUG.

Thanks,
Florian



Re: rfc/patch: user-agent distro-description for debuginfod http traffic

2020-01-07 Thread Florian Weimer
* Frank Ch. Eigler:

> -  return string(hostname) + string(":") + string(servname);
> +  // extract headers relevant to administration
> +  const char* user_agent = MHD_lookup_connection_value (conn, 
> MHD_HEADER_KIND, "User-Agent") ?: "";
> +  const char* x_forwarded_for = MHD_lookup_connection_value (conn, 
> MHD_HEADER_KIND, "X-Forwarded-For") ?: "";
> +  // NB: these are untrustworthy, beware if machine-processing log files
> +
> +  return string(hostname) + string(":") + string(servname) + string(" UA:") 
> + string(user_agent) + string(" XFF:") + string(x_forwarded_for);
>  }
>  
>  
Should this add quoting to make the field boundaries unforgeable?

Thanks,
Florian



Re: [PATCH] debuginfod: add CORS support

2024-12-08 Thread Florian Weimer
* Frank Ch. Eigler:

> Hi -
>
> I'm planning to commit this shortly:
>
> From 4ebefc8f3b4b8fb68a55c960e70122fda50a0fb9 Mon Sep 17 00:00:00 2001
> From: "Frank Ch. Eigler" 
> Date: Sat, 7 Dec 2024 15:01:54 -0500
> Subject: [PATCH] debuginfod: add CORS response headers and OPTIONS method

What are the security implications of a shared origin when serving
(potentially third-party) debuginfo data?

I think it will allow public web clients to exfiltrate debuginfo data
from debuginfod servers on private intranets.  Previously, the
cross-origin restrictions on web content would have prevented that.

Thanks,
Florian



Re: [RFC] sketch of an unwinder cache interface for elfutils libdwfl

2024-12-12 Thread Florian Weimer
* William Cohen:

> The dwfl_report_proc_map() will handle when things are added to the
> memory map.  Is there anything to handle the inverse event when memory
> is removed from the process memory map, such as a dlcose()?

If there's something we can do here on the glibc side, please let us
know.  I think by default, dlclose needs to unmap the address space, so
options might be somewhat limited.  But for debugging purposes, I've
long since wanted a mode where dlclose keeps the address space reserved,
so that it becomes easier to diagnose dlclose misuse.

Thanks,
Florian



debuginfod-find: Way to obtain package metadata

2025-04-25 Thread Florian Weimer
The functionality is almost there:

$ debuginfod-find -v debuginfo 1afd01f4bd3e689a67206c10c20e69921ba9be93
Progress 1 / 0
Progress 77622 / 3150088
Progress 257335 / 3150088
Progress 601864 / 3150088
Progress 2398519 / 3150088
Headers:
x-debuginfod-size: 3150088
x-debuginfod-archive: 
/mnt/fedora_koji_prod/koji/packages/glibc/2.39/37.fc40/x86_64/glibc-debuginfo-2.39-37.fc40.x86_64.rpm
x-debuginfod-file: /usr/lib/debug/usr/sbin/ldconfig-2.39-37.fc40.x86_64.debug
x-debuginfod-imasignature: 
030204a3204a9e0047304502200b6b758076e76784d840aea983ddf1151ad91ed03f2385e0c7c410cac7135aa8022100ffe22c04e69f94ee529edb22aaec3abfb5dc381df08c1a610d0316dbf715159b
Downloaded from 
https://debuginfod.fedoraproject.org/buildid/1afd01f4bd3e689a67206c10c20e69921ba9be93/debuginfo
/home/fweimer/.cache/debuginfod_client/1afd01f4bd3e689a67206c10c20e69921ba9be93/debuginfo

This strongly suggests that the build ID belongs to glibc-2.39-37.fc40
(but a more explicitly way to query this would be nice).

Where this fails is the second invocation:

$ debuginfod-find -v debuginfo 1afd01f4bd3e689a67206c10c20e69921ba9be93
/home/fweimer/.cache/debuginfod_client/1afd01f4bd3e689a67206c10c20e69921ba9be93/debuginfo

Due to the caching, the extra metadata is not shown.

Thanks,
Florian



Re: [PATCH] configure: Error when demangler is enabled, but libstdc++ support isn't

2023-02-16 Thread Florian Weimer via Elfutils-devel
* Mark Wielaard:

> +  [AC_MSG_ERROR([__cxa_demangle not found in libstdc++ use 
> --disable-demangler to disable demangler support.])]),

Missing punctuation after libstdc++?

Thanks,
Florian



Re: [PATCH] configure: Error when demangler is enabled, but libstdc++ support isn't

2023-02-16 Thread Florian Weimer via Elfutils-devel
* Mark Wielaard:

> Hi Florian,
>
> On Thu, 2023-02-16 at 11:48 +0100, Florian Weimer via Elfutils-devel
> wrote:
>> * Mark Wielaard:
>> 
>> > +  [AC_MSG_ERROR([__cxa_demangle not found in libstdc++ use 
>> > --disable-demangler to disable demangler support.])]),
>> 
>> Missing punctuation after libstdc++?
>
> You probably mean you like to see a comma after libstdc++?

Comma or semicolon.

> That sounds reasonable. But I had to double quote the argument to
> AC_MSG_ERROR, otherwise the comma was seen as a argument separator?

Ohh.  I had not consider that.

> Which I find slightly odd, but I don't fully grok autoconf quotes.
>
> Does the following do what you expect?

Yes, but even though I patched dozens of m4 files lately, I still don't
get the quoting rules.  For me, it's mostly “add […] until it breaks,
then remove one layer” …

Thanks,
Florian



[PATCH] testsuite: Avoid C99 compatibility issues in run-native-test.sh

2023-04-22 Thread Florian Weimer via Elfutils-devel
Include  for the pause function, and add the return type
of main.  Avoids an implicit function declaration and implicit int.

Signed-off-by: Florian Weimer 

---
Related to:

  <https://fedoraproject.org/wiki/Changes/PortingToModernC>
  <https://fedoraproject.org/wiki/Toolchain/PortingToModernC>

 tests/run-native-test.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/run-native-test.sh b/tests/run-native-test.sh
index d19007f2..042a51c6 100755
--- a/tests/run-native-test.sh
+++ b/tests/run-native-test.sh
@@ -27,7 +27,8 @@
 # in all builds.
 
 tempfiles native.c native
-echo 'main () { while (1) pause (); }' > native.c
+printf '#include \nint main (void) { while (1) pause (); }\n' \
+  > native.c
 
 native=0
 kill_native()

base-commit: 4d8de4b2fa05495d69d09e1a3d335f24d6bf33ee



Re: Hitting g dwfl->lookup_elts limit in report_r_debug, so not all modules show up and backtracing fails

2023-05-02 Thread Florian Weimer via Elfutils-devel
* Luke Diamand via Elfutils-devel:

> I've got a few cores where report_r_debug() in link_map.c fails to
> find all of the modules - for example I had libc.so missing. This
> obviously meant that elfutils could not backtrace my core.
>
> It seems to be related to this code:
>
>   /* There can't be more elements in the link_map list than there are
>  segments.  DWFL->lookup_elts is probably twice that number, so it
>  is certainly above the upper bound.  If we iterate too many times,
>  there must be a loop in the pointers due to link_map clobberation.  */
>   size_t iterations = 0;
>
>   while (next != 0 && ++iterations < dwfl->lookup_elts)
>
> I've changed this to just keep going until it reaches
> dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
> a better fix!
>
> The most recent core I saw with this had lookup_elts=36, and hit 109
> iterations of the loop and then backtraced just fine.

It's probably another fallout from -z separate-code, which tends to
create four LOAD segments.  The magic number 5 sounds about right, as
gold also has -z text-unlikely-segment, which might result in creating
that number of load segments (but I haven't tried).

Thanks,
Florian



Re: [PATCH] Allow elf header to be used when __BEGIN_DECLS and __END_DECLS aren't defined.

2020-10-27 Thread Florian Weimer via Elfutils-devel
* Érico Nogueira via Libc-alpha:

> This is mostly an initial PoC, and an additional comment for why this
> is needed could be added to the file.
>
> I accidentally sent the wrong patch a while ago, sorry!

extern "C++" does not have any effect for this header because it does
not declare anything that has language linkage.

I think you need to remove  to make the file self-contained.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: [musl] Re: [QUESTION] Which fnmatch() functionality does elfutils depend on?

2020-10-27 Thread Florian Weimer via Elfutils-devel
* Rich Felker:

> As I stated in my other reply, I'm opposed to that because it does not
> admit implementation with the same (very desirable) big-O properties,
> and the "extmatch" syntax is not widely known or widely used.

The syntax comes from ksh and is used in shell scripts.  (bash requires
the extglob option to enable it, which makes it easy to search for
instances.)

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: Golang binary dwarf

2021-02-07 Thread Florian Weimer via Elfutils-devel
* Mark Wielaard:

> Hi Manoj,
>
> On Sat, 2021-02-06 at 07:03 +, Manoj Kumar via Elfutils-devel
> wrote:
>> Hi,Using dyninst, I am trying to access local variables and arguments
>> of a function which is part of a go binary.To get local variables,
>> dyninst uses libdw to find all the modules. It uses dwarf_nextcu
>> function of libdw to find all the modules of binary.When I compile a
>> go binary using "go build filename.go" then dwarf_nextcu finds all
>> the modules (almost 118) and dyninst is able to access local
>> variables and arguments of any function.But when I compile a go
>> binary with linkshared option  "go build -linkshared filename.go"
>> then dwarf_nextcu finds only 5 modules. Due to that, dyninst is not
>> able to access any variable of a function.
>> Does go binary with linkshared option has different version of dwarf?
>> Could you please suggest how to use libdw for go binary with
>> linkshared option.
>
> Could you explain what the linkshared option is and what it does?

$ go tool link -help
[…]
  -linkshared
link against installed Go shared libraries
[…]

The Fedora version is broken, unfortunately:

$ cat t.go
package main
func main() { }
$ go build -linkshared t.go 
go build runtime/internal/sys: copying 
/home/fweimer/.cache/go-build/ef/ef8092ebd194e7ec34c864584fc2216541454e9626c98ffe9d5210d2784da429-d:
 open /usr/lib/golang/pkg/linux_amd64_dynlink/runtime/internal/sys.a: 
permission denied
go build internal/cpu: copying 
/home/fweimer/.cache/go-build/ca/ca0a2dbbc130aa21613ca27f378b8c9fda02c4082687d92b40a0d6c660bc86d1-d:
 open /usr/lib/golang/pkg/linux_amd64_dynlink/internal/cpu.a: permission denied
[…]
$ rpm -qf /usr/bin/go
golang-bin-1.15.7-1.fc33.x86_64

Filed here: 

So I'm not sure how we can reproduce the elfutils issue. 8-/

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Re: [PATCH] tests: Add -ldl to dwfl_proc_attach_LDFLAGS

2021-11-19 Thread Florian Weimer via Elfutils-devel
* Dmitry V. Levin:

> Let's make clear what's going on here.  First of all, dwfl-proc-attach.c
> does not use dlopen so it doesn't pull it in and doesn't need -ldl.
> In regular builds, dwfl-proc-attach.o is linked with ../libdw/libdw.so
> which in turn uses dlopen and is already linked with -ldl.
> When elfutils is configured with --enable-gprof or --enable-gcov,
> BUILD_STATIC is enabled and dwfl-proc-attach.o is linked with
> ../libdw/libdw.a -lz $(zip_LIBS) $(libelf) $(libebl) -ldl -lpthread
> which already contains -ldl.
> In any case, I fail to understand why dwfl-proc-attach might need
> an extra -ldl, especially in LDFLAGS which goes before LDADD
> in the linking command.

It may have to do with --as-needed that some builds use.  If there are
no pending undefined references, some linkers drop earlier shared object
references with --as-needed (similar to what happens with static
archives).

The GCC LTO plugin results in ld looking at more objects in greater
detail for some reason.  Without LTO and --as-needed, you probably don't
get a dlopen export (if you do not link with -E) because indirect
dependencies are not consulted, breaking the valgrind workaround because
there is no interposition.

Thanks,
Florian



Re: [PATCH v2] libebl: recognize FDO Packaging Metadata ELF note

2021-11-30 Thread Florian Weimer via Elfutils-devel
* Frank Ch. Eigler via Elfutils-devel:

> Hi -
>
> On Tue, Nov 30, 2021 at 12:25:41PM +0100, Mark Wielaard wrote:
>> [...]
>> The spec does explain the requirements for JSON numbers, but doesn't
>> mention any for JSON strings or JSON objects. It would be good to also
>> explain how to make the strings and objects unambiguous. [...]
>> For Objects it should require that all names are unique. [...]
>> For Strings it should require that \u escaping isn't used [...]
>> 
>> That should get rid of various corner cases that different parsers are
>> known to get wrong. 
>
> Are such buggy parsers seen in the wild, now, after all this time with
> JSON?  It seems to me it's not really elfutils' or systemd's place to
> impose -additional- constraints on customary JSON.

JSON has been targeted at the Windows/Java UTF-16 world, there is always
going to be a mismatch if you try to represent it in UTF-8 or anything
that doesn't have surrogate pairs.

>> Especially \u escaping is really confusing when using the UTF-8
>> encoding (and it is totally necessary since UTF-8 can express any
>> valid UTF character already).
>
> Yes, and yet we have had the bidi situation recently where UTF-8 raw
> codes could visually confuse a human reader whereas escaped \u
> wouldn't.  If we forbid \u unilaterally, we literally become
> incompatible with JSON (RFC8259 7. String. "Any character may be
> escaped."), and for what?

RFC 8259 says this:

   However, the ABNF in this specification allows member names and
   string values to contain bit sequences that cannot encode Unicode
   characters; for example, "\uDEAD" (a single unpaired UTF-16
   surrogate).  Instances of this have been observed, for example, when
   a library truncates a UTF-16 string without checking whether the
   truncation split a surrogate pair.  The behavior of software that
   receives JSON texts containing such values is unpredictable; for
   example, implementations might return different values for the length
   of a string value or even suffer fatal runtime exceptions.

A UTF-8 environment has to enforce *some* additional constraints
compared to the official JSON syntax.

Thanks,
Florian



Re: [PATCH v2] libebl: recognize FDO Packaging Metadata ELF note

2021-12-02 Thread Florian Weimer via Elfutils-devel
* Frank Ch. Eigler:

>> > Yes, and yet we have had the bidi situation recently where UTF-8 raw
>> > codes could visually confuse a human reader whereas escaped \u
>> > wouldn't.  If we forbid \u unilaterally, we literally become
>> > incompatible with JSON (RFC8259 7. String. "Any character may be
>> > escaped."), and for what?
>> 
>> RFC 8259 says this:
>> 
>>However, the ABNF in this specification allows member names and
>>string values to contain bit sequences that cannot encode Unicode
>>characters; for example, "\uDEAD" (a single unpaired UTF-16
>>surrogate).  Instances of this have been observed, for example, when
>>a library truncates a UTF-16 string without checking whether the
>>truncation split a surrogate pair.  The behavior of software that
>>receives JSON texts containing such values is unpredictable; for
>>example, implementations might return different values for the length
>>of a string value or even suffer fatal runtime exceptions.
>> 
>> A UTF-8 environment has to enforce *some* additional constraints
>> compared to the official JSON syntax.
>
> I'm sorry, I don't see how.  If a JSON string were to include the
> suspect "\uDEAD", but from observing our hypothetical "no escapes!"
> rule they could reencode it as the UTF-8 octets 0xED 0xBA 0xAD.
> ISTM we're no better off.

These octets aren't UTF-8.  UTF-8 never contains surrogate pairs (paired
or unpaired). 8-(

Thanks,
Florian



Re: [PATCH] build: allow turning off --no-undefined and -z,defs

2021-12-04 Thread Florian Weimer via Elfutils-devel
* Evgeny Vereshchagin:

> ASan, UBSan and MSan provided by clang aren't compatible with --no-undefined 
> and -z,defs:
> https://clang.llvm.org/docs/AddressSanitizer.html#usage
> https://github.com/google/sanitizers/issues/380
> so to build elfutils with clang with the sanitizers it should be possible
> to turn them off.
>
> Without this patch something like
>
> sed -i 's/^\(ZDEFS_LDFLAGS=\).*/\1/' configure.ac
> find -name Makefile.am | xargs sed -i 's/,--no-undefined//'
>
> should be used to make elfutils compile.
>
> The patch was tested in https://github.com/evverx/elfutils/pull/24 by
> compiling elfutils with both gcc and clang with and without ASan/UBsan
> and running `make check && make distcheck`. --no-undefined and -z,defs
> are still passed by default as expected.

Why isn't this a bug in the compiler driver?  Nowadays, GCC passes
-lasan if -fsanitize=address is used.  I think that's quite reasonable.

Thanks,
Florian



Re: [PATCH] libelf: Use offsetof to get field of unaligned

2021-12-15 Thread Florian Weimer via Elfutils-devel
* Mark Wielaard:

> This seems a wrong warning since we aren't accessing the field member
> of the struct, but are taking the address of it. But we can do the
> same by adding the field offsetof to the base address. Which doesn't
> trigger a runtime error.

I think the warning is correct.  I believe it is motivated by the GCC
optimizers using this to infer alignment of the original pointer.  It
won't make a difference for this expression, but it can cause crashes
elsewhere with strict-alignment targets.

Thanks,
Florian



Re: Using libcurl in another library, when/if to call curl_global_init?

2022-04-05 Thread Florian Weimer via Elfutils-devel
* Mark Wielaard:

> Hi,
>
> On Thu, Mar 31, 2022 at 04:00:16PM +0300, Catalin Raceanu via curl-library 
> wrote:
>> On 31-Mar-22 15:04, Mark Wielaard wrote:
>> > whether there is a thread-safe way to call
>> > curl_global_init at a later time (to get rid of the library constructor
>> > init function).
>> 
>> I believe that this is an exact fit for C==11's std::call_once(). Boost also
>> has an equivalent, that most likely predates the other, in case older c++
>> standard is used.
>
> Thanks. Our library is pure C, but we can probably rely on
> pthread_once if it is allowable to call curl_global_init at a later
> time when multiple threads are already running.

The problem is that every caller of pthread_once needs to use the same
pthread_once_t synchronization object, so you still have the same
problem.

I strongly believe that library-safe code needs to perform lazy
initialization, that is, initialize the library on first use, and do
that in a thread-safe manner.  It solves the synchronization issue
between different users of the API, and it's the only way to report
errors properly (for example, a failure in an ELF constructor can only
be reported via process termination).

At least on Linux, the need to support multiple different threading
libraries is a thing of the past.

Thanks,
Florian