Re: patch 1/2 debuginfod client

2019-11-13 Thread Mark Wielaard
Hi,

On Mon, 2019-10-28 at 15:06 -0400, Frank Ch. Eigler wrote:
> Subject: [PATCH 1/2] debuginfod 1/2: client side
> 
> Introduce the debuginfod/ subdirectory, containing the client for a
> new debuginfo-over-http service, in shared-library and command-line
> forms.  Two functions in libdwfl make calls into the client library to
> fetch elf/dwarf files by buildid, as a fallback.  Instead of normal
> dynamic linking (thus pulling in a variety of curl dependencies),
> the libdwfl hooks use dlopen/dlsym.  Server & tests coming in patch 2.

And finally the libdw, libdwfl and m4 dirs.
Some of which was already discussed and changed.

> diff --git a/libdw/ChangeLog b/libdw/ChangeLog
> index 394c0df293f0..f0af348aff6e 100644
> --- a/libdw/ChangeLog
> +++ b/libdw/ChangeLog
> @@ -1,3 +1,7 @@
> +2019-10-28  Aaron Merey  
> +
> + * Makefile.am (libdw_so_LDLIBS): Add -ldl for libdebuginfod.so dlopen.
> +
>  2019-08-26  Jonathon Anderson  
>  
>   * libdw_alloc.c (__libdw_allocate): Added thread-safe stack allocator.
> diff --git a/libdw/Makefile.am b/libdw/Makefile.am
> index ce793e903b88..33b5838dc4e1 100644
> --- a/libdw/Makefile.am
> +++ b/libdw/Makefile.am
> @@ -109,7 +109,7 @@ libdw_so_LIBS = ../libebl/libebl_pic.a 
> ../backends/libebl_backends_pic.a \
>   ../libcpu/libcpu_pic.a libdw_pic.a ../libdwelf/libdwelf_pic.a \
>   ../libdwfl/libdwfl_pic.a
>  libdw_so_DEPS = ../lib/libeu.a ../libelf/libelf.so
> -libdw_so_LDLIBS = $(libdw_so_DEPS) -lz $(argp_LDADD) $(zip_LIBS) -pthread
> +libdw_so_LDLIBS = $(libdw_so_DEPS) -ldl -lz $(argp_LDADD) $(zip_LIBS) 
> -pthread
>  libdw_so_SOURCES =
>  libdw.so$(EXEEXT): $(srcdir)/libdw.map $(libdw_so_LIBS) $(libdw_so_DEPS)
>   $(AM_V_CCLD)$(LINK) $(dso_LDFLAGS) -o $@ \

Yes, ok.

> diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog
> index 04a39637e9a4..be29cc00bc7e 100644
> --- a/libdwfl/ChangeLog
> +++ b/libdwfl/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-10-28  Aaron Merey  
> +
> + * dwfl_build_id_find_elf.c (dwfl_build_id_find_elf): Call debuginfod
> + client functions via dlopen to look for elf/dwarf files as fallback.
> + * find-debuginfo.c (dwfl_standard_find_debuginfo): Ditto.
> +
>  2019-08-12  Mark Wielaard  
>  
>   * gzip.c (open_stream): Return DWFL_E_ERRNO on bad file operation.
> diff --git a/libdwfl/Makefile.am b/libdwfl/Makefile.am
> index 89ca92ed8110..29046e9e5e85 100644
> --- a/libdwfl/Makefile.am
> +++ b/libdwfl/Makefile.am
> @@ -31,7 +31,7 @@
>  ##
>  include $(top_srcdir)/config/eu.am
>  AM_CPPFLAGS += -I$(srcdir) -I$(srcdir)/../libelf -I$(srcdir)/../libebl \
> --I$(srcdir)/../libdw -I$(srcdir)/../libdwelf
> +-I$(srcdir)/../libdw -I$(srcdir)/../libdwelf 
> -I$(srcdir)/../debuginfod
>  VERSION = 1
>  
>  noinst_LIBRARIES = libdwfl.a
> @@ -39,6 +39,7 @@ noinst_LIBRARIES += libdwfl_pic.a
>  
>  pkginclude_HEADERS = libdwfl.h
>  
> +
>  libdwfl_a_SOURCES = dwfl_begin.c dwfl_end.c dwfl_error.c dwfl_version.c \
>   dwfl_module.c dwfl_report_elf.c relocate.c \
>   dwfl_module_build_id.c dwfl_module_report_build_id.c \

AM_CPPFLAGS change is ok. There is a stray whitespace change.

> diff --git a/libdwfl/dwfl_build_id_find_elf.c 
> b/libdwfl/dwfl_build_id_find_elf.c
> index cc6c3f62d276..1f3834180c4a 100644
> --- a/libdwfl/dwfl_build_id_find_elf.c
> +++ b/libdwfl/dwfl_build_id_find_elf.c

This part looks good on the debuginfod-submit branch
(although I am slightly sad about the > 80 char lines)

diff --git a/libdwfl/find-debuginfo.c b/libdwfl/find-debuginfo.c
> index 9267788d2d19..d36ec3cc39b8 100644
> --- a/libdwfl/find-debuginfo.c
> +++ b/libdwfl/find-debuginfo.c
> @@ -1,5 +1,5 @@
>  /* Standard find_debuginfo callback for libdwfl.
> -   Copyright (C) 2005-2010, 2014, 2015 Red Hat, Inc.
> +   Copyright (C) 2005-2010, 2014, 2015, 2019 Red Hat, Inc.
> This file is part of elfutils.
>  
> This file is free software; you can redistribute it and/or modify
> @@ -31,9 +31,13 @@
>  #endif
>  
>  #include "libdwflP.h"
> +#ifdef ENABLE_DEBUGINFOD
> +#include "debuginfod.h"
> +#endif
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "system.h"
>  
> @@ -359,7 +363,8 @@ dwfl_standard_find_debuginfo (Dwfl_Module *mod,
>   other than just by finding nothing, that's all we do.  */
>const unsigned char *bits;
>GElf_Addr vaddr;
> -  if (INTUSE(dwfl_module_build_id) (mod, &bits, &vaddr) > 0)
> +  int bits_len;
> +  if ((bits_len = INTUSE(dwfl_module_build_id) (mod, &bits, &vaddr)) > 0)
>  {
>/* Dropping most arguments means we cannot rely on them in
>dwfl_build_id_find_debuginfo.  But leave it that way since
> @@ -397,6 +402,28 @@ dwfl_standard_find_debuginfo (Dwfl_Module *mod,
>free (canon);
>  }
>  
> +#if ENABLE_DEBUGINFOD
> +  {
> +static void *debuginfod_so;
> +static __typeof__ (debuginfod_find_debuginfo) 
> *fp_debuginfod_find_debuginfo;
> +
> +if (debuginfod_so == NULL)
> +  debuginfod_so

Re: patch 2/2 debuginfod server etc.

2019-11-13 Thread Mark Wielaard
Hi,

On Mon, 2019-10-28 at 15:07 -0400, Frank Ch. Eigler wrote:
> Add the server to the debuginfod/ subdirectory.  This is a highly
> multithreaded c++11 program (still buildable on rhel7's gcc 4.8,
> which is only partly c++11 compliant).  Includes an initial suite
> of tests, man pages, and a sample systemd service.

Some feedback on the first part, the config directory additions.

> diff --git a/config/ChangeLog b/config/ChangeLog
> index b641d0d5b74e..73643f919406 100644
> --- a/config/ChangeLog
> +++ b/config/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-10-28  Frank Ch. Eigler  
> +
> + * eu.am (AM_CXXFLAGS): Clone & amend AM_CFLAGS for c++11 code.
> + * debuginfod.service, debuginfod.sysconfig: New files: systemd.
> + * Makefile.am: Install them.
> + * elfutils.spec.in: Add debuginfod and debuginfod-client subrpms.
> +
>  2019-08-29  Mark Wielaard  
>  
>   * elfutils.spec.in (%description devel): Remove libebl text.
> diff --git a/config/Makefile.am b/config/Makefile.am
> index 6425718efc54..7c75f59880d3 100644
> --- a/config/Makefile.am
> +++ b/config/Makefile.am
> @@ -28,8 +28,8 @@
>  ## the GNU Lesser General Public License along with this program.  If
>  ## not, see .
>  ##
> -EXTRA_DIST = elfutils.spec.in known-dwarf.awk 10-default-yama-scope.conf
> -  libelf.pc.in libdw.pc.in
> +EXTRA_DIST = elfutils.spec.in known-dwarf.awk 10-default-yama-scope.conf \
> + libelf.pc.in libdw.pc.in libdebuginfod.pc.in debuginfod.service 
> debuginfod.sysconfig
>  
>  pkgconfigdir = $(libdir)/pkgconfig
>  pkgconfig_DATA = libelf.pc libdw.pc libdebuginfod.pc

OK.

> diff --git a/config/debuginfod.service b/config/debuginfod.service
> new file mode 100644
> index ..15a8808baf0c
> --- /dev/null
> +++ b/config/debuginfod.service
> @@ -0,0 +1,15 @@
> +[Unit]
> +Description=elfutils debuginfo-over-http server
> +Documentation=http://elfutils.org/
> +After=network.target
> +
> +[Service]
> +EnvironmentFile=/etc/sysconfig/debuginfod
> +User=debuginfod
> +Group=debuginfod
> +#CacheDirectory=debuginfod
> +ExecStart=/usr/bin/debuginfod -d /var/cache/debuginfod/debuginfod.sqlite -p 
> $DEBUGINFOD_PORT $DEBUGINFOD_VERBOSE $DEBUGINFOD_PATHS
> +TimeoutStopSec=10

Why is CacheDirectory commented out, I cannot find it anywhere else in
the code.

> +[Install]
> +WantedBy=multi-user.target
> diff --git a/config/debuginfod.sysconfig b/config/debuginfod.sysconfig
> new file mode 100644
> index ..bee42a4f443a
> --- /dev/null
> +++ b/config/debuginfod.sysconfig
> @@ -0,0 +1,9 @@
> +#
> +DEBUGINFOD_PORT="8002"
> +#DEBUGINFOD_VERBOSE="-v"
> +DEBUGINFOD_PATHS="/usr/lib/debug /usr/bin /usr/sbin /usr/lib /usr/lib64 
> /usr/local"

Should this also include /usr/libexec ?
Isn't /usr/local too broad? Should it also include /opt and/or /srv?

> +# upstream debuginfods
> +#DEBUGINFOD_URLS="http://secondhost:8002 http://thirdhost:8002";
> +#DEBUGINFOD_TIMEOUT="5"
> +#DEBUGINFOD_CACHE_DIR=""
> diff --git a/config/elfutils.spec.in b/config/elfutils.spec.in
> index 6771d13ba740..694f8fde2f48 100644
> --- a/config/elfutils.spec.in
> +++ b/config/elfutils.spec.in
> @@ -23,10 +23,22 @@ BuildRequires: flex >= 2.5.4a
>  BuildRequires: bzip2
>  BuildRequires: m4
>  BuildRequires: gettext
> -BuildRequires: zlib-devel
> +BuildRequires: pkgconfig(zlib)
> +%if 0%{?rhel} == 7
>  BuildRequires: bzip2-devel
> -BuildRequires: xz-devel
> +%else
> +BuildRequires: pkgconfig(bzip2)
> +%endif
> +BuildRequires: pkgconfig(liblzma)

So now we only need the -devel package for bzip2 because it might not
have a pkgconfig file. Good.

I am slightly confused about xz-devel vs liblzma.
Could you remind me again what the relation was?
I see we do check for liblzma with pkgconfig in configure.ac.
So I assume the change is correct. Just confused about the naming.

>  BuildRequires: gcc-c++
> +BuildRequires: pkgconfig(libmicrohttpd) >= 0.9.33
> +BuildRequires: pkgconfig(libcurl) >= 7.29.0
> +BuildRequires: pkgconfig(sqlite3) >= 3.7.17
> +BuildRequires: pkgconfig(libarchive) >= 3.1.2
> +%if 0%{?rhel} >= 8 || 0%{?fedora} >= 20
> +Recommends: elfutils-debuginfod-client
> +%endif
> +

Should we add %else Requires: elfutils-debuginfod-client?
Also I think it would be more correct to move the Recommends/Requires up with 
the other Requires.
(Side note, in fedora we actually have a separate libs subpackage, in fedora it 
should be moved there.)
 
>  %define _gnu %{nil}
>  %define _programprefix eu-
> @@ -116,18 +128,45 @@ interprocess services, communication and introspection
>  (like synchronisation, signaling, debugging, tracing and
>  profiling) of processes.
>  
> +%package debuginfod-client
> +Summary: Libraries and command-line frontend for HTTP ELF/DWARF file server 
> addressed by build-id.
> +License: GPLv2+

That should probably be:
GPLv3+ and (GPLv2+ or LGPLv3+)
^ for the binary, ^ for the library

> +%package debuginfod
> +Summary: HTTP ELF/DWARF file server addressed by build-id.
> +License: GPLv2

Re: patch 2/2 debuginfod server etc.

2019-11-13 Thread Mark Wielaard
On Mon, 2019-10-28 at 15:07 -0400, Frank Ch. Eigler wrote:
> Add the server to the debuginfod/ subdirectory.  This is a highly
> multithreaded c++11 program (still buildable on rhel7's gcc 4.8,
> which is only partly c++11 compliant).  Includes an initial suite
> of tests, man pages, and a sample systemd service.

OK, some comments about the debuginfod dir additions.
Except the actual code...

> diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
> index 1a31cf6f4e27..b5679a2f9d16 100644
> --- a/debuginfod/ChangeLog
> +++ b/debuginfod/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-10-28  Frank Ch. Eigler  
> +
> + * debuginfod.cxx: New file: debuginfod server.
> + * debuginfod.8: New file: man page.
> + * Makefile.am: Build it.

Like with the client manpages, I think they should go into the doc/ dir
just because all manpages are there at the moment.

>   * debuginfod-client.c: New file: debuginfod client library.
> diff --git a/debuginfod/Makefile.am b/debuginfod/Makefile.am
> index 3a4e94da2ad0..6e5c7290e68d 100644
> --- a/debuginfod/Makefile.am
> +++ b/debuginfod/Makefile.am
> @@ -55,9 +55,14 @@ libeu = ../lib/libeu.a
>  
>  AM_LDFLAGS = -Wl,-rpath-link,../libelf:../libdw:.
>  
> -bin_PROGRAMS = debuginfod-find
> +bin_PROGRAMS = debuginfod debuginfod-find
> +debuginfod_SOURCES = debuginfod.cxx
> +debuginfod_cxx_CXXFLAGS = -Wno-unused-functions

Should that be debuginfod_CXXFLAGS?
Why are there unused functions are there?

> +# g++ 4.8 on rhel7 otherwise errors gthr-default.h / 
> __gthrw2(__gthrw_(__pthread_key_create) undefs

Could you explain this comment a bit more?

> +dist_man8_MANS = debuginfod.8

As stated above, I think this should be moved to doc/

>  dist_man3_MANS = debuginfod_find_debuginfo.3 debuginfod_find_source.3 
> debuginfod_find_executable.3
>  dist_man1_MANS = debuginfod-find.1
> +debuginfod_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) 
> $(libmicrohttpd_LIBS) $(libcurl_LIBS) $(sqlite3_LIBS) $(libarchive_LIBS) 
> -lpthread -ldl
>
>  debuginfod_find_SOURCES = debuginfod-find.c
>  debuginfod_find_LDADD = $(libeu) $(libdebuginfod)

OK.

> diff --git a/debuginfod/debuginfod.8 b/debuginfod/debuginfod.8
> new file mode 100644
> index ..02059a115430
> --- /dev/null
> +++ b/debuginfod/debuginfod.8
> @@ -0,0 +1,332 @@
> +'\"! tbl | nroff \-man
> +'\" t macro stdmacro
> +
> +.de SAMPLE
> +.br
> +.RS 0
> +.nf
> +.nh
> +..
> +.de ESAMPLE
> +.hy
> +.fi
> +.RE
> +..
> +
> +.TH DEBUGINFOD 8
> +.SH NAME
> +debuginfod \- debuginfo-related http file-server daemon

Right. section 8 because it is system administration command.

> +.SH SYNOPSIS
> +.B debuginfod
> +[\fIOPTION\fP]... [\fIPATH\fP]...
> +
> +.SH DESCRIPTION
> +\fBdebuginfod\fP serves debuginfo-related artifacts over HTTP.  It
> +periodically scans a set of directories for ELF/DWARF files and their
> +associated source code, as well as RPM files containing the above, to
> +build an index by their buildid.  This index is used when remote
> +clients use the HTTP webapi, to fetch these files by the same buildid.
> +
> +If a debuginfod cannot service a given buildid artifact request
> +itself, and it is configured with information about upstream
> +debuginfod servers, it queries them for the same information, just as
> +\fBdebuginfod-find\fP would.  If successful, it locally caches then
> +relays the file content to the original requester.
> +
> +Each listed PATH creates a thread to scan for matching
> +ELF/DWARF/source files under the given directory.  Source files are
> +matched with DWARF files based on the AT_comp_dir (compilation
> +directory) attributes inside it.  Duplicate directories are ignored.
> +You may use a file name for a PATH, but source code indexing may be
> +incomplete; prefer using a directory that contains the binaries.

So you give it directories with executables and directories with debug
files. They are indexed if they have a build-id. You never provide a
sources dir, but if something is indexed as a debug file then any
source file referenced from it can be requested.

In theory you only need to provide the executable dirs, from that you
can normally get the separate .debug files.

Why didn't you do that (only providing executable dirs)?
And why do you allow sources to be found indirectly?
Would it make sense to restrict the sources (prefix) dir?

The questions above are simply because it makes me nervous that some,
but not all, files that can be requested can indirectly be found
anywhere. If you have an explicit debug dir, then it might make sense
to also have an explicit sources dir.

In general I think it would be good to think about a selinux security
context for debuginfod. This is not urgent, but I think we should see
if we can get someone to do a security audit.

One concern I have is that if a process is run in a restricted security
context in which it might not be able to access certain files, but it
might have access to a local socket to debuginfod, then it can get
access to files it

Re: patch 1/2 debuginfod client

2019-11-13 Thread Frank Ch. Eigler
Hi -

> Hurrah! Documentation! Thanks.
> 
> But given that all other documentation is under doc/ could you move it
> there (guarded by DEBUGINFOD). It is just more consistent. If you leave
> it in this subdir I think you should also move the existing
> documentation files (and I think that is not worth it at the moment).

... ok, tried moving.  ... but but but that subdirectory is not ready
for rpm installations because it uses automake constructs that don't
exist ("notrans_dist_*_man*").  The notrans bit is needed because some
(and only SOME) elfutils artifacts will be renamed as a consequence of
the program-prefix manipulations.  automake has only limited support
for mix & match, and none for man pages.

So for debuginfod, the prefix stuff is mechanically overridden for the
whole directory (binaries + man pages).  That works fine.  For doc/,
if everything were in there - man pages for elfutils.3 functions (not
prefixed) and man pages for elfutils.1 binaries (prefixed), there will
be sadness and tears.  TEARS.  We'd need two doc/ directories.


> > +# automake std-options override: arrange to pass LD_LIBRARY_PATH
> > +installcheck-binPROGRAMS: $(bin_PROGRAMS)
> > +   bad=0; pid=; list="$(bin_PROGRAMS)"; for p in $$list; do \
> > + case ' $(AM_INSTALLCHECK_STD_OPTIONS_EXEMPT) ' in \
> > +  *" $$p "* | *" $(srcdir)/$$p "*) continue;; \
> > + esac; \
> > + f=`echo "$$p" | \
> > +sed 's,^.*/,,;s/$(EXEEXT)$$//;$(transform);s/$$/$(EXEEXT)/'`; \
> > + for opt in --help --version; do \
> > +   if LD_LIBRARY_PATH=$(DESTDIR)$(libdir) \
> > +  $(DESTDIR)$(bindir)/$$f $$opt > c$${pid}_.out 2> c$${pid}_.err \
> > +&& test -n "`cat c$${pid}_.out`" \
> > +&& test -z "`cat c$${pid}_.err`"; then :; \
> > +   else echo "$$f does not support $$opt" 1>&2; bad=1; fi; \
> > + done; \
> > +   done; rm -f c$${pid}_.???; exit $$bad
> 
> I see we also do this in src/Makefile.am but, ehe, why?

The --help / --version bit is apparently there to confirm that every
elfutils binary supports those two options at least.  Can remove if you
like.


> > +#include 
> 
> O, o... older (glibc) fts is problematic on 32bit systems when using
> large file offsets. See libdwfl/linux-kernel-modules.c. See the BAD_FTS
> check in configure.ac
> 
>  Older glibc had a broken fts that didn't work with Large File Systems.
>  We want the version that can handler LFS, but include workaround if we
>  get a bad one. Add define to CFLAGS (not AC_DEFINE it) since we need to
>  check it before including config.h (which might define _FILE_OFFSET_BITS).

OK, copied over the debuginfod.cxx BAD_FTS stuff.


> Have you tried to build and run on i686 on Debian stable or CentOS7?

No.


> > +static const int max_build_id_bytes = 256; /* typical: 40 for gnu C 
> > toolchain */

OK, replacing this with  throughout.


> > +  /* URL queried by this handle.  */
> > +  char url[PATH_MAX];
> 
> Are you sure that is the correct limit?
> Could it be made dynamic?

Could switch to self-allocating printfs throughout.


> > +/* Query each of the server URLs found in $DEBUGINFOD_URLS for the file
> > +   with the specified build-id, type (debuginfo, executable or source)
> > +   and filename. filename may be NULL. If found, return a file
> > +   descriptor for the target, otherwise return an error code.  */
> 
> You don't describe PATH. I assume the caller is responsible for freeing
> it? Is it set on error?

The man page covers this issue.


> > +  char cache_path[PATH_MAX];
> > +  char maxage_path[PATH_MAX*3]; /* These *3 multipliers are to shut up gcc 
> > -Wformat-truncation */
> > +  char interval_path[PATH_MAX*4];
> > +  char target_cache_dir[PATH_MAX*2];
> > +  char target_cache_path[PATH_MAX*3];
> > +  char target_cache_tmppath[PATH_MAX*4];
> > +  char suffix[PATH_MAX];
> 
> Ah, lots more uses of PATH_MAX. Now my worry is kind of the opposite.
> These are all stack allocated. Won't that blow up the stack?

Well, not that much.  We expect the web urls to be relatively short: a
couple dozen chars for buildid etc., and a source-file path, which I
suppose could be as long as PATH_MAX.  But modern glibc/gcc like to
warn when building strings out of pieces as large as PATH_MAX, even
if we are okay with truncation or whatever.

But really, alloc-on-the-fly type printf would be ideal here.  I am
pretty sure I looked for one being used in elfutils before but
couldn't find one just right.


> > +  char build_id_bytes[max_build_id_bytes * 2 + 1];
> > +
> > +  /* Copy lowercase hex representation of build_id into buf.  */
> > +  if ((build_id_len >= max_build_id_bytes) ||
> > +  (build_id_len == 0 &&
> > +   sizeof(build_id_bytes) > max_build_id_bytes*2 + 1))
> > +return -EINVAL;
> 
> I wonder if you also want to check for a build_id that is simply too
> small. Since build_id are supposed to be globally unique it doesn't
> really make sense to have a build_id length that is too small. libdwfl
> for

Re: patch 1/2 debuginfod client

2019-11-13 Thread Frank Ch. Eigler
Hi -

> > It'd be fractions of a second per configure run ... worth worrying
> > about?
> 
> Probably not, but if it is easy to hide after the check, why not do it?
> Don't if it requires nasty trickery though.

OK.

> > > > +PKG_PROG_PKG_CONFIG
> > > > +AC_ARG_ENABLE([debuginfod], AC_HELP_STRING([--enable-debuginfod], 
> > > > [Build debuginfo server and client solib]))
> > > > +AS_IF([test "x$enable_debuginfod" = "xyes"], [
> [...]
> 
> I think it is better to enable it by default and error out when the
> dependencies aren't found. Just add an explicit notice that people can
> disable it when they don't want it.
> AC_MSG_NOTICE([checking debuginfod dependencies, disable to skip])
> or something like that. 

OK  well wait, that'll slow everything down by default after all. :-)

- FChE