date:20240410

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

2024-04-10 Thread Frank Ch. Eigler

Hi -

> [...] Specifically, I am interested in understanding how the
> Debuginfod Server can facilitate the direct download of source code
> from Git repositories.  [...]

Can you explain under what situations you think this would be helpful?

The main challenge is that compilers don't generally have such
information (e.g. git repo + file commit/hash) available to them when
they build the object files, so it doesn't show up in DWARF, and
therefore it doesn't show up in packaged binaries.  Thus debuginfod
can't pass the info to clients or a version control system.

- FChE

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

2024-04-10 Thread Matt Schulte

Hi Tobias,

I wanted to provide an alternative solution that may work for you
(with or without support from debuginfod).

Support for reading source code directly from a code hosting service
(such as GitHub) has been supported for years by Microsoft via
SourceLink[1]. In 2018, a proposal[2] was submitted to the DWARF
standard to include sourcelink information in the DWARF format. In
2022, a modification to the proposal[2] was accepted and is in the
current working draft for DWARF6[3]. The proposal is to place the
source URL for any file with an available URL in the DWARF
information.

There is still a lot of work to do. The DWARF6 specification needs to
be published, compilers (or external tools) need to support adding the
URLs into the DWARF info, and debuggers need to support downloading
the source files. But it could provide a path for you (or debuginfod)
to support reading directly from the upstream source repository.

> Can you explain under what situations you think this would be helpful?

I can only think about situations that involve closed source software.
In my experience, companies shipping closed source software have
restrictions about where source code can be placed.

-Matt
[1]: 
https://learn.microsoft.com/en-us/cpp/build/reference/sourcelink?view=msvc-170
[2]: https://dwarfstd.org/issues/181223.1.html
[3]: 
https://snapshots.sourceware.org/dwarfstd/dwarf-spec/latest/dwarf6-20240227-2154.pdf

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

2024-04-10 Thread Frank Ch. Eigler

Hi -

> Support for reading source code directly from a code hosting service
> (such as GitHub) has been supported for years by Microsoft via
> SourceLink[1]. [...]

Thanks for that reminder.  Yeah, eventually that could be an alternate
way.

> [...]
> > Can you explain under what situations you think this would be helpful?
> 
> I can only think about situations that involve closed source software.
> In my experience, companies shipping closed source software have
> restrictions about where source code can be placed.

i.e., a situation where even an internal debuginfod instance cannot be
trusted with the source code, but some version control system can be?


- FChE

Re: [PATCH] libdw: dwarf_getsrcfiles should not imply dwarf_getsrclines

2024-04-10 Thread Mark Wielaard

Hi Aaron,

On Tue, 2024-04-09 at 23:45 -0400, Aaron Merey wrote:
> dwarf_getsrcfiles causes line data to be read in addition to file data.
> This is wasteful for programs which only need file or directory names.
> Debuginfod server is one such example.
> 
> Fix this by moving the srcfile reading in read_srclines into a separate
> function read_srcfiles.  This change improves debuginfod server's max
> resident set size by up to 75% during rpm indexing.
> 
>   * libdw/dwarf_getsrcfiles.c (dwarf_getsrcfiles): Replace
>   dwarf_getsrclines and __libdw_getsrclines with
>   __libdw_getsrcfiles.
>   * libdw/dwarf_getsrclines.c (read_line_header): New function.
>   (read_srcfiles): New function.
>   (read_srclines): Move srcfile reading into read_srcfiles.
>   Add parameter to use cached srcfiles if available.
>   Also merge srcfiles with any files from DW_LNE_define_file.
>   (__libdw_getsrclines): Changed to call get_lines_or_files.
>   (__libdw_getsrcfiles): New function.  Calls get_lines_or_files.
>   (get_lines_or_files): New function based on the old
>   __libdw_getsrclines.  Call read_srcfiles if linesp is NULL,
>   otherwise call read_srclines.  Pass previously read srcfiles
>   to read_srclines if available.
>   * libdw/dwarf_macro_getsrcfiles.c (dwarf_macro_getsrcfiles):
>   Replace __libdw_getsrclines with __libdw_getsrcfiles.
>   * libdw/libdwP.h (__libdw_getsrcfiles): New declaration.
>   * tests/.gitignore: Add new test binary.
>   * tests/get-files.c: Verify that dwarf_getsrcfiles does
>   not cause srclines to be read.
>   * tests/get-lines.c: Verify that srclines can be read
>   after srcfiles have been read.
>   * tests/Makefile.am: Add new testfiles.
>   * tests/get-files-define-file.c: Print file names before
>   and after reading DW_LNE_define_file.
>   * tests/run-get-files.sh: Add get-files-define-file test.
>   * tests/testfile-define-file.bz2: New testfile.  Copy of
>   testfile36.debug but with a line program consisting of two
>   DW_LNE_define_file opcodes.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=27405
> 
> Signed-off-by: Aaron Merey 
> ---
> 
> v2 changes:
> Restored support for DW_LNE_define_file.

Great. And sorry I first suggested to just drop it and then said I
would like it back. This was more work than I though.

> Added DW_LNE_define_file testcase and testfile.

Nice. How did you edit this file?

> Added new function get_lines_or_files which is based on the old
> __libdw_getsrclines.  __libdw_getsrclines and __libdw_getsrcfiles now
> call get_lines_or_files.

It is just a simple rename, but so much nicer to my eyes. Thanks.

> Mentioned max resident set size improvement in the commit message.  I
> did not mention a 5% speed up to rpm indexing that I have previously
> talked about since I could not reliably reproduce it with v2 of the patch.
> I suspect that speed up may have been due to the smaller max RSS reducing
> swapping during that round of testing.

Makes sense. And a reduction of 75% of the max resident size it pretty
huge in itself.

> +case DW_LNE_define_file:
> +  {
> +char *fname = (char *) linep;
> +uint8_t *endp = memchr (linep, '\0', lineendp - linep);
> +if (endp == NULL)
> +  goto invalid_data;
> +size_t fnamelen = endp - linep;
> +linep = endp + 1;
> +
> +unsigned int diridx;
> +if (unlikely (linep >= lineendp))
> +  goto invalid_data;
> +get_uleb128 (diridx, linep, lineendp);
> +
> + size_t ndirs = (*filesp)->ndirs;

This one line uses tabs for indentation, all others spaces.

> +if (unlikely (diridx >= ndirs))
> +  {
> +__libdw_seterrno (DWARF_E_INVALID_DIR_IDX);
> +goto invalid_data;
> +  }
> +Dwarf_Word mtime;
> +if (unlikely (linep >= lineendp))
> +  goto invalid_data;
> +get_uleb128 (mtime, linep, lineendp);
> +Dwarf_Word filelength;
> +if (unlikely (linep >= lineendp))
> +  goto invalid_data;
> +get_uleb128 (filelength, linep, lineendp);
> +
> + /* Add new_file to filelist that will be merged with filesp.  */
> +struct filelist *new_file = malloc (sizeof (struct 
> filelist));
> + if (unlikely (new_file == NULL))
> {
> - __libdw_seterrno (DWARF_E_INVALID_DIR_IDX);
> - goto invalid_data;
> + __libdw_seterrno (DWARF_E_NOMEM);
> + goto out;
> }

And this last block uses tabs again.

> - Dwarf_Word mtime;
> - if (unlikely (linep >= lineendp))
> -   goto invali

Re: [PATCH] libdw: dwarf_getsrcfiles should not imply dwarf_getsrclines

2024-04-10 Thread Aaron Merey

Hi Mark,

On Wed, Apr 10, 2024 at 3:48 PM Mark Wielaard  wrote:
> > v2 changes:
> > Restored support for DW_LNE_define_file.
>
> Great. And sorry I first suggested to just drop it and then said I
> would like it back. This was more work than I though.

No problem, better if we support this just in case.

> This one line uses tabs for indentation, all others spaces.
> [...]
> And this last block uses tabs again.
> [...]
> Here tabs again, then the rest spaces.
> [...]
> So inconsistent indentation, but the code looks good.

Fixed the indentation.

> > -  /* Pass the file data structure to the caller.  */
> > -  if (filesp != NULL)
> > -*filesp = files;
> > +  const char **newdirs = (void *) &newfiles->info[nnewfiles];
> > +  const char **prevdirs = (void *) &prevfiles->info[nprevfiles];
> > +
> > +  /* Copy prevdirs to newdirs.  */
> > +  for (size_t n = 0; n < ndirs; n++)
> > + newdirs[n] = prevdirs[n];
>
> Again slightly off indentation.
> But I also don't fully follow the prevdirs/newdirs copying.
> Why is this? No newdirs are defined here, are there?
> Maybe I don't understand the data-structure used here.

The directories are the same but we still need to copy them so that
dwarf_getsrcdirs can find newfiles' dir names.

Dwarf_Files is an unusual structure since it doesn't contain a member
specifically for the array of dirnames.  Instead they're stored at
the end of the Dwarf_Fileinfo array member.

> So testfile-define-file is actually testfile36.debug but with a new
> line program? How did you edit/insert that one?

I used xxd to create a hexdump of testfile36.debug and modified the line
program by hand with a text editor.  I converted the modified hexdump
back to a binary with xxd -r.

I chose testfile36.debug because its .debug_line includes multiple
directory and file names.  It also contains a single short line program
that was easy to replace with two DW_LNE_define_file opcodes without
corrupting things.

Aaron

Re: [rfc] [patch] PR28204: debuginfod ima signature verification

2024-04-10 Thread Frank Ch. Eigler

Hi, Mark -


> > - to drop "permissive" mode
> 
> We discussed a bit on irc about "wording". But I think it isn't really
> how it is worded, but that there is just different features. What is
> called "enforcing" is an authenticity scheme. While "permissive" is
> more like an (optional) error-detecting mode. IMHO it makes sense to
> simply separate those.

For the record, on IRC, I presented these two additional arguments:


in the case of federated debuginfod servers, such as
debuginfod.elfutils.org, a user is stuck with "ignore" mode
operation, because not all upstream servers will have ima
signatures to pass.  this sacrifices all possible assurance that
could come from the federated server relaying ima signatures from
those servers that have them

even in non-federated cases such as fedoraproject.org, the ideal
case for "enforcing" mode as a default, it will fail the moment
the user happens to try to debug some pre-fedora-38 binary that
may be sitting on the system --- because those rpms just don't
have signatures available at all

for both these scenarios, the original "permissive" mode would be
suitable


IOW, without a "permissive" mode being available at all, we could not
ask users to enable this new code at all for our own federated
servers, nor even for fedora.  That's because no server can guarantee
the availability of signatures for all content they can serve.


> That way you don't have a authentication scheme that is easily
> defeated (when put in "permissive" mode).

This "easily defeated" theory needs a threat model.  It sounds like
you are thinking of
- an evil debuginfod instance that
- you trust enough to list in DEBUGINFOD_URLS
- but not enough to label it with "ima:enforcing"
- which may strip the X-DEBUGINFOD-FILE-SIGNATURE header" en route
then yeah, but sounds far-fetched rather than easy.


> And you can implement a simpler error-detection mode that can work
> in more cases (by using the executable .gnu_debuglink CRC)

No, we already know that this is incapable of checking e.g. source
files.  And there is no separate CRC for executables vs. debuginfo
files.  And note that this provides zero protection in the same threat
model above (as the CRC field could be MITM'd).


> [...]
> One "big picture" question is whether this should be a per server URL
> policy or something that is enabled/disabled for all server URLs?
> That makes it less "flexible" but should simplify things a bit for the
> user (and the server urls parsing).

Blame some guy named Mark for getting Ryan to build that out last year. :-)

https://sourceware.org/pipermail/elfutils-devel/2023q3/006299.html

> Also is this all rpm/koji specific? Or do other distros also use ima
> signatures, but encode/store them differently?

As far as I know, Fedora/RHEL are the first.  Yeah it's all
package/style specific.


> [...]
> We now also have a fish profile.

I'm afraid I don't speak fish. :-)


> > +debuginfod_ima_verification_enabled="no"
> > +if test "$enable_ima_verification" = "xrpmimaevmcrypto"; then
> > +  debuginfod_ima_verification_enabled="yes"
> > +  default_ima_cert_path=`eval echo 
> > "/etc/keys/ima:/etc/pki/rpm-ima:$sysconfdir/debuginfod/ima-certs"` # expand 
> > $prefix too
> > +  AC_DEFINE([ENABLE_IMA_VERIFICATION], [1], [Define if the required ima 
> > verification libraries are available])
> > +  AC_DEFINE_UNQUOTED(DEBUGINFOD_IMA_CERT_PATH_DEFAULT, 
> > "$default_ima_cert_path", [Default IMA certificate path])
> > +fi
> > +AM_CONDITIONAL([ENABLE_IMA_VERIFICATION],[test "$enable_ima_verification" 
> > = "xrpmimaevmcrypto"])
> > +
> 
> Not all these are needed anymore now are they?
> [...]

The elaborate default_ima_cert_path?  Yeah probably not, just force
Fedora etc. to use the configure parameter.  The AM_CONDITIONAL
"xrpmimaevmcrypto" part?  Yeah we need those.  (The iamevm part
is just for the headers.)

At some point, I'd like to reformat the debuginfod code base, it's
become a bit of a mishmash.  Separate commit later, I guess?


- FChE

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

2024-04-10 Thread Matt Schulte

> i.e., a situation where even an internal debuginfod instance cannot be
> trusted with the source code, but some version control system can be?

That was my thought exactly.

On Wed, Apr 10, 2024 at 9:07 AM Frank Ch. Eigler  wrote:
>
> Hi -
>
> > Support for reading source code directly from a code hosting service
> > (such as GitHub) has been supported for years by Microsoft via
> > SourceLink[1]. [...]
>
> Thanks for that reminder.  Yeah, eventually that could be an alternate
> way.
>
> > [...]
> > > Can you explain under what situations you think this would be helpful?
> >
> > I can only think about situations that involve closed source software.
> > In my experience, companies shipping closed source software have
> > restrictions about where source code can be placed.
>
> i.e., a situation where even an internal debuginfod instance cannot be
> trusted with the source code, but some version control system can be?
>
>
> - FChE
>

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

Re: [PATCH] libdw: dwarf_getsrcfiles should not imply dwarf_getsrclines

Re: [PATCH] libdw: dwarf_getsrcfiles should not imply dwarf_getsrclines

Re: [rfc] [patch] PR28204: debuginfod ima signature verification

Re: Inquiry about Development Features and Integration of Source Code Version Control Systems

7 matches

Site Navigation

Mail list logo

Footer information