[PATCH v3 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval This is v3 of my patch series optimizing debuginfod for kernel debuginfo. v1 is here [7], v2 is here [8]. This version fixes a couple of minor bugs and adds test cases. Changes from v2 to v3: - Added a test case with seekable rpm and deb files. - Added a couple of independ

[PATCH v3 4/7] debuginfod: add new table and views for seekable archives

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval In order to extract a file from a seekable archive, we need to know where in the uncompressed archive the file data starts and its size. Additionally, in order to populate the response headers, we need the file modification time (since we won't be able to get it from the archi
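To make the bookkeeping described in this patch concrete, here is a minimal sketch of an index that stores, per archive member, the data offset within the uncompressed archive, the size, and the modification time. The table name `seekable_index`, its column names, and the helper functions are all hypothetical; the actual debuginfod schema is not shown in this snippet and may differ.

```python
import sqlite3


def make_index(conn):
    # Hypothetical schema: table and column names are illustrative only,
    # not the ones used by debuginfod.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seekable_index ("
        " archive TEXT NOT NULL,"
        " member TEXT NOT NULL,"
        " data_offset INTEGER NOT NULL,"  # where the file data starts (uncompressed)
        " size INTEGER NOT NULL,"         # length of the file data in bytes
        " mtime INTEGER NOT NULL,"        # kept for the response headers
        " PRIMARY KEY (archive, member))"
    )


def record_member(conn, archive, member, data_offset, size, mtime):
    # Insert or refresh the entry for one file inside one archive.
    conn.execute(
        "INSERT OR REPLACE INTO seekable_index VALUES (?, ?, ?, ?, ?)",
        (archive, member, data_offset, size, mtime),
    )


def lookup_member(conn, archive, member):
    # Returns (data_offset, size, mtime), or None if the member is unknown.
    return conn.execute(
        "SELECT data_offset, size, mtime FROM seekable_index"
        " WHERE archive = ? AND member = ?",
        (archive, member),
    ).fetchone()
```

With such a row in hand, a server would only need to decompress the byte range `[data_offset, data_offset + size)` instead of walking the whole archive.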

[PATCH v3 5/7] debuginfod: optimize extraction from seekable xz archives

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval The kernel debuginfo packages on Fedora, Debian, and Ubuntu, and many of their downstreams, are all compressed with xz in multi-threaded mode, which allows random access. We can use this to bypass the full archive extraction and dramatically speed up kernel debuginfo requests
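The reason multi-threaded xz output allows random access is that the data is split into independently compressed blocks. The following is a conceptual sketch of that property only: it compresses fixed-size chunks as separate xz streams and keeps its own offset index, which is not the xz block format and not how liblzma or debuginfod do it. The function names and the index layout are assumptions made for illustration.

```python
import lzma


def compress_chunks(data, chunk_size):
    """Compress data as independent chunks and return (blob, index).

    Each index entry is
    (uncompressed_offset, uncompressed_size, compressed_offset, compressed_size).
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), chunk_size):
        raw = data[start:start + chunk_size]
        packed = lzma.compress(raw)  # every chunk is decodable on its own
        index.append((start, len(raw), len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_range(blob, index, offset, size):
    """Return data[offset:offset + size], decompressing only overlapping chunks."""
    end = offset + size
    out = bytearray()
    for u_off, u_size, c_off, c_size in index:
        if u_off + u_size <= offset or u_off >= end:
            continue  # chunk lies entirely outside the requested range
        raw = lzma.decompress(blob[c_off:c_off + c_size])
        lo = max(offset, u_off) - u_off
        hi = min(end, u_off + u_size) - u_off
        out += raw[lo:hi]
    return bytes(out)
```

Reading a small file out of a large archive then costs time proportional to the chunks it overlaps rather than to the archive size, which is the effect the patch exploits.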

[PATCH v3 7/7] debuginfod: populate _r_seekable on request

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Since the schema change adding _r_seekable was done in a backward compatible way, seekable archives that were previously scanned will not be in _r_seekable. Whenever an archive is going to be extracted to satisfy a request, check if it is seekable. If so, populate _r_seekabl

[PATCH v3 6/7] debuginfod: populate _r_seekable on scan

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Whenever a new archive is scanned, check if it is seekable with a little liblzma magic, and populate _r_seekable if so. With this, newly scanned seekable archives will use the optimized extraction path added in the previous commit. Also add a test case using some artificial

[PATCH v3 1/7] debuginfod: fix skipping source file

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval dwarf_extract_source_paths explicitly skips source files that equal "", but dwarf_filesrc may return a path like "dir/". Check for and skip that case, too. In particular, the test debuginfod RPMs have paths like this. However, the test cases didn't catch this because they ha
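The check described here can be sketched as a small filter. This is a Python illustration under the assumption that a directory-only name is recognized by its trailing slash; the real change lives in debuginfod's C++ code, and the function names below are hypothetical.

```python
def should_skip_source_path(path):
    # Skip empty names and directory-only names such as "dir/".
    # Assumption: a trailing "/" is what marks a path with no file component.
    return path == "" or path.endswith("/")


def filter_source_paths(paths):
    # Keep only entries that name an actual source file.
    return [p for p in paths if not should_skip_source_path(p)]
```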

[PATCH v3 3/7] debuginfod: factor out common code for responding from an archive

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval handle_buildid_r_match has two very similar branches where it optionally extracts a section and then creates a microhttpd response. In preparation for adding a third one, factor it out into a function. Signed-off-by: Omar Sandoval --- debuginfod/debuginfod.cxx | 213 ++

[PATCH v3 2/7] tests/run-debuginfod-fd-prefetch-caches.sh: disable fdcache limit check

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Since commit acd9525e93d7 ("PR31265 - rework debuginfod archive-extract fdcache"), the fdcache limit is only applied when a new file is interned and it has been at least 10 seconds since the limit was last applied. This means that the fdcache can go over the limit temporarily.
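A small model shows why the cache can exceed its limit for a while under this policy. The sketch below is not debuginfod's fdcache: the class name, the entry-count limit, and the injectable clock are assumptions chosen to keep the example short and deterministic. Only the two conditions named in the message (a new file is interned, and at least 10 seconds have passed since the limit was last applied) are taken from the source.

```python
import collections
import time


class LazyLimitedCache:
    """Cache whose size limit is enforced only on intern, at most once per interval."""

    def __init__(self, max_entries, interval=10.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.interval = interval
        self.clock = clock  # injectable so the behavior can be tested
        self.entries = collections.OrderedDict()
        self.last_limit = clock()

    def intern(self, key, value):
        # Add or refresh the entry, newest last.
        self.entries[key] = value
        self.entries.move_to_end(key)
        now = self.clock()
        # The limit is applied only if enough time has elapsed.
        if now - self.last_limit >= self.interval:
            self.apply_limit()
            self.last_limit = now

    def apply_limit(self):
        # Evict oldest entries until the cache fits again.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)
```

Between two enforcement points, any number of interns can push the cache past `max_entries`, so a test that asserts the limit at an arbitrary instant can fail even though the cache is behaving as designed.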

Re: [PATCH 7/9 v2] libdw: Make libdw_findcu thread-safe

2024-07-19 Thread Mark Wielaard
Hi, On Wed, 2024-07-17 at 18:34 -0400, Aaron Merey wrote: > From: Heather McIntyre > > * libdw/libdw_findcu.c (__libdw_findcu): Use eu_tfind > and dwarf_lock > (__libdw_intern_next_unit): Use per-Dwarf_CU locks. > > Signed-off-by: Heather S. McIntyre > Signed-off-by: Aaro

Re: [PATCH 3/9 v2] lib: Add eu_tsearch, eu_tfind, eu_tdelete and eu_tdestroy

2024-07-19 Thread Mark Wielaard
Hi, On Wed, 2024-07-17 at 18:34 -0400, Aaron Merey wrote: > From: Heather McIntyre > > Add new struct search_tree to hold tree root and lock. Add new eu_t* > functions for ensuring synchronized tree access. > > Replace tsearch, tfind, etc with eu_t* equivalents. > > Move the rwlock_* macros o

Re: [PATCH v3 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-19 Thread Frank Ch. Eigler
Hi - > This is v3 of my patch series optimizing debuginfod for kernel > debuginfo. v1 is here [7], v2 is here [8]. This version fixes a couple > of minor bugs and adds test cases. [...] Thanks, LGTM, running through try-buildbots to make sure. - FChE

Re: [PATCH v3 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-19 Thread Omar Sandoval
On Fri, Jul 19, 2024 at 01:34:48PM -0400, Frank Ch. Eigler wrote: > Hi - > > > This is v3 of my patch series optimizing debuginfod for kernel > > debuginfo. v1 is here [7], v2 is here [8]. This version fixes a couple > > of minor bugs and adds test cases. [...] > > Thanks, LGTM, running through

[PATCH v4 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval This is v4 of my patch series optimizing debuginfod for kernel debuginfo. v1 is here [1], v2 is here [2], v3 is here [3]. The only changes from v3 in this version are fixing a bogus maybe-uninitialized error on the Debian build and adding the new test files to EXTRA_DIST so

[PATCH v4 1/7] debuginfod: fix skipping source file

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval dwarf_extract_source_paths explicitly skips source files that equal "", but dwarf_filesrc may return a path like "dir/". Check for and skip that case, too. In particular, the test debuginfod RPMs have paths like this. However, the test cases didn't catch this because they ha

[PATCH v4 2/7] tests/run-debuginfod-fd-prefetch-caches.sh: disable fdcache limit check

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Since commit acd9525e93d7 ("PR31265 - rework debuginfod archive-extract fdcache"), the fdcache limit is only applied when a new file is interned and it has been at least 10 seconds since the limit was last applied. This means that the fdcache can go over the limit temporarily.

[PATCH v4 4/7] debuginfod: add new table and views for seekable archives

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval In order to extract a file from a seekable archive, we need to know where in the uncompressed archive the file data starts and its size. Additionally, in order to populate the response headers, we need the file modification time (since we won't be able to get it from the archi

[PATCH v4 5/7] debuginfod: optimize extraction from seekable xz archives

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval The kernel debuginfo packages on Fedora, Debian, and Ubuntu, and many of their downstreams, are all compressed with xz in multi-threaded mode, which allows random access. We can use this to bypass the full archive extraction and dramatically speed up kernel debuginfo requests

[PATCH v4 3/7] debuginfod: factor out common code for responding from an archive

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval handle_buildid_r_match has two very similar branches where it optionally extracts a section and then creates a microhttpd response. In preparation for adding a third one, factor it out into a function. Signed-off-by: Omar Sandoval --- debuginfod/debuginfod.cxx | 213 ++

[PATCH v4 7/7] debuginfod: populate _r_seekable on request

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Since the schema change adding _r_seekable was done in a backward compatible way, seekable archives that were previously scanned will not be in _r_seekable. Whenever an archive is going to be extracted to satisfy a request, check if it is seekable. If so, populate _r_seekabl

[PATCH v4 6/7] debuginfod: populate _r_seekable on scan

2024-07-19 Thread Omar Sandoval
From: Omar Sandoval Whenever a new archive is scanned, check if it is seekable with a little liblzma magic, and populate _r_seekable if so. With this, newly scanned seekable archives will use the optimized extraction path added in the previous commit. Also add a test case using some artificial