Re: [PATCH v4 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-23 Thread Aaron Merey
Hi Omar, On Fri, Jul 19, 2024 at 2:24 PM Omar Sandoval wrote: > > From: Omar Sandoval > > This is v4 of my patch series optimizing debuginfod for kernel > debuginfo. v1 is here [1], v2 is here [2], v3 is here [3]. The only > changes from v3 in this version are fixing a bogus maybe-uninitialize

Re: [PATCH v4 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-23 Thread Omar Sandoval
On Tue, Jul 23, 2024 at 05:47:50PM -0400, Aaron Merey wrote: > Hi Omar, > > On Fri, Jul 19, 2024 at 2:24 PM Omar Sandoval wrote: > > > > From: Omar Sandoval > > > > This is v4 of my patch series optimizing debuginfod for kernel > > debuginfo. v1 is here [1], v2 is here [2], v3 is here [3]. The

[PATCH v5 1/7] debuginfod: fix skipping source file

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval dwarf_extract_source_paths explicitly skips source files that equal "", but dwarf_filesrc may return a path like "dir/". Check for and skip that case, too. In particular, the test debuginfod RPMs have paths like this. However, the test cases didn't catch this because they ha
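
A minimal sketch of the skip rule described above, in Python rather than debuginfod's C++. It assumes that a path "like dir/" means any path ending in a slash; the helper names are hypothetical and the check in the actual patch may differ.

```python
def is_skippable_source_path(path: str) -> bool:
    """Return True for paths that name no file: "" or a bare directory like "dir/".

    Assumption: "a path like dir/" is read here as "ends with a slash".
    """
    return path == "" or path.endswith("/")


def filter_source_paths(paths):
    """Keep only the paths that could refer to an actual source file."""
    return [p for p in paths if not is_skippable_source_path(p)]
```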

[PATCH v5 0/7] debuginfod: speed up extraction from kernel debuginfo packages by 200x

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval This is v5 of my patch series optimizing debuginfod for kernel debuginfo. v1 is here [1], v2 is here [2], v3 is here [3], v4 is here [4]. The only change from v4 in this version is adding --fdcache-mbs and --fdcache-mintmp to the new test to fix some sporadic test failures.

[PATCH v5 2/7] tests/run-debuginfod-fd-prefetch-caches.sh: disable fdcache limit check

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval Since commit acd9525e93d7 ("PR31265 - rework debuginfod archive-extract fdcache"), the fdcache limit is only applied when a new file is interned and it has been at least 10 seconds since the limit was last applied. This means that the fdcache can go over the limit temporarily.
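
The behaviour described above (a limit enforced only on intern, and at most once per interval) can be illustrated with a small Python model. This is not debuginfod's fdcache code: the class, its fields, and the oldest-first eviction order are assumptions made for the example. Only the 10-second interval comes from the message.

```python
import time
from collections import OrderedDict


class RateLimitedCache:
    """Toy cache whose size limit is applied only when an entry is interned
    and at least `interval` seconds have passed since the last application."""

    def __init__(self, max_bytes, interval=10.0, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.interval = interval
        self.clock = clock              # injectable so tests can fake time
        self.entries = OrderedDict()    # name -> size, oldest first
        self.total = 0
        self.last_limit = clock()

    def intern(self, name, size):
        # Replace any existing entry so the running total stays accurate.
        self.total -= self.entries.pop(name, 0)
        self.entries[name] = size
        self.total += size
        now = self.clock()
        if now - self.last_limit >= self.interval:
            self.last_limit = now
            self._apply_limit()

    def _apply_limit(self):
        # Evict oldest entries until the cache fits (hypothetical policy).
        while self.total > self.max_bytes and self.entries:
            _, size = self.entries.popitem(last=False)
            self.total -= size
```

Between two applications of the limit, `total` can exceed `max_bytes`, which is why a test that checks the limit at an arbitrary moment can fail.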

[PATCH v5 7/7] debuginfod: populate _r_seekable on request

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval Since the schema change adding _r_seekable was done in a backward compatible way, seekable archives that were previously scanned will not be in _r_seekable. Whenever an archive is going to be extracted to satisfy a request, check if it is seekable. If so, populate _r_seekabl

[PATCH v5 3/7] debuginfod: factor out common code for responding from an archive

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval handle_buildid_r_match has two very similar branches where it optionally extracts a section and then creates a microhttpd response. In preparation for adding a third one, factor it out into a function. Signed-off-by: Omar Sandoval --- debuginfod/debuginfod.cxx | 213 ++

[PATCH v5 6/7] debuginfod: populate _r_seekable on scan

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval Whenever a new archive is scanned, check if it is seekable with a little liblzma magic, and populate _r_seekable if so. With this, newly scanned seekable archives will use the optimized extraction path added in the previous commit. Also add a test case using some artificial

[PATCH v5 4/7] debuginfod: add new table and views for seekable archives

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval In order to extract a file from a seekable archive, we need to know where in the uncompressed archive the file data starts and its size. Additionally, in order to populate the response headers, we need the file modification time (since we won't be able to get it from the archi
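
As a rough illustration of why the data offset and size need to be recorded, the Python sketch below reads one member straight out of an uncompressed archive stream. The record type and its field names are hypothetical and do not reflect the actual table schema. The modification time is stored in the record only because, as noted above, it is needed for the response headers.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SeekableEntry:
    """Hypothetical per-file record; field names are illustrative only."""
    data_offset: int   # where the file data starts in the uncompressed archive
    size: int          # length of the file data in bytes
    mtime: int         # modification time, kept for response headers


def read_entry(archive, entry):
    """Read one file's bytes from a seekable, uncompressed archive stream."""
    archive.seek(entry.data_offset)
    data = archive.read(entry.size)
    if len(data) != entry.size:
        raise EOFError("archive is shorter than the recorded offset and size")
    return data
```

For example, `read_entry(io.BytesIO(b"HEADERhello worldTRAILER"), SeekableEntry(6, 11, 0))` reads only the eleven bytes of file data without parsing the rest of the archive.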

[PATCH v5 5/7] debuginfod: optimize extraction from seekable xz archives

2024-07-23 Thread Omar Sandoval
From: Omar Sandoval The kernel debuginfo packages on Fedora, Debian, and Ubuntu, and many of their downstreams, are all compressed with xz in multi-threaded mode, which allows random access. We can use this to bypass the full archive extraction and dramatically speed up kernel debuginfo requests