[PATCH] debuginfod/debuginfod-client.c: use long for cache time configurations
time_t is platform dependent and some of architectures e.g. x32, riscv32, arc use 64bit time_t even while they are 32bit architectures, therefore directly using integer printf formats will not work portably. Use a plain long everywhere as the intervals are small enough that it will not be problematic. Signed-off-by: Alexander Kanavin --- debuginfod/debuginfod-client.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index 9bf97bfc..024b0954 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -135,17 +135,17 @@ struct debuginfod_client how frequently the cache should be cleaned. The file's st_mtime represents the time of last cleaning. */ static const char *cache_clean_interval_filename = "cache_clean_interval_s"; -static const time_t cache_clean_default_interval_s = 86400; /* 1 day */ +static const long cache_clean_default_interval_s = 86400; /* 1 day */ /* The cache_miss_default_s within the debuginfod cache specifies how frequently the 000-permision file should be released.*/ -static const time_t cache_miss_default_s = 600; /* 10 min */ +static const long cache_miss_default_s = 600; /* 10 min */ static const char *cache_miss_filename = "cache_miss_s"; /* The cache_max_unused_age_s file within the debuginfod cache specifies the the maximum time since last access that a file will remain in the cache. */ static const char *cache_max_unused_age_filename = "max_unused_age_s"; -static const time_t cache_default_max_unused_age_s = 604800; /* 1 week */ +static const long cache_default_max_unused_age_s = 604800; /* 1 week */ /* Location of the cache of files downloaded from debuginfods. The default parent directory is $HOME, or '/' if $HOME doesn't exist. */ -- 2.20.1
Re: [PATCH] debuginfod/debuginfod-client.c: correct string format on 32bit arches with 64bit time_t
Alright, I just sent a patch that replaces time_t with long instead. I tested it on x86, x86-x32, x86-64. Alex On Wed, 8 Dec 2021 at 16:31, Mark Wielaard wrote: > Hi Alexander, > > On Sun, 2021-12-05 at 21:45 +0100, Alexander Kanavin wrote: > > I'm not sure; the point of this patch is simply to ensure debuginfod > builds > > everywhere without errors. Making the types consistent is perhaps better > > done as a followup? > > I think the issue of the code not compiling in some environments is > because of the inconsistent use of types. So I rather fix the build > errors by fixing the used types than to simply cast the errors away. > > Cheers, > > Mark >
Re: [PATCH] debuginfod/debuginfod-client.c: use long for cache time configurations
Hi - > x32, riscv32, arc use 64bit time_t even while they are 32bit > architectures, therefore directly using integer printf formats will not > work portably. > Use a plain long everywhere as the intervals are small enough > that it will not be problematic. lgtm! - FChE
right, trying --enable-sanitizer-address on armv7l
Hi, I was trying the new --enable-sanitizer-address on our armv7l buildbot worker and it almost works as is, except for... debuginfod.cxx:3472:12: runtime error: reference binding to misaligned address 0x00561ec9 for type '', which requires 2 byte alignment 0x00561ec9: note: pointer points here 12 0a 00 2d e9 f0 4f 85 b0 00 af f8 60 df f8 bc e1 fe 44 f8 68 fa 68 00 2a 04 d0 00 f0 03 0c bc ^ And indeed removing 'right' here: diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx index 35424e47..6fad331a 100644 --- a/debuginfod/debuginfod.cxx +++ b/debuginfod/debuginfod.cxx @@ -3469,7 +3469,7 @@ database_stats_report() throw sqlite_exception(rc, "step"); obatched(clog) -<< right << setw(20) << ((const char*) sqlite3_column_text(ps_query, 0) ?: (const char*) "NULL") +<< setw(20) << ((const char*) sqlite3_column_text(ps_query, 0) ?: (const char*) "NULL") << " " << (sqlite3_column_text(ps_query, 1) ?: (const unsigned char*) "NULL") << endl; Makes everything PASS. But I don't understand why. It might be a bug in gcc/libasan (this is gcc 8.3.0 Debian 10.11 - Buster). I can try upgrading the machine to Debian 11 - Bullseye this weekend to see if that helps. Also, do we really want to right align the log here? We don't seem to align the log text anywhere else. Cheers, Mark
Re: right, trying --enable-sanitizer-address on armv7l
Hi, On Thu, 2021-12-09 at 15:23 +0100, Mark Wielaard wrote: > I was trying the new --enable-sanitizer-address on our armv7l > buildbot worker and it almost works as is, except for... I was confusing address sanitizer and the undefined sanitizer. It is the undefined sanitizer that produces the following runtime error. The new address sanitizer seems to work as expected. So I'll at least use --enable-sanitizer-address on the armv7l CI buildbot worker, but it would be nice to figure out why the undefined sanitizer is complaining about this usage of "right" and whether we really need it. > debuginfod.cxx:3472:12: runtime error: reference binding to misaligned > address 0x00561ec9 for type '', which requires 2 byte > alignment > 0x00561ec9: note: pointer points here > 12 0a 00 2d e9 f0 4f 85 b0 00 af f8 60 df f8 bc e1 fe 44 f8 68 fa > 68 00 2a 04 d0 00 f0 03 0c bc > ^ > > And indeed removing 'right' here: > > diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx > index 35424e47..6fad331a 100644 > --- a/debuginfod/debuginfod.cxx > +++ b/debuginfod/debuginfod.cxx > @@ -3469,7 +3469,7 @@ database_stats_report() > throw sqlite_exception(rc, "step"); > >obatched(clog) > -<< right << setw(20) << ((const char*) > sqlite3_column_text(ps_query, 0) ?: (const char*) "NULL") > +<< setw(20) << ((const char*) sqlite3_column_text(ps_query, 0) > ?: (const char*) "NULL") > << " " > << (sqlite3_column_text(ps_query, 1) ?: (const unsigned char*) > "NULL") > << endl; > > Makes everything PASS. > > But I don't understand why. It might be a bug in gcc/libasan (this is > gcc 8.3.0 Debian 10.11 - Buster). I can try upgrading the machine to > Debian 11 - Bullseye this weekend to see if that helps. > > Also, do we really want to right align the log here? We don't seem to > align the log text anywhere else. > > Cheers, > > Mark
Re: right, trying --enable-sanitizer-address on armv7l
Hi - > I was trying the new --enable-sanitizer-address on our armv7l buildbot > worker and it almost works as is, except for... > > debuginfod.cxx:3472:12: runtime error: reference binding to misaligned > address 0x00561ec9 for type '', which requires 2 byte > alignment > [...] > But I don't understand why. It might be a bug in gcc/libasan (this is > gcc 8.3.0 Debian 10.11 - Buster). [...] It must be a bug in gcc/libasan or something. > Also, do we really want to right align the log here? We don't seem to > align the log text anywhere else. This one just prettifies the messages because there is a sequence metrics & values being printed at startup, so it makes the numbers line up. But no great loss to drop; we export those as prometheus metrics too. - FChE
Re: patch rfc: PR28661: debuginfod thread-pool
Hi Frank, On Wed, 2021-12-08 at 22:55 -0500, Frank Ch. Eigler via Elfutils-devel wrote: > While I think this patch itself is fine, and works around the > libmicrohttpd bug that motivated it, I don't know how to test it > seriously in the testsuite. (We can certainly try few -C options for > parsing & operability.) The error edge cases only appear to occur > under very high load and task limits such as those imposed by systemd > cgroups. We have a couple of tests that try to seek the limit that they can still safely run under (e.g. run-large-elf-file.sh), but they were probably mistakes to include in the normal regression test suite. It is tricky to find the appropriate limit on a machine without hitting it and causing the test machine to fall over (briefly). If you can use ulimit -u or ulimit -T in the run-test.sh script then please use that, but that probably requires launching sub-shells and you quickly end up in shell-hell. So I would recommend to simply add a testcase that just uses no-option, -C and -C 256 (or take any arbitrary number, 1 might be an interesting corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find (or curl metric) processes doing some parallel queries works as expected. > Author: Frank Ch. Eigler > Date: Wed Dec 8 22:41:47 2021 -0500 > > PR28661: debuginfo connection thread pool support > > Add an option -C, which activates libmicrohttpd's thread-pool mode for > handling incoming http connections. Add libmicrohttpd error-logging > callback function so as to receive indication of its internal errors, > and relay counts to our metrics. Some of these internal errors tipped > us off to a microhttpd bug that thread pooling works around. Document > in debuginfod.8 page. Hand-tested against "ulimit -u NNN" shells. > > Signed-off-by: Frank Ch. Eigler > > diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog > index 21d0721ef080..125e7d816f51 100644 > --- a/debuginfod/ChangeLog > +++ b/debuginfod/ChangeLog > @@ -1,3 +1,11 @@ > +2021-12-08 Frank Ch. Eigler > + > + * debuginfod.cxx (connection_pool): New global. > + (parse_opt): Parse & check -C option to set it. > + (error_cb): New callback for libmicrohttpd error counting. > + (main): Activate MHD_OPTION_THREAD_POOL_SIZE if appropriate. > + Activate error_cb. > + > 2021-12-01 Mark Wielaard > > * debuginfod-client.c (debuginfod_query_server): Free tmp_url on > diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx > index 0d3f02978ee2..bb0c6cd65670 100644 > --- a/debuginfod/debuginfod.cxx > +++ b/debuginfod/debuginfod.cxx > @@ -352,7 +352,9 @@ static const struct argp_option options[] = > { "rescan-time", 't', "SECONDS", 0, "Number of seconds to wait between > rescans, 0=disable.", 0 }, > { "groom-time", 'g', "SECONDS", 0, "Number of seconds to wait between > database grooming, 0=disable.", 0 }, > { "maxigroom", 'G', NULL, 0, "Run a complete database groom/shrink pass > at startup.", 0 }, > - { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to > NUM.", 0 }, > + { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to > NUM, default=#CPUs.", 0 }, > + { "connection-pool", 'C', "NUM", OPTION_ARG_OPTIONAL, > + "Use webapi connection pool with NUM threads, default=unlim.", 0 }, > { "include", 'I', "REGEX", 0, "Include files matching REGEX, > default=all.", 0 }, > { "exclude", 'X', "REGEX", 0, "Exclude files matching REGEX, > default=none.", 0 }, > { "port", 'p', "NUM", 0, "HTTP port to listen on, default 8002.", 0 }, > @@ -411,6 +413,7 @@ static unsigned rescan_s = 300; > static unsigned groom_s = 86400; > static bool maxigroom = false; > static unsigned concurrency = std::thread::hardware_concurrency() ?: 1; > +static int connection_pool = 0; > static set source_paths; > static bool scan_files = false; > static map scan_archives; > @@ -557,6 +560,16 @@ parse_opt (int key, char *arg, >concurrency = (unsigned) atoi(arg); >if (concurrency < 1) concurrency = 1; >break; > +case 'C': > + if (arg) > +{ > + connection_pool = atoi(arg); > + if (connection_pool < 2) > +argp_failure(state, 1, EINVAL, "-C NUM minimum 2"); > +} > + else // arg not given > +connection_pool = std::thread::hardware_concurrency() * 2 ?: 2; > + break; Aha, forget about my -C 1 corner case above then :) > case 'I': >// NB: no problem with unconditional free here - an earlier failed > regcomp would exit program >if (passive_p) > @@ -3684,7 +3698,13 @@ sigusr2_handler (int /* sig */) > } > > > - > +static void // error logging callback from libmicrohttpd internals > +error_cb (void *arg, const char *fmt, va_list ap) > +{ > + (void) arg; > + inc_metric("error_count","libmicrohttpd",fmt); > + (void) vdprintf (STDERR_FILENO, fmt, ap); > +} > > > // A user-
Re: patch rfc: PR28661: debuginfod thread-pool
Hi - > [...] > If you can use ulimit -u or ulimit -T in the run-test.sh script then > please use that, but that probably requires launching sub-shells and > you quickly end up in shell-hell. A problem I found with that is that ulimit -u appears to be systemwide in the sense that a new process/thread will be prevented if the total systemwide processes of that user exceed this number. It's not just for accounting that particular processes' own children. (ulimit -T doesn't work on my f35 machine with bash or zsh or csh.) > So I would recommend to simply add a testcase that just uses no-option, > -C and -C 256 (or take any arbitrary number, 1 might be an interesting > corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find > (or curl metric) processes doing some parallel queries works as > expected. Yeah, I can/will do that. Well, for any of these -C settings, even thousands of concurrently issued curl jobs should succeed, just not quite as quickly. - FChE
Re: right, trying --enable-sanitizer-address on armv7l
Hi Frank, On Thu, 2021-12-09 at 10:47 -0500, Frank Ch. Eigler via Elfutils-devel wrote: > > debuginfod.cxx:3472:12: runtime error: reference binding to > > misaligned > > address 0x00561ec9 for type '', which requires 2 byte > > alignment > > [...] > > But I don't understand why. It might be a bug in gcc/libasan (this > > is > > gcc 8.3.0 Debian 10.11 - Buster). [...] > > It must be a bug in gcc/libasan or something. Yeah, I'll try to see if an gcc upgrade on the machine will help (but that has to wait a couple of days). > > Also, do we really want to right align the log here? We don't seem > > to > > align the log text anywhere else. > > This one just prettifies the messages because there is a sequence > metrics & values being printed at startup, so it makes the numbers > line up. But no great loss to drop; we export those as prometheus > metrics too. I am not proud of needing this workaround, but I did push the attached. Thanks, Mark From 7fc69582efcfb5f005f04c818a7aab76ff1090be Mon Sep 17 00:00:00 2001 From: Mark Wielaard Date: Thu, 9 Dec 2021 18:00:05 +0100 Subject: [PATCH] debuginfod: Don't format clog using 'right' or 'setw(20)'. Keep the logs just plain unformatted text. This really is a workaround for an apparent bug with gcc 8.3 -fsanitizer=undefined on arm32, which complains about the 'right' formatter: debuginfod.cxx:3472:12: runtime error: reference binding to misaligned address 0x00561ec9 for type '', which requires 2 byte alignment Signed-off-by: Mark Wielaard --- debuginfod/ChangeLog | 5 + debuginfod/debuginfod.cxx | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index 7a4840ff..df373201 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -1,3 +1,8 @@ +2021-12-09 Mark Wielaard + + * debuginfod.cxx (database_stats_report): Don't format clog + using 'right' and 'setw(20)'. + 2021-12-04 Mark Wielaard * debuginfod.cxx (main): Call debuginfod_pool_groom before exit. diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx index 35424e47..887e4f5a 100644 --- a/debuginfod/debuginfod.cxx +++ b/debuginfod/debuginfod.cxx @@ -3469,7 +3469,7 @@ database_stats_report() throw sqlite_exception(rc, "step"); obatched(clog) -<< right << setw(20) << ((const char*) sqlite3_column_text(ps_query, 0) ?: (const char*) "NULL") +<< ((const char*) sqlite3_column_text(ps_query, 0) ?: (const char*) "NULL") << " " << (sqlite3_column_text(ps_query, 1) ?: (const unsigned char*) "NULL") << endl; -- 2.18.4
New buildbot CI worker setups
Hi, I tweaked the configure options various buildbot do: https://builder.wildebeest.org/buildbot/#/builders?tags=elfutils elfutils-centos-x86_64: --enable-valgrind --enable-sanitize-undefined elfutils-debian-amd64: --enable-valgrind --enable-sanitize-undefined elfutils-debian-arm64: --enable-valgrind (--enable-sanitize-undefined slows down too much, --enable-sanitize-address is 3 times slower than --enable-valgrind) elfutils-debian-armhf: --enable-sanitize-address --enable-sanitize-undefined elfutils-debian-i386: --enable-sanitize-address --enable-sanitize-undefined elfutils-fedora-ppc64: --enable-sanitize-undefined elfutils-fedora-ppc64le: --enable-valgrind --enable-sanitize-undefined elfutils-fedora-s390x: --enable-valgrind --enable-sanitize-undefined elfutils-fedora-x86_64 does a make distcheck, which enables --enable-valgrind --enable-sanitize-undefined Hopefully that is a good mix of extra checking options. Cheers, Mark
[Bug libelf/28666] memmove() reads out-of-range in elf32_xlatetom
https://sourceware.org/bugzilla/show_bug.cgi?id=28666 Mark Wielaard changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Mark Wielaard --- Could you open a new bug report for the issue in comment #6? It is indeed a different issue, although very similar to this one. I can probably create a fix just from that backtrace, but having a reproducer would be great. Thanks for testing, pushed as: commit 809f2d70ec770d512cf6b1e70a67f5eb84c4508c Author: Mark Wielaard Date: Wed Dec 8 13:39:47 2021 +0100 libdwfl: Don't try to convert too many bytes in dwfl_link_map_report When trying to read (corrupt) phdrs from a core file we only want to read and convert the bytes we could read. Also make sure we don't try to allocate too big buffers. https://sourceware.org/bugzilla/show_bug.cgi?id=28666 Signed-off-by: Mark Wielaard -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28657] UBSan seems to report a divison-by-zero in dwfl_link_map_report
https://sourceware.org/bugzilla/show_bug.cgi?id=28657 Mark Wielaard changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Mark Wielaard --- commit c21c606602e1160c19d01e2836e23aa1a9e13432 Author: Mark Wielaard Date: Wed Dec 8 20:48:45 2021 +0100 libdwfl: Make sure we know the phdr entry size before searching phdrs. Without the program header entry size we cannot search through the phdrs. https://sourceware.org/bugzilla/show_bug.cgi?id=28657 Signed-off-by: Mark Wielaard -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libelf/28666] memmove() reads out-of-range in elf32_xlatetom
https://sourceware.org/bugzilla/show_bug.cgi?id=28666 --- Comment #8 from Evgeny Vereshchagin --- Created attachment 13840 --> https://sourceware.org/bugzilla/attachment.cgi?id=13840&action=edit File triggering an "invalid read" I've just added a file triggering that issue. ``` $ git describe elfutils-0.186-20-g98e7adf7 autoreconf -i -f ./configure --enable-maintainer-mode make -j$(nproc) V=1 $ DEBUGINFOD_URLS= LD_PRELOAD="./libelf/libelf.so ./libdw/libdw.so" valgrind --leak-check=full ./src/stack --core ../crash-51ef1bcb1fbd741ff5fde645079625a8ed871225 ==55019== Memcheck, a memory error detector ==55019== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==55019== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==55019== Command: ./src/stack --core ../crash-51ef1bcb1fbd741ff5fde645079625a8ed871225 ==55019== ==55019== Invalid read of size 8 ==55019==at 0x484B214: memmove (vg_replace_strmem.c:1382) ==55019==by 0x48585DF: memmove (string_fortified.h:36) ==55019==by 0x48585DF: elf32_xlatetom (elf32_xlatetom.c:96) ==55019==by 0x48C793A: dwfl_link_map_report (link_map.c:1013) ==55019==by 0x48C8CA5: dwfl_core_file_report@@ELFUTILS_0.158 (core-file.c:548) ==55019==by 0x402EC6: parse_opt (stack.c:595) ==55019==by 0x4C4D591: argp_parse (in /usr/lib64/libc.so.6) ==55019==by 0x4024EA: main (stack.c:695) ==55019== Address 0x6026000 is not stack'd, malloc'd or (recently) free'd ==55019== ``` -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28659] UBSan seems to complain about an "integer overflow" in dwfl_segment_report_module
https://sourceware.org/bugzilla/show_bug.cgi?id=28659 Mark Wielaard changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Mark Wielaard --- Thanks for testing, pushed as: commit b9ed67836b6f4e580927b4e8e1c8517e70a086be Author: Mark Wielaard Date: Wed Dec 8 22:20:17 2021 +0100 libdwfl: Don't trust e_shentsize in dwfl_segment_report_module When calulating the possible section header table end us the actual size of the section headers (sizeof (Elf32_Shdr) or sizeof (Elf64_Shdr)), not the ELF header e_shentsize value, which can be corrupted. This prevents a posssible overflow, but we check the shdrs_end is sane later anyway. https://sourceware.org/bugzilla/show_bug.cgi?id=28659 Signed-off-by: Mark Wielaard > it fixed one LGTM alert as well. I'm not sure if I mentioned this anywhere but > LGTM builds those reports on a daily basis and those reports can be found at > https://lgtm.com/projects/g/evverx/elfutils/alerts/?mode=tree . Hmmm. At first I thought this was pretty useful to add to our own buildbot CI setup. But it comes with a horribly proprietary license :{ "CodeQL can’t be used for automated analysis, continuous integration or continuous delivery" Sigh. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28655] There seems to be a memory leak in file_read_elf
https://sourceware.org/bugzilla/show_bug.cgi?id=28655 Mark Wielaard changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Mark Wielaard --- Thanks for testing, pushed as: commit 98e7adf70896ac179258d41c2aac38e9e91614bb Author: Mark Wielaard Date: Wed Dec 8 23:44:34 2021 +0100 libdwfl: Don't install an Elf handle in a Dwfl_Module twice dwfl_segment_report_module can be called with the same module name, start and end address twice (probably because of a corrupt core file). In that case don't override the main.elf handle if it already exists. https://sourceware.org/bugzilla/show_bug.cgi?id=28655 Signed-off-by: Mark Wielaard -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28677] New: Bad dynamic entry conversion in dwfl_link_map_report
https://sourceware.org/bugzilla/show_bug.cgi?id=28677 Bug ID: 28677 Summary: Bad dynamic entry conversion in dwfl_link_map_report Product: elfutils Version: unspecified Status: NEW Severity: normal Priority: P2 Component: libdw Assignee: unassigned at sourceware dot org Reporter: mark at klomp dot org CC: elfutils-devel at sourceware dot org Target Milestone: --- Created attachment 13841 --> https://sourceware.org/bugzilla/attachment.cgi?id=13841&action=edit Corrupt core file Reproducer from Bug 28666 $ valgrind -q eu-stack --core ./poc2 ==500881== Invalid read of size 8 ==500881==at 0x4851268: __GI_memmove (vg_replace_strmem.c:1271) ==500881==by 0x48994CB: elf32_xlatetom (in /usr/lib/aarch64-linux-gnu/libelf-0.183.so) ==500881==by 0x49232D3: ??? (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x4924977: dwfl_core_file_report (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x10AAB7: ??? (in /usr/bin/eu-stack) ==500881==by 0x4C2ABAF: group_parse (argp-parse.c:257) ==500881==by 0x4C2ABAF: parser_finalize (argp-parse.c:604) ==500881==by 0x4C2ABAF: argp_parse (argp-parse.c:922) ==500881==by 0x10A273: ??? (in /usr/bin/eu-stack) ==500881==by 0x4B6C217: (below main) (libc-start.c:308) ==500881== Address 0x4034000 is 0 bytes after the brk data segment limit 0x4034000 ==500881== ==500881== Invalid read of size 8 ==500881==at 0x4851278: __GI_memmove (vg_replace_strmem.c:1271) ==500881==by 0x48994CB: elf32_xlatetom (in /usr/lib/aarch64-linux-gnu/libelf-0.183.so) ==500881==by 0x49232D3: ??? (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x4924977: dwfl_core_file_report (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x10AAB7: ??? (in /usr/bin/eu-stack) ==500881==by 0x4C2ABAF: group_parse (argp-parse.c:257) ==500881==by 0x4C2ABAF: parser_finalize (argp-parse.c:604) ==500881==by 0x4C2ABAF: argp_parse (argp-parse.c:922) ==500881==by 0x10A273: ??? (in /usr/bin/eu-stack) ==500881==by 0x4B6C217: (below main) (libc-start.c:308) ==500881== Address 0x4034008 is 8 bytes after the brk data segment limit 0x4034000 ==500881== ==500881== Invalid read of size 8 ==500881==at 0x4851280: __GI_memmove (vg_replace_strmem.c:1271) ==500881==by 0x48994CB: elf32_xlatetom (in /usr/lib/aarch64-linux-gnu/libelf-0.183.so) ==500881==by 0x49232D3: ??? (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x4924977: dwfl_core_file_report (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x10AAB7: ??? (in /usr/bin/eu-stack) ==500881==by 0x4C2ABAF: group_parse (argp-parse.c:257) ==500881==by 0x4C2ABAF: parser_finalize (argp-parse.c:604) ==500881==by 0x4C2ABAF: argp_parse (argp-parse.c:922) ==500881==by 0x10A273: ??? (in /usr/bin/eu-stack) ==500881==by 0x4B6C217: (below main) (libc-start.c:308) ==500881== Address 0x4034010 is 16 bytes after the brk data segment limit 0x4034000 ==500881== ==500881== Invalid read of size 8 ==500881==at 0x4851288: __GI_memmove (vg_replace_strmem.c:1271) ==500881==by 0x48994CB: elf32_xlatetom (in /usr/lib/aarch64-linux-gnu/libelf-0.183.so) ==500881==by 0x49232D3: ??? (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x4924977: dwfl_core_file_report (in /usr/lib/aarch64-linux-gnu/libdw-0.183.so) ==500881==by 0x10AAB7: ??? (in /usr/bin/eu-stack) ==500881==by 0x4C2ABAF: group_parse (argp-parse.c:257) ==500881==by 0x4C2ABAF: parser_finalize (argp-parse.c:604) ==500881==by 0x4C2ABAF: argp_parse (argp-parse.c:922) ==500881==by 0x10A273: ??? (in /usr/bin/eu-stack) ==500881==by 0x4B6C217: (below main) (libc-start.c:308) ==500881== Address 0x4034018 is 24 bytes after the brk data segment limit 0x4034000 -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libelf/28666] memmove() reads out-of-range in elf32_xlatetom
https://sourceware.org/bugzilla/show_bug.cgi?id=28666 --- Comment #9 from Mark Wielaard --- (In reply to Evgeny Vereshchagin from comment #8) > Created attachment 13840 [details] > File triggering an "invalid read" > > I've just added a file triggering that issue. Thanks, I opened a new bug report for this: https://sourceware.org/bugzilla/show_bug.cgi?id=28677 -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28659] UBSan seems to complain about an "integer overflow" in dwfl_segment_report_module
https://sourceware.org/bugzilla/show_bug.cgi?id=28659 --- Comment #4 from Evgeny Vereshchagin --- > But it comes with a horribly proprietary license Unfortunately LGTM (like many other CI services) is tightly coupled with GitHub (where it can be used for automated analysis of open source projects) so in theory it should be possible to keep a read-only fork of the elfutils repository updated by a cron script run by buildbot. I'm not sure whether it's worth it though. -- You are receiving this mail because: You are on the CC list for the bug.
[PATCH] libdwfl: Don't try to convert too many dyns in dwfl_link_map_report
When trying to read (corrupt) dynamic entries from a core file we only want to read and convert the entries we could read. Also make sure we don't try to allocate too bug a buffer. Signed-off-by: Mark Wielaard --- libdwfl/ChangeLog | 6 ++ libdwfl/link_map.c | 14 +- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog index 1f593ac9..856bd335 100644 --- a/libdwfl/ChangeLog +++ b/libdwfl/ChangeLog @@ -1,3 +1,9 @@ +2021-12-09 Mark Wielaard + + * link_map.c (dwfl_link_map_report): Limit dyn_filesz malloc size + to max possible. When converting make sure we don't exceed the number + of bytes available in either in.d_buf or out.d_buf. + 2021-12-08 Mark Wielaard * dwfl_segment_report_module.c (dwfl_segment_report_module): Check diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c index 623b3062..ad93501e 100644 --- a/libdwfl/link_map.c +++ b/libdwfl/link_map.c @@ -994,6 +994,17 @@ dwfl_link_map_report (Dwfl *dwfl, const void *auxv, size_t auxv_size, if ((*memory_callback) (dwfl, dyn_segndx, &in.d_buf, &in.d_size, dyn_vaddr, dyn_filesz, memory_callback_arg)) { + size_t entsize = (elfclass == ELFCLASS32 + ? sizeof (Elf32_Dyn) : sizeof (Elf64_Dyn)); + if (unlikely (dyn_filesz > SIZE_MAX / entsize)) + { + __libdwfl_seterrno (DWFL_E_NOMEM); + return false; + } + /* We can only process as many bytes as there are in +in.d_size. The data might have been truncated. */ + if (dyn_filesz > in.d_size) + dyn_filesz = in.d_size; void *buf = malloc (dyn_filesz); Elf32_Dyn (*d32)[dyn_filesz / sizeof (Elf32_Dyn)] = buf; Elf64_Dyn (*d64)[dyn_filesz / sizeof (Elf64_Dyn)] = buf; @@ -1009,7 +1020,8 @@ dwfl_link_map_report (Dwfl *dwfl, const void *auxv, size_t auxv_size, .d_size = dyn_filesz, .d_buf = buf }; - in.d_size = out.d_size; + if (in.d_size > out.d_size) + in.d_size = out.d_size; if (likely ((elfclass == ELFCLASS32 ? elf32_xlatetom : elf64_xlatetom) (&out, &in, elfdata) != NULL)) -- 2.30.2
[Bug libdw/28677] Bad dynamic entry conversion in dwfl_link_map_report
https://sourceware.org/bugzilla/show_bug.cgi?id=28677 Mark Wielaard changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at sourceware dot org |mark at klomp dot org --- Comment #1 from Mark Wielaard --- Proposed patch: https://sourceware.org/pipermail/elfutils-devel/2021q4/004507.html -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28660] ASan seems to complain about a "heap-buffer-overflow"
https://sourceware.org/bugzilla/show_bug.cgi?id=28660 --- Comment #3 from Evgeny Vereshchagin --- Looks like it keeps popping up with all the patches applied ``` 0a2c8345 libdwfl: Don't try to convert too many dyns in dwfl_link_map_report ea8ce550 libdwfl: Don't install an Elf handle in a Dwfl_Module twice 906e0ca5 libdwfl: Don't trust e_shentsize in dwfl_segment_report_module a5dc98be libdwfl: Make sure we know the phdr entry size before searching phdrs. 8ae296dc libdwfl: Add overflow check while iterating in dwfl_segment_report_module c0dd1c35 libdwfl: Don't try to convert too many bytes in dwfl_link_map_report 5ba884a5 configure: Add --enable-sanitize-address ``` I'll attach a file triggering it once the fuzz target runs into it again -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28660] ASan seems to complain about a "heap-buffer-overflow"
https://sourceware.org/bugzilla/show_bug.cgi?id=28660 --- Comment #4 from Evgeny Vereshchagin --- Created attachment 13842 --> https://sourceware.org/bugzilla/attachment.cgi?id=13842&action=edit File triggering an "invalid read" I've just attached a file triggering the issue: ``` autoreconf -i -f ./configure --enable-maintainer-mode make -j$(nproc) V=1 DEBUGINFOD_URLS= LD_PRELOAD="./libelf/libelf.so ./libdw/libdw.so" valgrind --leak-check=full ./src/stack --core ../crash-e8e47de6a28b1be30e3a7e2f92b7c9e4f4fffa9d ==87229== Memcheck, a memory error detector ==87229== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==87229== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==87229== Command: ./src/stack --core ../crash-e8e47de6a28b1be30e3a7e2f92b7c9e4f4fffa9d ==87229== ==87229== Invalid read of size 4 ==87229==at 0x48C783F: dwfl_link_map_report (link_map.c:917) ==87229==by 0x48C8DC5: dwfl_core_file_report@@ELFUTILS_0.158 (core-file.c:548) ==87229==by 0x402EC6: parse_opt (stack.c:595) ==87229==by 0x4C4D591: argp_parse (in /usr/lib64/libc.so.6) ==87229==by 0x4024EA: main (stack.c:695) ==87229== Address 0x5029ae0 is 0 bytes after a block of size 4,096 alloc'd ==87229==at 0x484186F: malloc (vg_replace_malloc.c:381) ==87229==by 0x48C7D6B: dwfl_link_map_report (link_map.c:891) ==87229==by 0x48C8DC5: dwfl_core_file_report@@ELFUTILS_0.158 (core-file.c:548) ==87229==by 0x402EC6: parse_opt (stack.c:595) ==87229==by 0x4C4D591: argp_parse (in /usr/lib64/libc.so.6) ==87229==by 0x4024EA: main (stack.c:695) ``` -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libdw/28677] Bad dynamic entry conversion in dwfl_link_map_report
https://sourceware.org/bugzilla/show_bug.cgi?id=28677 Evgeny Vereshchagin changed: What|Removed |Added CC||evvers at ya dot ru --- Comment #2 from Evgeny Vereshchagin --- (In reply to Mark Wielaard from comment #1) > Proposed patch: > https://sourceware.org/pipermail/elfutils-devel/2021q4/004507.html With that patch applied I can no longer reproduce the issue. Thanks! -- You are receiving this mail because: You are on the CC list for the bug.
Re: patch rfc: PR28661: debuginfod thread-pool
Hi, Mark - > So I would recommend to simply add a testcase that just uses no-option, > -C and -C 256 (or take any arbitrary number, 1 might be an interesting > corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find > (or curl metric) processes doing some parallel queries works as > expected. OK, here's the current version, with a test, and with one more small improvement to the error message printer (to add timestamp & batching). Author: Frank Ch. Eigler Date: Wed Dec 8 22:41:47 2021 -0500 PR28661: debuginfo connection thread pool support Add an option -C, which activates libmicrohttpd's thread-pool mode for handling incoming http connections. Add libmicrohttpd error-logging callback function so as to receive indication of its internal errors, and relay counts to our metrics. Some of these internal errors tipped us off to a microhttpd bug that thread pooling works around. Document in debuginfod.8 page. Hand-tested against "ulimit -u NNN" shells, and with a less strenuous new test case. Signed-off-by: Frank Ch. Eigler diff --git a/NEWS b/NEWS index 490932ae4ef9..6be58866f100 100644 --- a/NEWS +++ b/NEWS @@ -1,3 +1,8 @@ +Version 0.187 after 0.186 + +debuginfod: Support -C option for connection thread pooling. + + Version 0.186 debuginfod-client: Default $DEBUGINFOD_URLS is computed from drop-in files diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index df373201d5c6..2642ef5eeaf1 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -3,6 +3,14 @@ * debuginfod.cxx (database_stats_report): Don't format clog using 'right' and 'setw(20)'. +2021-12-08 Frank Ch. Eigler + + * debuginfod.cxx (connection_pool): New global. + (parse_opt): Parse & check -C option to set it. + (error_cb): New callback for libmicrohttpd error counting. + (main): Activate MHD_OPTION_THREAD_POOL_SIZE if appropriate. + Activate error_cb. + 2021-12-04 Mark Wielaard * debuginfod.cxx (main): Call debuginfod_pool_groom before exit. diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx index 887e4f5abd44..bb8e8e102896 100644 --- a/debuginfod/debuginfod.cxx +++ b/debuginfod/debuginfod.cxx @@ -353,7 +353,9 @@ static const struct argp_option options[] = { "rescan-time", 't', "SECONDS", 0, "Number of seconds to wait between rescans, 0=disable.", 0 }, { "groom-time", 'g', "SECONDS", 0, "Number of seconds to wait between database grooming, 0=disable.", 0 }, { "maxigroom", 'G', NULL, 0, "Run a complete database groom/shrink pass at startup.", 0 }, - { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM.", 0 }, + { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM, default=#CPUs.", 0 }, + { "connection-pool", 'C', "NUM", OPTION_ARG_OPTIONAL, + "Use webapi connection pool with NUM threads, default=unlim.", 0 }, { "include", 'I', "REGEX", 0, "Include files matching REGEX, default=all.", 0 }, { "exclude", 'X', "REGEX", 0, "Exclude files matching REGEX, default=none.", 0 }, { "port", 'p', "NUM", 0, "HTTP port to listen on, default 8002.", 0 }, @@ -412,6 +414,7 @@ static unsigned rescan_s = 300; static unsigned groom_s = 86400; static bool maxigroom = false; static unsigned concurrency = std::thread::hardware_concurrency() ?: 1; +static int connection_pool = 0; static set source_paths; static bool scan_files = false; static map scan_archives; @@ -558,6 +561,16 @@ parse_opt (int key, char *arg, concurrency = (unsigned) atoi(arg); if (concurrency < 1) concurrency = 1; break; +case 'C': + if (arg) +{ + connection_pool = atoi(arg); + if (connection_pool < 2) +argp_failure(state, 1, EINVAL, "-C NUM minimum 2"); +} + else // arg not given +connection_pool = std::thread::hardware_concurrency() * 2 ?: 2; + break; case 'I': // NB: no problem with unconditional free here - an earlier failed regcomp would exit program if (passive_p) @@ -1368,6 +1381,7 @@ class libarchive_fdcache if (verbose > 3) obatched(clog) << "fdcache interned a=" << a << " b=" << b << " fd=" << fd << " mb=" << mb << " front=" << front_p << endl; + set_metrics(); } @@ -3708,7 +3722,15 @@ sigusr2_handler (int /* sig */) } - +static void // error logging callback from libmicrohttpd internals +error_cb (void *arg, const char *fmt, va_list ap) +{ + (void) arg; + inc_metric("error_count","libmicrohttpd",fmt); + char errmsg[512]; + (void) vsnprintf (errmsg, sizeof(errmsg), fmt, ap); // ok if slightly truncated + obatched(cerr) << "libmicrohttpd error: " << errmsg; // MHD_DLOG calls already include \n +} // A user-defined sqlite function, to score the sharedness of the @@ -3853,7 +3875,7 @@ main (int argc, char *argv[]) // Start httpd server threads.