[Bug backends/21541] New: eu-readelf --notes fails to dump PRSTATUS data?
https://sourceware.org/bugzilla/show_bug.cgi?id=21541

            Bug ID: 21541
           Summary: eu-readelf --notes fails to dump PRSTATUS data?
           Product: elfutils
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: backends
          Assignee: unassigned at sourceware dot org
          Reporter: myocytebd at sina dot com
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

Created attachment 10081
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10081&action=edit
The core file mentioned

Version: latest master
I built elfutils with --prefix set to a non-default path. Native build on x64.

Problem:
Run eu-readelf --notes core.dump
The core dump is generated from an x64 program.
A. The one I built cannot dump the details of the PRSTATUS data.
B. On another machine, the distribution-shipped elfutils-0.158 dumps the details of the PRSTATUS data for the same core dump.

I have no clue where it goes wrong. (Or is the core file broken?)
There seem to be no relevant configure options.
At runtime eu-readelf doesn't tell why it does not dump the details of PRSTATUS.

config report:
=
        elfutils: 0.169 (eu_version: 169)
=
    Prefix                             : /home/xxx/ins
    Program prefix ("eu-" recommended) : eu-
    Source code location               : .
    Maintainer mode                    :
    libebl modules subdirectory        : elfutils
    build arch                         : x86_64-unknown-linux-gnu

  RECOMMENDED FEATURES (should all be yes)
    gzip support                       : yes
    bzip2 support                      : yes
    lzma/xz support                    : yes
    libstdc++ demangle support         : yes
    File textrel check                 : yes
    Symbol versioning                  : yes

  NOT RECOMMENDED FEATURES (should all be no)
    Experimental thread safety         : no

  OTHER FEATURES
    Deterministic archives by default  : false
    Native language support            : yes

  EXTRA TEST FEATURES (used with make check)
    have bunzip2 installed (required)  : yes
    debug branch prediction            : no
    gprof support                      : no
    gcov support                       : no
    run all tests under valgrind       : no
    gcc undefined behaviour sanitizer  : no
    use rpath in tests                 : no
    test biarch                        : no

-- 
You are receiving this mail because:
You are on the CC list for the bug.
[Bug backends/21541] eu-readelf --notes fails to dump PRSTATUS data?
https://sourceware.org/bugzilla/show_bug.cgi?id=21541

Mark Wielaard changed:

           What    |Removed                     |Added
                 CC|                            |mark at klomp dot org

--- Comment #1 from Mark Wielaard ---
Since you are installing in a non-standard location, make sure PATH and LD_LIBRARY_PATH are set up correctly so that the correct versions of the binaries, libraries and backends are picked up.

In your case it should probably be:
export PATH=/home/xxx/ins/bin:$PATH
export LD_LIBRARY_PATH=/home/xxx/ins/lib:/home/xxx/ins/lib/elfutils
[Bug backends/21541] eu-readelf --notes fails to dump PRSTATUS data?
https://sourceware.org/bugzilla/show_bug.cgi?id=21541

--- Comment #2 from myocytebd at sina dot com ---
(In reply to Mark Wielaard from comment #1)
> Since you are installing in a non-standard location make sure PATH and
> LD_LIBRARY_PATH are setup correctly so the correct version of the binaries,
> libraries and backends are picked up.
>
> In your case it should probably be
> export PATH=/home/xxx/ins/bin:$PATH
> export LD_LIBRARY_PATH=/home/xxx/ins/lib:/home/xxx/ins/lib/elfutils

I checked that it is using the correct libdw/libelf. I patched the rpath of eu-readelf.

Thanks.
/home/xxx/ins/lib/elfutils => This is the problem.

I looked at libebl/eblopenbackend.c and found the openbackend() implementation surprising:
1. It doesn't try a path relative to the executable.
2. It doesn't try the path from --prefix.
3. When it fails to load, it doesn't print any message.
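The three points in the list above can be illustrated with a small lookup helper. This is only a sketch of the behaviour the reporter expected, not the code in libebl/eblopenbackend.c: the function name find_backend, the libebl_<machine>.so file naming and the idea of passing an explicit directory list are assumptions made for illustration. It tries each candidate directory in order and says which paths it tried when nothing is found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of elfutils: look for a backend module
   named libebl_<machine>.so (assumed naming) in each directory of DIRS,
   in order.  On success the full path is stored in OUT and 0 is
   returned.  On failure every attempted path is printed to stderr,
   OUT is set to the empty string and -1 is returned.  */
static int
find_backend (const char *const *dirs, size_t ndirs, const char *machine,
              char *out, size_t outlen)
{
  assert (dirs != NULL && machine != NULL && out != NULL);

  for (size_t i = 0; i < ndirs; ++i)
    {
      int n = snprintf (out, outlen, "%s/libebl_%s.so", dirs[i], machine);
      if (n < 0 || (size_t) n >= outlen)
        continue;               /* Path did not fit, skip it.  */
      if (access (out, R_OK) == 0)
        return 0;               /* Found a readable module.  */
      fprintf (stderr, "backend not found at %s\n", out);
    }

  if (outlen > 0)
    out[0] = '\0';
  return -1;
}
```

A caller would pass, for example, a directory derived from the executable location followed by the configured prefix directory such as /home/xxx/ins/lib/elfutils.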
[Bug tools/21525] Multiple .shstrtab sections since eu-readelf 0.166
https://sourceware.org/bugzilla/show_bug.cgi?id=21525

Mark Wielaard changed:

           What    |Removed                     |Added
                 CC|                            |mark at klomp dot org

--- Comment #1 from Mark Wielaard ---
Thanks so much for identifying the bad commit.
The issue is this hunk from src/strip.c:

@@ -1035,9 +1037,10 @@ handle_elf (int fd, Elf *elf, const char *prefix, const char *fname,
        }
     }

-  /* Mark the section header string table as unused, we will create
-     a new one.  */
-  shdr_info[shstrndx].idx = 0;
+  /* Although we always create a new section header string table we
+     don't explicitly mark the existing one as unused.  It can still
+     be used through a symbol table section we are keeping.  If not it
+     will already be marked as unused.  */

That comment makes a wrong assumption. We never actually mark the original .shstrtab as a possible removal candidate. So now it is always kept, whether or not there are other references to that section.

I am testing the following patch, which does mark .shstrtab as a section that could possibly be removed:

diff --git a/src/strip.c b/src/strip.c
index f747441..4015db5 100644
--- a/src/strip.c
+++ b/src/strip.c
@@ -711,11 +711,13 @@ handle_elf (int fd, Elf *elf, const char *prefix, const char *fname,
        in the sh_link or sh_info element it cannot be removed either
   */
   for (cnt = 1; cnt < shnum; ++cnt)
-    /* Check whether the section can be removed.  */
+    /* Check whether the section can be removed.  Since we will create
+       a new .shstrtab assume it will be removed too.  */
     if (remove_shdrs ? !(shdr_info[cnt].shdr.sh_flags & SHF_ALLOC)
-       : ebl_section_strip_p (ebl, ehdr, &shdr_info[cnt].shdr,
-                              shdr_info[cnt].name, remove_comment,
-                              remove_debug))
+       : (ebl_section_strip_p (ebl, ehdr, &shdr_info[cnt].shdr,
+                               shdr_info[cnt].name, remove_comment,
+                               remove_debug)
+          || cnt == ehdr->e_shstrndx))
       {
        /* For now assume this section will be removed.  */
        shdr_info[cnt].idx = 0;
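The idea behind the patch (treat the old .shstrtab as a removal candidate first, then keep it only if something still refers to it) can be shown with a much simplified model. This is a sketch under stated assumptions, not the logic of src/strip.c: the struct section type, the mark_removable function and the single "referenced" flag are hypothetical stand-ins for the real bookkeeping, which propagates references between sections over several passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the per-section bookkeeping.  */
struct section
{
  bool strippable;   /* Stand-in for the ebl_section_strip_p result.  */
  bool referenced;   /* A kept section refers to it via sh_link/sh_info.  */
  bool removed;      /* Outcome of the decision.  */
};

/* Decide which of the SHNUM sections go away.  SHSTRNDX is the index of
   the existing section header string table.  Index 0 is never touched.  */
static void
mark_removable (struct section *secs, size_t shnum, size_t shstrndx)
{
  assert (secs != NULL);

  /* First pass: assume strippable sections and the old .shstrtab go.  */
  for (size_t cnt = 1; cnt < shnum; ++cnt)
    secs[cnt].removed = secs[cnt].strippable || cnt == shstrndx;

  /* Second pass: anything a kept section still refers to must stay.  */
  for (size_t cnt = 1; cnt < shnum; ++cnt)
    if (secs[cnt].removed && secs[cnt].referenced)
      secs[cnt].removed = false;
}
```

With this ordering the old string table disappears when nothing uses it, and survives when, for instance, a kept symbol table still links to it.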
Re: How to debug broken unwinding?
On Thursday, 1 June 2017 22:57:12 CEST Milian Wolff wrote:
> Hey all,
>
> on my ArchLinux box I regularly see cases where libdw fails to unwind
> properly. I can reproduce this both with upstream perf as well as with the
> perfparser utility.
>
> How should I debug this, or how can I report a good bug report for this? I
> guess I could upload a perf archive and document the steps required to build
> perf with libdw as the unwinder, as it allows to easily compare libunwind
> and libdw for unwinding. When I then diff the output of `perf script` for
> the two unwinders for one and the same perf.data file, I see issues like
> this:
>
> $ diff -u script.libunwind script.elfutils
> --- script.libunwind 2017-06-01 22:30:52.418029474 +0200
> +++ script.elfutils2 2017-06-01 22:35:10.987823055 +0200
> @@ -510,10 +510,6 @@
>     e8ed _dl_fixup (/usr/lib/ld-2.25.so)
>    15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
>    ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
> -  608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
> -   f199 call_init.part.0 (/usr/lib/ld-2.25.so)
> -   f2a5 _dl_init (/usr/lib/ld-2.25.so)
> -    db9 _dl_start_user (/usr/lib/ld-2.25.so)
>
> NOTE: it seems as if unwinding through _GLOBAL__sub* always fails?

This part I now investigated with extensive debug output and figured out the issue:

For the last frame that works in both cases, i.e. ed94c, libdwfl says it knows that this address belongs to /usr/lib/ld-2.25.so. In reality, it belongs to libKF5KIOWidgets.so.5.35.0. Previously, perf just checked whether any module is known to libdwfl for a given address and then trusted it to do the right thing. Now I created a patch that double-checks whether the mapping known to libdwfl matches what perf knows. If not, we report the correct map (as known to perf) and this fixes the issue.
In general, I believe that libdwfl's API is lacking here. Both perf and perfparser know the exact mapping of a file, i.e. the file, its start and end address, as well as the pgoff. But the integration with dwfl simply calls dwfl_report_elf, which only takes a start address. For things like ld-2.25.so this is often not enough. Is there any chance to expand the API to let us set the explicit mapping addresses? I see there's dwfl_report_module, which at least takes a start and end address, but so far I have always failed to use it for unwinding. It seems that this function does not set up the internal ELF file, and thus all of the functions relying on that will break.

Thanks
-- 
Milian Wolff
m...@milianw.de
http://milianw.de
[Bug tools/21522] eu-strip generates empty output if there is nothing to do
https://sourceware.org/bugzilla/show_bug.cgi?id=21522

Mark Wielaard changed:

           What    |Removed                     |Added
                 CC|                            |mark at klomp dot org

--- Comment #1 from Mark Wielaard ---
We seem to never remove the output file if we created it but couldn't finish it (either because there is nothing to do or because some error occurred). We should always remove it in that case.

Testing the following:

diff --git a/src/strip.c b/src/strip.c
index f747441..c5dbc9c 100644
--- a/src/strip.c
+++ b/src/strip.c
@@ -1063,8 +1063,11 @@ handle_elf (int fd, Elf *elf, const char *prefix, const char *fname,

   /* Test whether we are doing anything at all.  */
   if (cnt == idx)
-    /* Nope, all removable sections are already gone.  */
-    goto fail_close;
+    {
+      /* Nope, all removable sections are already gone.  */
+      result = 1;
+      goto fail_close;
+    }

   /* Create the reference to the file with the debug info.  */
   if (debug_fname != NULL && !remove_shdrs)
@@ -2226,7 +2229,11 @@ cannot set access and modification date of '%s'"),

   /* Close the file descriptor if we created a new file.  */
   if (output_fname != NULL)
-    close (fd);
+    {
+      close (fd);
+      if (result != 0)
+       unlink (output_fname);
+    }

   return result;
 }
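The pattern in this patch (close the output, then unlink it whenever the result is non-zero) can be tried in isolation. The following is a minimal sketch, not code from eu-strip: the write_output function is hypothetical, and a length of zero is used as a stand-in for the "nothing to do" case.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: create PATH and write LEN bytes of DATA to it.
   No partial or empty file is left behind on failure.  A LEN of zero
   stands in for "nothing to do".  Returns 0 on success, 1 otherwise.  */
static int
write_output (const char *path, const void *data, size_t len)
{
  assert (path != NULL);
  int result = 0;

  int fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return 1;

  if (len == 0)
    result = 1;                 /* Nothing to do counts as a failure.  */
  else if (write (fd, data, len) != (ssize_t) len)
    result = 1;                 /* Short write or error.  */

  /* Close first, then remove the file if we could not finish it.  */
  close (fd);
  if (result != 0)
    unlink (path);
  return result;
}
```

Keeping the unlink next to the close means every failure path that reaches the cleanup code gets the same treatment, which is the point of setting result before jumping to fail_close in the patch.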
Re: How to debug broken unwinding?
On Thursday, 1 June 2017 22:57:12 CEST Milian Wolff wrote:
> Hey all,
> heaptrack_gui 2228 135073.400474: 613969 cycles:
>   108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
> @@ -533,8 +529,6 @@
>   2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
>   297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
>    f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
> - 1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
> -  78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
>    20439 __libc_start_main (/usr/lib/libc-2.25.so)
>    78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
>
> NOTE: this is super odd, it simply misses two frames in the middle?!

This is really quite odd. Looking at the debug output, the frames in the middle are really just skipped for some reason:

unwind: access_mem addr 0x7ffca0a88330, val 4edc50, offset 2808
unwind: access_mem addr 0x7ffca0a88338, val 7f69bfce443a, offset 2816
unwind: pc: = 0x7f69c10fecde
found map: 7f69c1007000 7f69c1766000
dso found: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0
reported: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0, 1
unwind: QGuiApplicationPrivate::init():ip = 0x7f69c10fecde (0xf7cde)

-> so far so good, this frame is properly found inside libQt5Gui, but then:

unwind: pc: = 0x7f69bfce4439
found map: 7f69bfcc4000 7f69c0069000
dso found: libc-2.25.so /usr/lib/libc-2.25.so
reported: libc-2.25.so /usr/lib/libc-2.25.so, 1
unwind: __libc_start_main:ip = 0x7f69bfce4439 (0x20439)

-> the next frame is supposedly the one in libc, but what happened to the two frames in QApplicationPrivate::init and main? I also note that no calls to access_mem are occurring. Is this maybe some (wrong) caching in libdw that interferes here?

Any insight would be appreciated, thanks!
-- 
Milian Wolff
m...@milianw.de
http://milianw.de
Re: How to debug broken unwinding?
On Friday, 2 June 2017 15:26:10 CEST Milian Wolff wrote:
> -> the next frame is supposedly the one in libc, but what happened to the
> two frames in QApplicationPrivate::init and main? I also note that no calls
> to access_mem are occurring - is this maybe some (wrong) caching in libdw or
> so that interferes here?
>
> Any insight would be appreciated, thanks!

Some more debugging and going after my gut feeling brings me to the following conclusion: the real issue seems to be the on-demand reporting of the ELF file. We used to do:

    Dwarf_Addr pc;
    bool isactivation;

    if (!dwfl_frame_pc(state, &pc, &isactivation)) {
        pr_err("%s", dwfl_errmsg(-1));
        return DWARF_CB_ABORT;
    }

    // report the module before we query for isactivation
    report_module(pc, ui);

This looks safe and fine and actually works most of the time. But passing a non-null isactivation flag to dwfl_frame_pc potentially leads to a second unwind step before we get the chance to report the module! I can work around this by instead doing:

    Dwarf_Addr pc;
    bool isactivation;

    if (!dwfl_frame_pc(state, &pc, NULL)) {
        pr_err("%s", dwfl_errmsg(-1));
        return DWARF_CB_ABORT;
    }

    // report the module before we query for isactivation
    report_module(pc, ui);

    if (!dwfl_frame_pc(state, &pc, &isactivation)) {
        pr_err("%s", dwfl_errmsg(-1));
        return DWARF_CB_ABORT;
    }

This fixes all the issues in my original email. So sorry for the noise. It doesn't seem as if the unwinding code in elfutils is broken, quite the contrary! It's just our misuse of the API that is to blame.

May I suggest, though, to move the isactivation code into a separate function to prevent this issue from arising in the future? I.e. it would be nice if the code above could read:

    Dwarf_Addr pc;
    bool isactivation;

    if (!dwfl_frame_pc(state, &pc)) {
        pr_err("%s", dwfl_errmsg(-1));
        return DWARF_CB_ABORT;
    }

    // report the module before we query for isactivation
    report_module(pc, ui);

    if (!dwfl_frame_is_activation(state)) {
        --pc;
    }

Thanks
-- 
Milian Wolff
m...@milianw.de
http://milianw.de
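For readers wondering why the proposed code decrements pc when the frame is not an activation: for such frames the pc is a return address, which points just past the call instruction, so one is subtracted before looking up the symbol or module. The sketch below shows only that arithmetic. The function name lookup_address is hypothetical and no libdwfl call is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: turn the pc of a frame into the address to use
   for symbol and module lookup.  For frames that are not activations
   the pc is a return address, so step back by one to land inside the
   calling instruction.  */
static uint64_t
lookup_address (uint64_t pc, bool isactivation)
{
  assert (isactivation || pc > 0);   /* Avoid wrapping below zero.  */
  return isactivation ? pc : pc - 1;
}
```

This matches the debug output earlier in the thread, where access_mem reads the value 7f69bfce443a from the stack and the next reported pc is 0x7f69bfce4439.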