https://sourceware.org/bugzilla/show_bug.cgi?id=24319
Bug ID: 24319
Summary: Compressed / uncompressed debug info confusion in BFD
Product: binutils
Version: 2.32
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: binutils
Assignee: unassigned at sourceware dot org
Reporter: hadrien.grasland at gmail dot com
Target Milestone: ---

A project which I work on recently started experimenting with splitting and compressing debug info. The "splitting" part works perfectly, but unfortunately the "compressing" part seems to confuse some BFD-based DWARF readers such as objdump or the perf profiler in --call-graph=dwarf mode.

These are the kinds of commands we are using to split and compress debug info:

----
$ objcopy --only-keep-debug --compress-debug-sections <source> <dest>
$ objcopy --strip-unneeded -R .comment -R .GCC.command.line -R .note.gnu.gold-version <source>
----

Now, my problem is that if I use `perf record --call-graph=dwarf` to profile a binary that links to libraries whose debug info was split like this, it quickly becomes apparent at `perf report` time that something strange is happening inside of BFD:

----
BFD: DWARF error: found dwarf version '4113', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '20817', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '65361', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '49675', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '22998', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '2259', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '23499', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '13811', this reader only handles version 2, 3, 4 and 5 information
BFD: DWARF error: found dwarf version '29659', this reader only handles version 2, 3, 4 and 5 information
[ ... abridged ... ]
----

It seems pretty clear to me that during this `perf report` run, BFD is trying to decode as a DWARF version number a piece of data which is not a DWARF version number. I'm not 100% sure what it is, but since the setup works when I split the DWARF info without compressing it, my guess is that BFD is trying to decode compressed DWARF data without uncompressing it first.

Now, perf is quite a complex beast, so I tried to narrow down the source of the problem. Given that the error comes from calls to BFD from perf, it seems to me that at least one of the following statements must be true:

1. BFD is doing something wrong.
2. Perf is using BFD incorrectly.

One way to figure out if statement #1 is true is to try to reproduce the fishy behavior with a simpler BFD-based tool like objdump. This is the track which I have been following so far.

Indeed, if I point objdump at the split debug info for one of the libraries that I am using, it already says rather strange things about its DWARF contents:

----
$ ls -lh development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg
-r--r--r-- 1 hadrien users 55M Mar 11 11:52 development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg

$ objdump --dwarf=info --dwarf-depth=1 --dwarf-check development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg

development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg:     file format elf64-x86-64

Section '.debug_info' has an invalid size: 0x651b8e4.
----

If I got my hex conversion right, that error message is talking about a .debug_info section which is about 100MB large, inside of a file which is 55MB large. I can see how that would be a problem.
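The hex conversion can be double-checked with a few lines of Python. This is only a sanity-check sketch: the section size is copied from the objdump error message, while the file size is reconstructed from the rounded `55M` that `ls -lh` prints, so the exact byte count of the file is an assumption.

```python
# Size that objdump reports for .debug_info, copied from the error message.
debug_info_size = 0x651B8E4

# File size as printed by `ls -lh` (55M). ls rounds this figure, so the
# value below is an approximation, not the exact size of the file.
approx_file_size = 55 * 1024 * 1024

size_mib = debug_info_size / (1024 * 1024)
print(f".debug_info: {debug_info_size} bytes, about {size_mib:.1f} MiB")
print("larger than the whole file:", debug_info_size > approx_file_size)
```

The reported section size works out to 106019044 bytes, so it cannot be the number of bytes that the section occupies in a 55M file.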
Since the error message mentioned a bad section size, I tried to look at the corresponding region of the ELF section table using objdump:

----
$ objdump -h development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg

development.strip-comp/Linux_x86_64/opt/lib/.debug/libG4geometry.so.dbg:     file format elf64-x86-64

Sections:
Idx Name           Size      VMA               LMA               File off  Algn
[ ... abridged ... ]
 26 .debug_aranges 0001d310  0000000000000000  0000000000000000  00000218  2**4
                   CONTENTS, READONLY, DEBUGGING
 27 .debug_info    0651b8e4  0000000000000000  0000000000000000  00006268  2**0
                   CONTENTS, READONLY, DEBUGGING
 28 .debug_abbrev  00171be7  0000000000000000  0000000000000000  027fe1c8  2**0
                   CONTENTS, READONLY, DEBUGGING
[ ... abridged ... ]
----

At least the reported size of .debug_info was consistent between the two invocations. However, I was surprised to see that file_offset(.debug_info) + size(.debug_info) >> file_offset(.debug_abbrev). I'm not an ELF / DWARF expert, but it seems to me that these sections should not overlap the way they seem to.

This led me to suspect that the reported .debug_info size was actually the size of the _uncompressed_ debug info section, something which I was able to confirm by taking a look at the section table of the original file:

----
$ objdump -h development.orig/Linux_x86_64/opt/lib/libG4geometry.so

development.orig/Linux_x86_64/opt/lib/libG4geometry.so:     file format elf64-x86-64

Sections:
Idx Name           Size      VMA               LMA               File off  Algn
[ ... abridged ... ]
 26 .debug_aranges 0001d310  0000000000000000  0000000000000000  0063e3d0  2**4
                   CONTENTS, READONLY, DEBUGGING
 27 .debug_info    0651b8e4  0000000000000000  0000000000000000  0065b6e0  2**0
                   CONTENTS, READONLY, DEBUGGING
 28 .debug_abbrev  00171be7  0000000000000000  0000000000000000  06b76fc4  2**0
                   CONTENTS, READONLY, DEBUGGING
[ ... abridged ... ]
----

Putting it all together, I get the impression that the above objdump error message is a product of objdump mixing up compressed and uncompressed section sizes in its internal calculations, and incorrectly rejecting compressed debug info because its uncompressed size is too large.

If I try to join this with my earlier perf observations, it gives me the impression that, more generally, BFD can sometimes end up consuming compressed DWARF data when it should be consuming uncompressed data, or vice versa. As far as I can tell, it is the simplest hypothesis that can account for all of the observations above.

Unfortunately, I did not manage to reproduce the above objdump behavior on small binaries that I could easily include in this bug report. So for now I cannot give you a ready-to-run debug setup; I can only tell you that there might be something fishy to investigate in this area.

Can you give me some ideas of extra things to try in order to study these problems further and find more info about them?
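For reference, the section-table arithmetic behind this hypothesis can be replayed in a few lines of Python. This is a sketch, not a diagnosis: the offsets and sizes are copied from the two `objdump -h` listings in this report, the dictionary keys and the `overrun` helper are names made up for illustration, and the final on-disk bound assumes that .debug_abbrev directly follows .debug_info in the file, which the abridged listing suggests but does not prove.

```python
# Values copied from the two `objdump -h` listings in this report (hex).
split = {"info_off": 0x00006268, "info_size": 0x0651B8E4, "abbrev_off": 0x027FE1C8}
orig = {"info_off": 0x0065B6E0, "info_size": 0x0651B8E4, "abbrev_off": 0x06B76FC4}


def overrun(sec):
    """Bytes by which .debug_info would run past the start of .debug_abbrev."""
    return sec["info_off"] + sec["info_size"] - sec["abbrev_off"]


# Original file: the reported size ends exactly where .debug_abbrev starts.
print("original overrun:", overrun(orig))

# Split + compressed file: the same reported size runs far past .debug_abbrev.
print("split overrun:", hex(overrun(split)))

# Upper bound on the bytes .debug_info can occupy on disk in the split file,
# under the assumption that .debug_abbrev comes right after it.
on_disk_bound = split["abbrev_off"] - split["info_off"]
print("on-disk upper bound:", hex(on_disk_bound))
```

In the original file the overrun is zero, which is consistent with 0x651b8e4 being the uncompressed size. In the split file the same size overshoots the next section, so the size shown there cannot be the on-disk (compressed) size.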