In the stripped redpanda binary you linked to, the section indices
(st_shndx) in the dynamic symbol table are systematically off by 2:
  some data symbols point to section 7 (.rela.dyn), but .rodata is section 9
  function symbols point to section 13 (.eh_frame_hdr), but .text is section 15
  some data symbols (not TLS) point to section 20 (.tbss), but .data.rel.ro is section 22
This can be an artifact of (broken) post-build modifications done to the
binary.
Ignore the stated section index and look up the section by the symbol's
address instead.
Try to get an unmodified binary.
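
A minimal sketch of that lookup, in case it helps. It only works on a list of section header fields you have already read, so nothing here is tied to a particular ELF library. The Section tuple, the function name and the demo table are my own inventions for illustration. Only the .text address and size are taken from the readelf output quoted further down in this thread; the .tbss row is made up to show why TLS NOBITS sections should be skipped.

```python
from typing import NamedTuple, Optional, Sequence

SHT_PROGBITS = 1
SHT_NOBITS = 8       # section occupies no space in the file (.bss, .tbss)
SHF_WRITE = 0x1
SHF_ALLOC = 0x2      # section occupies memory at run time
SHF_EXECINSTR = 0x4
SHF_TLS = 0x400      # section holds thread-local storage


class Section(NamedTuple):
    """Hypothetical container for the section header fields used below."""
    name: str
    sh_type: int
    sh_flags: int
    sh_addr: int
    sh_size: int


def section_index_for_address(sections: Sequence[Section], addr: int) -> Optional[int]:
    """Return the index of the allocated section whose address range contains addr."""
    for index, sec in enumerate(sections):
        if not sec.sh_flags & SHF_ALLOC:
            continue  # not mapped into memory, so sh_addr says nothing useful
        if sec.sh_flags & SHF_TLS and sec.sh_type == SHT_NOBITS:
            continue  # .tbss: its nominal range overlaps the sections after it
        if sec.sh_addr <= addr < sec.sh_addr + sec.sh_size:
            return index
    return None


# Truncated demo table. The .tbss row is made up; the .text row uses the
# address and size from the section header quoted in this thread.
demo_sections = [
    Section(".tbss", SHT_NOBITS, SHF_ALLOC | SHF_WRITE | SHF_TLS, 0x8160000, 0x10000),
    Section(".text", SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR, 0x35E0700, 0x7A9DE55),
]

# Index within the demo table (1), not the real section number (15).
print(section_index_for_address(demo_sections, 0x81658D0))
```

For TLS symbols (STT_TLS) the lookup would have to be different, since their st_value is an offset into the TLS segment rather than a virtual address.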
If you look closely, the .note.gnu.build-id section is at the very end.
It is normally placed near the beginning of the file, so that it will be
included in core dumps.
I know that patchelf (not part of elfutils) produces such artifacts, for
example when trying to modify RPATH when there is no space for the
modification in the original .dynstr section.
Regards,
Henning
On 30.03.25 11:02, Hengqi Chen wrote:
Hi Mark,
Sorry for the late reply.
On Wed, Mar 19, 2025 at 8:55 PM Mark Wielaard <m...@klomp.org> wrote:
Hi Hengqi,
On Tue, 2025-03-11 at 13:27 +0800, Hengqi Chen wrote:
I want to ask you a question regarding ELF internals.
How do I calculate a symbol's file offset (which is what the kernel uprobe
expects) in an ELF file (executable or shared object)?
Could you point me to a description of what uprobe expects?
I only found this one:
https://docs.kernel.org/trace/uprobetracer.html
Some real-world users rely on the section header, like libbpf:
https://github.com/libbpf/libbpf/blob/374036c9f1cdfe2a8df98d9d6a53c34fd02de14b/src/elf.c#L259-L270
Others rely on the program header, like BCC:
https://github.com/iovisor/bcc/blob/82f9d1cb633aa3b4ebcbbc5d8b809f48d3dfa222/src/cc/bcc_syms.cc#L767-L775
Which is correct? Is there a unified way to get the file offset of a symbol?
I am not sure I understand enough of what uprobe expects to know right
now. In general it depends on the ELF file type. For ET_REL files the
st_value is relative to the associated section (unless SHN_ABS).
Otherwise the associated section's load address doesn't really matter,
except for where the program header says it is loaded, which might be
absolute (for ET_EXEC) or dynamic (for ET_DYN). It might also depend on
whether the dynamic loader has relocated the symbol and/or section
addresses (so whether you are reading the values from memory or from
disk).
From the link above, it seems like the Linux kernel expects values read
from disk (e.g. obtained with objdump).
The binary provided here ([0]), contains the following symbol:
_ZN7cluster15topics_frontend13create_topicsE17fragmented_vectorINS_37custom_assignable_topic_configurationELm18446744073709551615EENSt3__16chrono10time_pointIN7seastar12lowres_clockENS5_8durationIxNS4_5ratioILl1ELl1000000000EEEEEEE
The symbol points to section #13, which is .eh_frame_hdr.
But its value (virtual address) 00000000081658d0 actually belongs to
section #15 (.text):
[15] .text PROGBITS 00000000035e0700 35df700
7a9de55 00 AX 0 0 64
[0]: https://github.com/libbpf/libbpf-rs/issues/1110#issuecomment-2699221802
This results in different offsets being calculated from the section
header and from the program header.
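
To make the discrepancy concrete, here is a rough sketch of the two calculations as I understand them from the linked libbpf and BCC snippets (section-based: st_value - sh_addr + sh_offset; segment-based: st_value - p_vaddr + p_offset for the PT_LOAD containing the address). The function names are mine. The symbol value and the .text address/offset are the ones quoted above; the .eh_frame_hdr and PT_LOAD numbers are hypothetical placeholders, since the real ones are not shown in this thread.

```python
from typing import Iterable, Optional, Tuple


def offset_via_section(st_value: int, sh_addr: int, sh_offset: int) -> int:
    """Section-header approach: rebase the address onto the section's file offset."""
    return st_value - sh_addr + sh_offset


def offset_via_segments(
    st_value: int, load_segments: Iterable[Tuple[int, int, int]]
) -> Optional[int]:
    """Program-header approach: use the PT_LOAD segment that contains the address.

    Each segment is given as (p_vaddr, p_offset, p_filesz).
    """
    for p_vaddr, p_offset, p_filesz in load_segments:
        if p_vaddr <= st_value < p_vaddr + p_filesz:
            return st_value - p_vaddr + p_offset
    return None


ST_VALUE = 0x81658D0                           # symbol value quoted above
TEXT_ADDR, TEXT_OFFSET = 0x35E0700, 0x35DF700  # .text, from the readelf output

# Using the section that actually contains the address (.text).
right = offset_via_section(ST_VALUE, TEXT_ADDR, TEXT_OFFSET)

# Using the section named by the bogus st_shndx. HYPOTHETICAL numbers for
# .eh_frame_hdr: any section whose (sh_addr - sh_offset) differs from that
# of .text yields a different, wrong result.
wrong = offset_via_section(ST_VALUE, 0x2000, 0x2000)

# HYPOTHETICAL executable PT_LOAD segment covering .text.
segments = [(0x35E0000, 0x35DF000, 0x7A9F000)]
via_phdr = offset_via_segments(ST_VALUE, segments)

print(hex(right), hex(wrong), hex(via_phdr))
```

With these inputs the .text-based and segment-based results agree (0x81648d0), while the one derived from the wrong section does not. The segment-based variant never consults st_shndx, so it is unaffected by the off-by-2 indices.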
Cheers,
Mark