The section indices in the dynamic symbol table of the stripped redpanda binary you linked to are systematically off by 2:

some data symbols point to section 7 (.rela.dyn), but .rodata is section 9
function symbols point to section 13 (.eh_frame_hdr), but .text is section 15
some data symbols (not TLS) point to section 20 (.tbss), but .data.rel.ro is section 22

This can be an artifact of (broken) post-build modifications done to the binary. Ignore the stated section index and look up the section by the symbol's address instead.
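
A minimal sketch of that lookup in Python, assuming the section headers have already been read into tuples. The Section type and both function names are made up for illustration; only the .text row is taken from the readelf output quoted further down. TLS sections are skipped because .tbss takes up no address space in the image, so its address range can overlap the section that follows it (this assumes the symbol itself is not a TLS symbol):

```python
from typing import List, NamedTuple, Optional, Tuple

SHF_ALLOC = 0x2    # section occupies memory at run time
SHF_TLS = 0x400    # section holds thread-local storage


class Section(NamedTuple):
    index: int
    name: str
    sh_addr: int
    sh_offset: int
    sh_size: int
    sh_flags: int


def section_by_address(sections: List[Section], addr: int) -> Optional[Section]:
    """Return the allocated, non-TLS section whose address range contains addr."""
    for sec in sections:
        if not sec.sh_flags & SHF_ALLOC:
            continue  # not mapped, so sh_addr is meaningless
        if sec.sh_flags & SHF_TLS:
            continue  # assumption: we are resolving a non-TLS symbol
        if sec.sh_addr <= addr < sec.sh_addr + sec.sh_size:
            return sec
    return None


def resolve_section(sections: List[Section], st_shndx: int,
                    st_value: int) -> Tuple[Optional[Section], bool]:
    """Look the section up by address and flag a disagreeing st_shndx."""
    found = section_by_address(sections, st_value)
    mismatch = found is not None and found.index != st_shndx
    return found, mismatch


# Only the .text row is real (flags "AX" = 0x6); a full table would list
# every section header of the binary.
SECTIONS = [
    Section(15, ".text", 0x35E0700, 0x35DF700, 0x7A9DE55, 0x6),
]

sec, mismatch = resolve_section(SECTIONS, st_shndx=13, st_value=0x81658D0)
print(sec.name, sec.index, mismatch)  # .text 15 True
```

A mismatch like this one is a hint that the symbol table was not updated when sections were added or moved.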

Try to get an unmodified binary.

If you look closely, the .note.gnu.build-id section is at the very end.
It is normally produced at the beginning, so that it will be included in core dumps. I know that patchelf (not part of elfutils) produces such artifacts, for example when trying to modify RPATH when there is no space for the modification in the original .dynstr section.

Regards,
Henning

On 30.03.25 11:02, Hengqi Chen wrote:
Hi Mark,

Sorry for the late reply

On Wed, Mar 19, 2025 at 8:55 PM Mark Wielaard <m...@klomp.org> wrote:
Hi Hengqi,

On Tue, 2025-03-11 at 13:27 +0800, Hengqi Chen wrote:
I want to ask you a question regarding elf internals.
How do I calculate a symbol's file offset (which is what the kernel uprobe
expects) in an ELF file (executable or shared object)?
Could you point me to a description of what uprobe expects?

I only found this one:
   https://docs.kernel.org/trace/uprobetracer.html

Real-world tools use either section headers, like libbpf:
   https://github.com/libbpf/libbpf/blob/374036c9f1cdfe2a8df98d9d6a53c34fd02de14b/src/elf.c#L259-L270
or program headers, like BCC:
   https://github.com/iovisor/bcc/blob/82f9d1cb633aa3b4ebcbbc5d8b809f48d3dfa222/src/cc/bcc_syms.cc#L767-L775

Which is correct? Is there a unified way to get the file offset of a symbol?
I am not sure I understand enough of what uprobe expects to know right
now. In general it depends on the ELF file type, for ET_REL files the
st_value is relative to the associated section (unless SHN_ABS),
otherwise the associated section load address doesn't really matter
except for where the program header says it is loaded, which might be
absolute (for ET_EXEC) or dynamic (for ET_DYN). It might also depend on
whether the dynamic loader has relocated the symbol and/or section
addresses (so whether you are reading the values from memory or on
disk).

From the link above, it seems the Linux kernel expects values read
from disk (using objdump).

The binary provided here ([0]) contains the following symbol:
   
_ZN7cluster15topics_frontend13create_topicsE17fragmented_vectorINS_37custom_assignable_topic_configurationELm18446744073709551615EENSt3__16chrono10time_pointIN7seastar12lowres_clockENS5_8durationIxNS4_5ratioILl1ELl1000000000EEEEEEE

The symbol points to section #13, which is .eh_frame_hdr.
But value (virtual address) 00000000081658d0 actually belongs to section
#15 (.text):

  [15] .text             PROGBITS        00000000035e0700 35df700 7a9de55 00  AX  0   0 64

   [0]: https://github.com/libbpf/libbpf-rs/issues/1110#issuecomment-2699221802

As a result, the offset calculated from the section header differs from
the one calculated from the program header.
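
To make the arithmetic concrete, here is a small Python sketch using the numbers above. It assumes the usual translation for ET_EXEC/ET_DYN files, file offset = st_value - base address + base file offset, where the base is either a section (sh_addr, sh_offset) or a PT_LOAD segment (p_vaddr, p_offset). The function name is made up for illustration, and the values for section #13 and for the program headers are not given in this thread, so only the .text case is computed:

```python
def file_offset(st_value: int, base_addr: int, base_offset: int) -> int:
    """Translate a virtual address into a file offset.

    base_addr/base_offset are (sh_addr, sh_offset) of a section or
    (p_vaddr, p_offset) of a PT_LOAD segment that contains st_value.
    """
    return st_value - base_addr + base_offset


# Values from the readelf line for section [15] .text quoted above.
TEXT_ADDR = 0x35E0700
TEXT_OFFSET = 0x35DF700
ST_VALUE = 0x81658D0  # st_value of the create_topics symbol

print(hex(file_offset(ST_VALUE, TEXT_ADDR, TEXT_OFFSET)))  # 0x81648d0
```

The result only matches the program-header calculation if the base really contains the address. Plugging in the sh_addr and sh_offset of section #13 instead uses a different address-to-offset delta whenever that section lies in a different segment, which would explain the disagreement.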

Cheers,

Mark
