Hi Aaron,

On Mon, 2025-06-30 at 23:12 -0400, Aaron Merey wrote:
> Signed-off-by: Aaron Merey <ame...@redhat.com>
> ---
>  doc/Makefile.am | 1 +
>  doc/elf_getdata_rawchunk.3 | 133 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 134 insertions(+)
>  create mode 100644 doc/elf_getdata_rawchunk.3
>
> diff --git a/doc/Makefile.am b/doc/Makefile.am
> index 3b63dc38..8afda0bd 100644
> --- a/doc/Makefile.am
> +++ b/doc/Makefile.am
> @@ -66,6 +66,7 @@ notrans_dist_man3_MANS= elf32_checksum.3 \
>  elf_flagshdr.3 \
>  elf_getbase.3 \
>  elf_getdata.3 \
> + elf_getdata_rawchunk.3 \
>  elf_getscn.3 \
>  elf_hash.3 \
>  elf_kind.3 \

OK.

> diff --git a/doc/elf_getdata_rawchunk.3 b/doc/elf_getdata_rawchunk.3
> new file mode 100644
> index 00000000..3b7f644f
> --- /dev/null
> +++ b/doc/elf_getdata_rawchunk.3
> @@ -0,0 +1,133 @@
> +.TH ELF_GETDATA_RAWCHUNK 3 2025-06-30 "Libelf" "Libelf Programmer's Manual"
> +
> +.SH NAME
> +elf_getdata_rawchunk \- create a raw data descriptor from a file offset
> +
> +.SH SYNOPSIS
> +.nf
> +#include <libelf.h>
> +
> +.B Elf_Data * elf_getdata_rawchunk("Elf *elf", "int64_t offset", "size_t size", "Elf_Type type");
> +.fi

OK.

> +.SH DESCRIPTION
> +The
> +.BR elf_getdata_rawchunk ()
> +function returns an
> +.B Elf_Data
> +descriptor that refers to a raw region of the ELF file, starting at the
> +given file
> +.I offset
> +and spanning
> +.I size
> +bytes. This is used to access arbitrary byte ranges in the ELF file that may
> +not correspond to any section or segment.
> +
> +If a descriptor for the same offset, size, and type has previously been requested,
> +the cached result is returned. Otherwise, the data is loaded and prepared according
> +to the conditions below.
> +
> +If the ELF descriptor is backed by a memory-mapped file and the file offset yields
> +a pointer suitably aligned for the given
> +.I type ,
> +the buffer is returned as a direct pointer into the mapped file. If alignment
> +constraints are not met, or if the file is not memory-mapped, a copy of the data is
> +allocated and returned instead. The returned buffer is always read-only and must not
> +be modified.

Yes, but both previous paragraphs kind of describe implementation
details. The important thing is that the returned Elf_Data d_buf will
be properly aligned for the requested type and that the returned
Elf_Data is owned by libelf and is valid till elf_end is called on the
corresponding Elf object.

> +If the file’s byte order matches the host, and alignment permits, the raw data is
> +returned as-is. If the file’s byte order differs, the buffer is converted into native
> +byte order.

Maybe explicitly mention it acts like elf_getdata (and not
elf_rawdata)?

> +The returned data is not associated with any section or update mechanism. It will not
> +be included in any output written by
> +.BR elf_update (3).

Nice, good to mention.

> +.SH PARAMETERS
> +.TP
> +.I elf
> +A pointer to an
> +.B Elf
> +descriptor representing the ELF file.
> +
> +.TP
> +.I offset
> +The starting file offset in bytes. This must lie within the bounds of the file.
> +
> +.TP
> +.I size
> +The number of bytes to include in the data chunk, starting at
> +.IR offset .

OK.

> +.TP
> +.I type
> +An
> +.B Elf_Type
> +enumeration value indicating the intended interpretation of the data. Values include:
> +
> +.RS
> +.TP
> +.B ELF_T_BYTE
> +Raw bytes.
> +.TP
> +.B ELF_T_ADDR, ELF_T_OFF, ELF_T_HALF, ELF_T_WORD, ELF_T_XWORD, ELF_T_SWORD, ELF_T_SXWORD
> +Integer and pointer types.
> +.TP
> +.B ELF_T_EHDR, ELF_T_SHDR, ELF_T_PHDR
> +ELF file, section, and program headers.
> +.TP
> +.B ELF_T_SYM, ELF_T_DYN, ELF_T_REL, ELF_T_RELA, ELF_T_RELR
> +Symbols and relocations.
> +.TP
> +.B ELF_T_NHDR, ELF_T_CHDR, ELF_T_NHDR8
> +ELF note headers and compressed data headers.
> +.TP
> +.B ELF_T_VDEF, ELF_T_VDAUX, ELF_T_VNEED, ELF_T_VNAUX
> +Versioning structures.
> +.TP
> +.B ELF_T_SYMINFO, ELF_T_MOVE, ELF_T_LIB
> +Other ELF metadata.
> +.TP
> +.B ELF_T_GNUHASH
> +GNU-style hash table.
> +.TP
> +.B ELF_T_AUXV
> +Auxiliary vectors.
> +.RE

Nice overview of ELF data types. But might there be a better place for
this? I would assume elf_data, but that man page is currently very bare
bones.

> +.SH RETURN VALUE
> +Returns a pointer to an
> +.B Elf_Data
> +descriptor on success.
> +
> +Returns NULL if any arguments are invalid, or if memory allocation or file reading
> +fails. The result is cached and shared between repeated calls using the same arguments.

Say something about elf_errmsg?

> +.SH SEE ALSO
> +.BR elf (3),
> +.BR elf_getdata (3),
> +.BR elf_getscn (3),
> +.BR elf_rawdata (3),
> +.BR libelf (3),
> +.BR elf (5)

OK.

> +.SH ATTRIBUTES
> +.TS
> +allbox;
> +lbx lb lb
> +l l l.
> +Interface Attribute Value
> +T{
> +.na
> +.nh
> +.BR elf_getdata_rawchunk ()
> +T} Thread safety MT-Safe
> +.TE

Speaking of thread safety...
This should be MT-Safe and documented that way. But is it really?

The code takes a read lock on elf->lock. Checks whether it can find an
existing Elf_Data_Chunk. If yes, it returns that one. Which seems fine.
But if not, it inserts the dummy chunk without actual data,
allocates/creates the data, drops the rdlock (!), takes a write lock
and overwrites the dummy chunk with the real data, drops the lock again
and returns the data.

What if at (!) another call gets the read lock first, before the
current thread can (re)take the write lock? That other thread could
find the existing dummy key already in the cache and return the dummy
data?

> +.SH REPORTING BUGS
> +Report bugs to <elfutils-devel@sourceware.org> or https://sourceware.org/bugzilla/.
> +
> +.SH HISTORY
> +.B elf_getdata_rawchunk
> +first appeared in elfutils 0.130. This function is a elfutils libelf extension and
> +may not be available in other libelf implementations.

Yes.

Thanks,

Mark