On Tuesday, 26 July 2022 17:28:11 CEST Mark Wielaard wrote:
> Hi Milian,
> 
> On Mon, 2022-07-11 at 18:40 +0200, Milian Wolff wrote:
> > in heaptrack I have code to runtime attach to a program and then rewrite the
> > various rel / rela / jmprel tables to intercept calls to malloc & friends.
> > 
> > This works, but now I have received a crash report for what seems to be an
> > invalid DSO file: The jmprel table contains an invalid entry which points to
> > an out-of-bounds symbol, leading to a crash when we try to look at the
> > symbol's name.
> > 
> > I would like to protect against this crash by detecting the invalid symbols.
> > But to do that, I would need to know the size of the symbol table, which is
> > much harder than I would have hoped:
> > 
> > We have:
> > 
> > ```
> > #define DT_SYMTAB   6               /* Address of symbol table */
> > #define DT_SYMENT   11              /* Size of one symbol table entry */
> > ```
> > 
> > But there is no `DT_SYMSZ` or similar, which we would need to validate symbol
> > indices. Am I overlooking something or is that really missing? Does anyone
> > know why? The other tables have that, e.g.:
> > 
> > ```
> > #define DT_PLTRELSZ 2               /* Size in bytes of PLT relocs */
> > #define DT_RELASZ   8               /* Total size of Rela relocs */
> > #define DT_STRSZ    10              /* Size of string table */
> > #define DT_RELSZ    18              /* Total size of Rel relocs */
> > ```
> > 
> > Why is this missing for the symtab?
> > 
> > The only viable alternative seems to be to mmap the file completely to access
> > the Elf header and then iterate over the Elf sections to query the size of the
> > SHT_DYNSYM section. This is pretty complicated, and costly. Does anyone have a
> > better solution that would allow me to validate symbol indices?
> 
> I don't know why it is missing, but it is indeed a tricky issue. You
> really want to know the number of elements (or the size) of the symbol
> table, but it takes a little gymnastics to get that.

Thanks for confirming that this isn't currently available. Would it be
possible to add it? What is the process for standardization here? I guess it
would take a very long time, but it seems like it would be beneficial in the
long term.
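
For context, one workaround that is commonly used at runtime is to derive the number of dynamic symbols from the hash tables that are mapped alongside the symbol table: in a `DT_HASH` table, `nchain` equals the number of symbol table entries, and in a `DT_GNU_HASH` table the count can be recovered by walking the chain of the highest-numbered bucket. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the hash words are assumed to be 32 bits wide (which does not hold for `DT_HASH` on every architecture), and the table contents are trusted, so a DSO whose hash table is itself corrupt would still need additional bounds checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Symbol count according to a SysV hash table (the address stored in DT_HASH).
   Layout: nbucket, nchain, bucket[nbucket], chain[nchain].
   nchain is defined to equal the number of symbol table entries.
   Assumption: 32-bit hash words. */
size_t symcount_from_sysv_hash(const uint32_t *hash)
{
    return hash[1];
}

/* Symbol count according to a GNU hash table (the address stored in
   DT_GNU_HASH). bloom_word_size is the size of one bloom filter word:
   4 for ELFCLASS32 and 8 for ELFCLASS64.
   Layout: nbuckets, symoffset, bloom_size, bloom_shift,
           bloom[bloom_size], buckets[nbuckets], chain[]. */
size_t symcount_from_gnu_hash(const uint32_t *hash, size_t bloom_word_size)
{
    const uint32_t nbuckets = hash[0];
    const uint32_t symoffset = hash[1];
    const size_t bloom_words = (size_t)hash[2] * (bloom_word_size / sizeof(uint32_t));
    const uint32_t *buckets = hash + 4 + bloom_words;
    const uint32_t *chain = buckets + nbuckets;

    /* Find the bucket that starts at the highest symbol index. */
    uint32_t last = 0;
    for (uint32_t i = 0; i < nbuckets; ++i) {
        if (buckets[i] > last)
            last = buckets[i];
    }

    /* No hashed symbols: only the unhashed ones below symoffset exist. */
    if (last < symoffset)
        return symoffset;

    /* Walk that bucket's chain; the low bit marks the final entry. */
    while ((chain[last - symoffset] & 1) == 0)
        ++last;

    return (size_t)last + 1;
}
```

With a count obtained this way, a relocation's symbol index could be compared against it before the symbol name is dereferenced. Whether both hash tables are present depends on how the DSO was linked, so a real implementation would have to handle either one being absent.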

> Di Chen recently
> (or actually not that recently, I just still haven't reviewed, sorry!)
> posted a patch for
> https://sourceware.org/bugzilla/show_bug.cgi?id=28873 to print out the
> symbols from the dynamic segment
> https://sourceware.org/pipermail/elfutils-devel/2022q2/005086.html

Interesting. But from what I can tell, this patch has access to the full Elf 
object and thus can access segments which are not normally loaded at runtime?
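
To make the file-based alternative concrete, here is a rough sketch of querying the `SHT_DYNSYM` section size from a fully mapped file. It only illustrates the approach mentioned earlier in the thread and is not how the patch does it. Assumptions: the function name is hypothetical, only `ELFCLASS64` files in host byte order are handled, extended section numbering is ignored, and the file still carries its section headers (they are not needed at runtime and can be stripped).

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the number of entries in the SHT_DYNSYM section of a 64-bit ELF
   file mapped at `image` (`size` bytes), or 0 if none can be determined. */
size_t dynsym_count_from_image(const unsigned char *image, size_t size)
{
    Elf64_Ehdr ehdr;
    if (size < sizeof ehdr)
        return 0;
    memcpy(&ehdr, image, sizeof ehdr);

    /* Only accept 64-bit ELF files with the expected section header size. */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0
        || ehdr.e_ident[EI_CLASS] != ELFCLASS64
        || ehdr.e_shentsize != sizeof(Elf64_Shdr))
        return 0;

    /* The whole section header table must lie inside the mapping. */
    if (ehdr.e_shoff > size
        || (size - ehdr.e_shoff) / sizeof(Elf64_Shdr) < ehdr.e_shnum)
        return 0;

    for (size_t i = 0; i < ehdr.e_shnum; ++i) {
        Elf64_Shdr shdr;
        memcpy(&shdr, image + ehdr.e_shoff + i * sizeof shdr, sizeof shdr);
        if (shdr.sh_type == SHT_DYNSYM && shdr.sh_entsize != 0)
            return shdr.sh_size / shdr.sh_entsize;
    }
    return 0;
}
```

The copies via `memcpy` avoid unaligned reads from the mapping. The cost concern raised earlier still applies, since the section header table usually sits at the end of the file and is not part of any loaded segment.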

> > PS: eu-elflint reports this for the broken DSOs e.g.:
> > ```
> > $ eu-elflint libQt5Qml.so.5.12
> > section [ 3] '.dynsym': symbol 1272: st_value out of bounds
> > section [ 3] '.dynsym': symbol 3684: st_value out of bounds
> > section [29] '.symtab': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got section size 18340
> > section [29] '.symtab': _DYNAMIC symbol size 0 does not match dynamic segment size 336
> > section [29] '.symtab': symbol 25720: st_value out of bounds
> > section [29] '.symtab': symbol 27227: st_value out of bounds
> > ```
> > 
> > Does anyone know how this can happen? Is this a bug in the toolchain?
> 
> Try with eu-elflint --gnu which suppresses some known issues.

Indeed, with `--gnu` the tool reports `No errors`.

> Also could you show those symbol values (1272, 3684, 25720, 27227) they
> might have a special type, so their st_value isn't really an address?

```
$ eu-readelf -s libQt5Qml.so.5.12.0 | grep -E "^\s*(1272|3684|25720|27227):"
 1272: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__@@Qt_5
 3684: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start@@Qt_5
 1272: 003ccc4c      0 NOTYPE  LOCAL  DEFAULT       17 $d
 3684: 003cbfec      0 NOTYPE  LOCAL  DEFAULT       17 $d
25720: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start
27227: 003f9974      0 NOTYPE  GLOBAL DEFAULT       25 __bss_start__
```

The first two matches come from the `.dynsym`, the last four come from 
`.symtab`.

Can anyone tell me how `eu-readelf` resolves these symbol names?

Thanks

-- 
Milian Wolff
m...@milianw.de
http://milianw.de
