Hi Aaron, On Sun, 2023-11-12 at 15:16 -0500, Aaron Merey wrote: > It is possible for segments of different shared libaries to be interleaved > in memory such that the segments of one library are located in between > non-contiguous segments of another library. > > For example, this can be seen with firefox on RHEL 7.9 where multiple > shared libraries could be mapped in between ld-2.17.so segments: > > [...] > 7f0972082000-7f09720a4000 00000000 139264 /usr/lib64/ld-2.17.so > 7f09720a4000-7f09720a5000 00000000 4096 /memfd:mozilla-ipc > (deleted) > 7f09720a5000-7f09720a7000 00000000 8192 /memfd:mozilla-ipc > (deleted) > 7f09720a7000-7f09720a9000 00000000 8192 /memfd:mozilla-ipc > (deleted) > 7f0972134000-7f0972136000 00000000 8192 > /usr/lib64/firefox/libmozwayland.so > 7f0972136000-7f0972137000 00002000 4096 > /usr/lib64/firefox/libmozwayland.so > 7f0972137000-7f0972138000 00003000 4096 > /usr/lib64/firefox/libmozwayland.so > 7f0972138000-7f0972139000 00003000 4096 > /usr/lib64/firefox/libmozwayland.so > 7f097213a000-7f0972147000 00000000 53248 > /usr/lib64/firefox/libmozsqlite3.so > 7f0972147000-7f097221e000 0000d000 880640 > /usr/lib64/firefox/libmozsqlite3.so > 7f097221e000-7f0972248000 000e4000 172032 > /usr/lib64/firefox/libmozsqlite3.so > 7f0972248000-7f0972249000 0010e000 4096 > /usr/lib64/firefox/libmozsqlite3.so > 7f0972249000-7f097224c000 0010e000 12288 > /usr/lib64/firefox/libmozsqlite3.so > 7f097224c000-7f0972250000 00111000 16384 > /usr/lib64/firefox/libmozsqlite3.so > 7f0972250000-7f0972253000 00000000 12288 > /usr/lib64/firefox/liblgpllibs.so > [...] > 7f09722a3000-7f09722a4000 00021000 4096 /usr/lib64/ld-2.17.so > 7f09722a4000-7f09722a5000 00022000 4096 /usr/lib64/ld-2.17.so > > dwfl_segment_report_module did not account for the possibility of > interleaving non-contiguous segments, resulting in premature closure > of modules as well as failing to report modules.
Nice description of the issue. > Fix this by removing segment skipping in dwfl_segment_report_module. > When dwfl_segment_report_module reported a module, it would return > the index of the segment immediately following the end address of the > current module. Since there's a chance that other modules might fall > within this address range, dwfl_segment_report_module instead returns > the index of the next segment. This makes sense. > This patch also fixes premature module closure that can occur in > dwfl_segment_report_module when interleaving non-contiguous segments > are found. Previously modules with start and end addresses that overlap > with the current segment would have their build-id compared the build-id > associated with the current segment. If there was a mismatch, that module > would be closed. Avoid closing modules in this case when mismatching > build-ids correspond to distinct modules. Nice find. > A couple caveats should be mentioned. First, start and end addresses > of reported modules cannot be assumed to contain segments from only > that module. This has always been the case however. There is dwfl_addrmodule/dwfl_addrsegment to find the module that covers a specific address. Defined in libdwfl/segment.c. I think this should handle this by checking the closes load address. But I have not tested it. Normally only kernel modules (.ko ET_REL files) have multiple segments. So it might make sense to double check none of this impacts systemtap. > Second, the testcases in this patch use a firefox corefile that is > fairly large. The .bz2 corefile is about 47M. A clean elfutils repo > is currently about 42M, so this corefile more than doubles the size of > the elfutils repo. I looked for a much smaller process with > interleaving non-contiguous shared library sections but was not able > to find one. I've included the corefile and tests in this patch but > they can be removed if we'd prefer to not approx. double the size of > the repo. I really appreciate the testcase, but it really is too big. It is 736M bunzip2ed (which would happen on any make check). I think this makes the repo and the build/check a little too heavy. Also we try to include instructions to recreate any binary test files and that isn't really possible in this case. > https://sourceware.org/bugzilla/show_bug.cgi?id=30975 > > Signed-off-by: Aaron Merey <ame...@redhat.com> > --- > libdwfl/dwfl_segment_report_module.c | 37 +++++-- > tests/Makefile.am | 5 +- > tests/run-unstrip-noncontig.sh | 155 +++++++++++++++++++++++++++ > tests/testcore-noncontig.bz2 | Bin 0 -> 49146091 bytes > 4 files changed, 184 insertions(+), 13 deletions(-) > create mode 100755 tests/run-unstrip-noncontig.sh > create mode 100644 tests/testcore-noncontig.bz2 > > diff --git a/libdwfl/dwfl_segment_report_module.c > b/libdwfl/dwfl_segment_report_module.c > index 3ef62a7d..09ee37b3 100644 > --- a/libdwfl/dwfl_segment_report_module.c > +++ b/libdwfl/dwfl_segment_report_module.c > @@ -737,17 +737,34 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const > char *name, > && invalid_elf (module->elf, module->disk_file_has_build_id, > &build_id)) > { > - elf_end (module->elf); > - close (module->fd); > - module->elf = NULL; > - module->fd = -1; > + /* If MODULE's build-id doesn't match the disk file's > + build-id, close ELF only if MODULE and ELF refer to > + different builds of files with the same name. This > + prevents premature closure of the correct ELF in cases > + where segments of a module are non-contiguous in memory. */ > + if (name != NULL && module->name[0] != '\0' > + && strcmp (basename (module->name), basename (name)) == 0) > + { > + elf_end (module->elf); > + close (module->fd); > + module->elf = NULL; > + module->fd = -1; > + } > } Nice. > - if (module->elf != NULL) > + else if (module->elf != NULL) > { > - /* Ignore this found module if it would conflict in address > - space with any already existing module of DWFL. */ > + /* This module has already been reported. */ > skip_this_module = true; > } > + else > + { > + /* Only report this module if we haven't already done so. */ > + for (Dwfl_Module *mod = dwfl->modulelist; mod != NULL; > + mod = mod->next) > + if (mod->low_addr == module_start > + && mod->high_addr == module_end) > + skip_this_module = true; > + } > } > if (skip_this_module) > goto out; OK. So now we only skip modules if we have already seen this exact module. > @@ -781,10 +798,6 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const > char *name, > } > } > > - /* Our return value now says to skip the segments contained > - within the module. */ > - ndx = addr_segndx (dwfl, segment, module_end, true); > - > /* Examine its .dynamic section to get more interesting details. > If it has DT_SONAME, we'll use that as the module name. > If it has a DT_DEBUG, then it's actually a PIE rather than a DSO. > @@ -929,6 +942,8 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const > char *name, > ndx = -1; > goto out; > } > + else > + ndx++; > > /* We have reported the module. Now let the caller decide whether we > should read the whole thing in right now. */ Right, so addr_segndx would normally return this ndx because next == true. But now we just always just report the next ndx. > diff --git a/tests/Makefile.am b/tests/Makefile.am > index 7fb8efb1..a12df1d3 100644 > --- a/tests/Makefile.am > +++ b/tests/Makefile.am > @@ -212,7 +212,7 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh > newfile test-nlist \ > $(asm_TESTS) run-disasm-bpf.sh run-low_high_pc-dw-form-indirect.sh \ > run-nvidia-extended-linemap-libdw.sh > run-nvidia-extended-linemap-readelf.sh \ > run-readelf-dw-form-indirect.sh run-strip-largealign.sh \ > - run-readelf-Dd.sh > + run-readelf-Dd.sh run-unstrip-noncontig.sh > > if !BIARCH > export ELFUTILS_DISABLE_BIARCH = 1 > @@ -632,7 +632,8 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ > run-nvidia-extended-linemap-libdw.sh > run-nvidia-extended-linemap-readelf.sh \ > testfile_nvidia_linemap.bz2 \ > testfile-largealign.o.bz2 run-strip-largealign.sh \ > - run-funcretval++11.sh > + run-funcretval++11.sh \ > + run-unstrip-noncontig.sh testcore-noncontig.bz2 > > > if USE_VALGRIND I really like to have a testcase, but only if we can find/construct something much smaller. Thanks, Mark