Hi Aleksei,

On Fri, Nov 17, 2023 at 10:35:40PM +0000, vvv...@google.com wrote:
> When archive is processed in process_archive (libdwfl/offline.c), it
> creates an Elf object for each archive member. Then in
> process_archive_member it calls process_file to create a Dwfl_Module
> through __libdwfl_report_elf.
> 
> The ownership of the Elf object is expected to be:
> 
> * either transfered to the Dwfl_Module, if __libdwfl_report_elf returns
>   not NULL;
> 
> * or handled at the end of process_archive_member by calling elf_end.
> 
> Moreover, Elf object is expected to be alive, if __libdwfl_report_elf
> returns not NULL, because at the end of process_archive_member it
> advances to the next member through the elf_next call.
> 
> The problem happens when __libdwfl_report_elf encounters Elf with the
> same name and content as it seen before. In that case dwfl_report_module
> will reuse existing Dwfl_Module object. This leads to a codepath that
> calls elf_end on the Elf object, while returning not NULL, breaking the
> elf_next call to the next member.
> 
> The fix is to destroy m->main.elf instead and put the new Elf object in
> the already existing Dwfl_Module.
> 
>     * libdwfl/dwfl_report_elf.c (__libdwfl_report_elf): Replace Elf in
>       the Dwfl_Module in case of overlapping or duplicate modules to
>       prolong its lifetime for subsequent processing.

Thanks for that analysis and proposed solution. The ownership
reasoning makes sense. But I do have one question.

> diff --git a/libdwfl/dwfl_report_elf.c b/libdwfl/dwfl_report_elf.c
> index 581f4079..58b06aea 100644
> --- a/libdwfl/dwfl_report_elf.c
> +++ b/libdwfl/dwfl_report_elf.c
> @@ -276,7 +276,8 @@ __libdwfl_report_elf (Dwfl *dwfl, const char *name, const 
> char *file_name,
>       }
>        else
>       {
> -       elf_end (elf);
> +       elf_end (m->main.elf);
> +       m->main.elf = elf;
>         if (m->main_bias != bias
>             || m->main.vaddr != vaddr || m->main.address_sync != address_sync)
>           goto overlap;

If we goto overlap here don't we still have a problem? overlap will
set m->gc = true; and return NULL. So the caller will think they
still owns the elf handle and will probably close it. But then when
the module is GCed in dwfl_report_end it will close the elf handle
again.

Should we instead move the elf_end and reassignment of main.elf to
after this if statement?

Thanks,

Mark

Reply via email to