Hi Heather,

On Tue, Oct 17, 2023 at 02:57:39PM -0500, Heather McIntyre wrote:
> I see now that this is incomplete considering the other places that also
> call this function. I do agree that global locking may be heavy if 1)
> implemented in all of these locations, or 2) implemented directly in
> __libdw_dieabbrev. We could use atomics here directly in __libdw_dieabbrev.
> I have given this a try and it is currently passing all tests, including
> the new ones I added for data race detection.
> 
> I know you mentioned that taking any pthread lock at all might be a big
> overhead, but since I implemented a per dwarf struct lock, would using that
> be a possibility? Assuming multiple calls to __libdw_dieabbrev will be
> working on different dwarf objects.

I have been thinking about this issue and I think we made a mistake in
designing how a Dwarf_Die is lazily initialized. The abbrev field of a
Dwarf_Die is only set when needed by calling __libdw_dieabbrev, which
means we need some kind of locking or atomic swapping whenever we try
to use that field. I assume the idea originally was that calling
__libdw_dieabbrev is fairly "heavy" (it is, potentially reading the
whole .debug_abbrev for the CU). So we try to postpone it till it is
really needed.
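
To make that cost concrete, here is a minimal sketch of what every
access to a lazily set field ends up looking like. The lazy_* types and
functions are made up for illustration; they are not the real libdw
structs, and the C11 compare-and-swap is just one possible way to do
the "atomic swapping" mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the real libdw types.  */
struct lazy_abbrev { unsigned int code; };

struct lazy_die
{
  unsigned int code;                     /* abbrev code read from the DIE */
  _Atomic (struct lazy_abbrev *) abbrev; /* NULL until first use */
};

static struct lazy_abbrev lazy_table[] = { { 1 }, { 2 }, { 3 } };

/* Hypothetical lookup, standing in for the potentially heavy search.  */
static struct lazy_abbrev *
lazy_lookup (unsigned int code)
{
  for (size_t i = 0; i < sizeof lazy_table / sizeof lazy_table[0]; i++)
    if (lazy_table[i].code == code)
      return &lazy_table[i];
  return NULL;
}

/* Every user of the field has to go through something like this.  */
static struct lazy_abbrev *
lazy_dieabbrev (struct lazy_die *die)
{
  assert (die != NULL);
  struct lazy_abbrev *a = atomic_load (&die->abbrev);
  if (a == NULL)
    {
      struct lazy_abbrev *found = lazy_lookup (die->code);
      /* Publish only if the field is still NULL.  If another thread
         won the race, the failed exchange stores its value in 'a'.  */
      if (atomic_compare_exchange_strong (&die->abbrev, &a, found))
        a = found;
    }
  return a;
}
```

The point is not the exact primitive, but that a plain read of the
field is never enough once it can be filled in behind the caller's back.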

But in practice it is always needed. Without the abbrev field set you
can only call dwarf_dieoffset, dwarf_cuoffset, dwarf_diecu and
dwarf_getabbrev. In theory you could avoid adding the abbrev for the
initial CU DIE for a Dwarf_CU when you are iterating over all CUs and
know you don't need the CU without inspecting the initial CU DIE. But
the Dwarf_Abbrev_Hash for the Dwarf_CU will already be
initialized. And normally the abbrev for the first DIE will also be
the first abbrev, so searching for it should be really quick.

So I think setting the Dwarf_Die abbrev field lazily is not really
helpful and makes the code needlessly complex. If we set the abbrev
field when a Dwarf_Die is created we can simplify the code and don't
need all this locking when we just want to access the field.
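
Roughly, the eager variant would look like the sketch below. Again the
eager_* names are hypothetical toy types, not the real libdw code, and
the table lookup only stands in for whatever __libdw_dieabbrev does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the real libdw types.  */
struct eager_abbrev { unsigned int code; };

struct eager_die
{
  unsigned int code;                  /* abbrev code read from the DIE */
  const struct eager_abbrev *abbrev;  /* set once, at creation time */
};

static const struct eager_abbrev eager_table[] = { { 1 }, { 2 }, { 3 } };

/* Hypothetical lookup, standing in for the abbrev search.  */
static const struct eager_abbrev *
eager_lookup (unsigned int code)
{
  for (size_t i = 0; i < sizeof eager_table / sizeof eager_table[0]; i++)
    if (eager_table[i].code == code)
      return &eager_table[i];
  return NULL;
}

/* The only place that writes the abbrev field.  */
static struct eager_die
eager_die_init (unsigned int code)
{
  struct eager_die die = { .code = code, .abbrev = eager_lookup (code) };
  return die;
}

/* Readers just use the field: no lock, no atomic, no NULL-means-unset.  */
static unsigned int
eager_die_abbrev_code (const struct eager_die *die)
{
  assert (die->abbrev != NULL);
  return die->abbrev->code;
}
```

The field is written exactly once, before the DIE is handed to anyone
else, so later reads need no synchronization of their own.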

This of course is still a lot of coding: we'll have to check every
place that initializes a new Dwarf_Die, each of which will have to
call __libdw_dieabbrev directly. But I think that will not need any
extra locking because the Dwarf_Abbrev_Hash used is already
thread-safe (it was written by Srđan Milaković, also from Rice
University).

What do you think?

Cheers,

Mark
