Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-08-25 Thread Mark Wielaard
Hi Jonathon,

On Sat, 2019-08-24 at 20:10 -0500, Jonathon Anderson wrote:
> On Sat, Aug 24, 2019 at 6:24 PM, Mark Wielaard  wrote:
> > But what if realloc moves the block?
> > Then all dbg->mem_tails[thread_id] pointers become invalid.
> > And this function drops the lock before returning a libdw_memblock*.
> 
> Not quite, dbg->mem_tails is an array of pointers (note the ** in its 
> definition, and the use of the plural "tails"). So while the 
> dbg->mem_tails pointer becomes invalid, the dbg->mem_tails[thread_id] 
> pointers don't.
> 
> It would be a different story if dbg->mem_tails was an array of 
> libdw_memblocks, but its not.

Aha. Yes, they are pointers to the mem_blocks, not the mem_blocks
themselves. The pointer values are moved, but never changed. So this is
indeed fine. I was confused.

>  That would increase the "memory leak" 
> issue mentioned previously (to ~4K per dead thread) and require more 
> work on the part of the reallocing thread to initialize the new entries 
> (which at the moment should reduce to a memset, assuming the compiler 
> is smart enough and NULL == 0).
> 
> > 
> > So I think the lock needs to extend beyond this function somehow.  Or
> > we need to prevent another thread reallocing while we are dealing with
> > the assigned memblock.
> 
> No need, in fact we want to drop the lock as soon as possible to let 
> new threads in for realloc's. Under the assumption that we don't need 
> to allocate new blocks (extremely) often, the extra cost to relock when 
> we do should be relatively small. Also see __libdw_allocate.

Yes, now that I have a better (correct) mental picture of the data
structure, this makes total sense.

> As mentioned in other mails, this management scheme isn't (always) 
> optimally memory efficient, but its speed is good under parallel 
> stress. Far better than wrapping the whole thing with a single 
> pthread_mutex_t.

I wouldn't tweak it too much if you know it is working correctly now.
We do have to make sure it doesn't slow down the single-threaded use
case (since most programs still are single threaded). There is the new
locking, which slows things down. But the memory use should be about
the same since you don't duplicate the mem_blocks till there is access
from multiple threads.

If there isn't much parallel stress or allocation in practice. That is,
even in multi-threaded programs, there is still just one thread that
does most/all of the allocations. Then you could maybe optimize a bit
by having one "owner" thread for a Dwarf. That would be the first that
hits __libdw_alloc_tail. For that "owner thread" you then setup an
owner thread_id field, and have one special mem_block allocated. You
only expand the mem_blocks once another thread does an allocation. Then
the memory use in multi-threaded programs, where only one thread
handles a Dwarf object (or happens to allocate) would be the same as
for single-threaded programs.

But this depends on the usage pattern. Which might vary very between
programs. So don't try it, till you know it actually helps for a real
world program. And the extra memory use, might not really matter that
much in practice anyway.

Cheers,

Mark


Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()

2019-08-25 Thread Jonathon Anderson

Hello Mark,

Next iteration of the patch(es) are ready, hosted on Github (git 
request-pull output below). I think most of the issues we've discussed 
have been addressed; the setup with stdatomic.h is a little WIP, I 
wasn't certain how it should be done moving forward (and, I'm not all 
that familiar with Autotools). The one extra line in dwarf_getcfi.c is 
to account for a failure in the test suite when Valgrind is enabled 
(didn't catch it the first time around).


I like the idea of "owner" threads (and I think it should be simple 
enough to implement), but I'd like to do some performance comparisons 
against the pthread_key_* option before adding too much more complexity.


-Jonathon

The following changes since commit 
37fa94df1554aca83ec10ce50bc9bcb6957b204e:


 config/elfutils.spec.in: package eu-elfclassify (2019-08-15 09:17:41 
+0200)


are available in the Git repository at:

 https://github.com/blue42u/elfutils.git

for you to fetch changes up to 9e40e0b8bb329692b1140e99896164bcb7f791b8:

 lib + libdw: Add and use a concurrent version of the dynamic-size 
hash table. (2019-08-25 18:36:38 -0500)



Jonathon Anderson (3):
 Add configure options for Valgrind annotations.
 Add some supporting framework for C11-style atomics.
 libdw: Rewrite the memory handler to be thread-safe.

Srđan Milaković (1):
 lib + libdw: Add and use a concurrent version of the dynamic-size 
hash table.


ChangeLog|   5 +
configure.ac |  42 
lib/ChangeLog|  11 +
lib/Makefile.am  |   5 +-
lib/atomics.h|  37 +++
lib/dynamicsizehash_concurrent.c | 522 
+++

lib/dynamicsizehash_concurrent.h | 118 +
lib/stdatomic-fbsd.h | 442 
+

libdw/ChangeLog  |  12 +
libdw/Makefile.am|   4 +-
libdw/dwarf_abbrev_hash.c|   2 +-
libdw/dwarf_abbrev_hash.h|   2 +-
libdw/dwarf_begin_elf.c  |  13 +-
libdw/dwarf_end.c|  22 +-
libdw/dwarf_getcfi.c |   1 +
libdw/libdwP.h   |  19 +-
libdw/libdw_alloc.c  |  70 +-
17 files changed, 1300 insertions(+), 27 deletions(-)
create mode 100644 lib/atomics.h
create mode 100644 lib/dynamicsizehash_concurrent.c
create mode 100644 lib/dynamicsizehash_concurrent.h
create mode 100644 lib/stdatomic-fbsd.h