On Tue, Apr 5, 2011 at 2:08 PM, Paul Pluzhnikov <[email protected]> wrote:
> Now all we have to do is figure out how to fix it ;-)

I see a couple of possible solutions:

1. Document the problem; ask users to call backtrace() early (before
   calling pthread_key_create too many times), so it gets one of the
   "pre-allocated" descriptors.

2. Arrange for libunwind __attribute__((constructor)) function to do
   the same, and hope that it fires early enough.

3. Switch to using __thread, figure out some (likely extremely
   non-portable) way to perform cleanup on thread termination.

All of 1, 2 and 3 are non-portable -- there is no guarantee that
pthread_key_create will not *alloc every time it is invoked, nor is
pthread* async-signal-safe.

On Tue, Apr 5, 2011 at 3:13 PM, Lassi Tuura <[email protected]> wrote:

> How far do we want to go in attempting to avoid the one calloc()? :-)
> Choices seem to be:
> a. Use __thread, require per-thread wrapper callbacks from app

In the context of e.g. malloc stack recorder, application callback is
generally not sufficient. Consider: application is about to call
pthread_exit, so calls libunwind callback, which frees per-thread cache
for current thread. The app then calls pthread_exit.

Now the fun begins: pthread_exit calls __libc_thread_freeres, which
calls free(), which calls unwinder, which reallocates per-thread trace
cache, which is then leaked.

I think the best you can do is mark per-thread cache that it will
likely become cold soon, and deallocate it some time later (effectively
turning this into B).

> b. Use lock-free global cache stack, must still free 'unused' caches.
> c. Use pthread_getspecific, deal with calloc from pthread_key_create,
> maybe require app to call some init function once at 'safe' time if
> it uses unw_backtrace?

In general, C has the same problem for a malloc stack recorder: the
very first call to backtrace() may well come from within libc-internal
call to calloc(), and attempt to call pthread_setspecific at that point
may be unsafe, and the app has not even gained execution control yet!

OTOH, for glibc this wouldn't be a problem, as pthread_setspecific will
not call calloc() before 32 TSD keys have been created.

> I guess I'd go with c, b, then a. We can call once to get the key created
> at a safe time (= initialisation for our profiler), then never need to
> worry about destructor calls and don't need per-thread callbacks. Failing
> that I think I'd prefer b over a.

I think the only completely automatic and reasonably portable solution
is B, though it *is* going to a lot of trouble for a problem we don't
really have ;-(

How about a variation of C:

4. Require the app to call e.g. libunwind_per_thread_init() from a safe
   context for each thread in which it desires fast backtrace(). This
   call will allocate trace cache and do pthread_setspecific.

   In tdep_trace(), if pthread_getspecific() returns NULL, then fall
   back to the slow unwind.

Thanks,
-- 
Paul Pluzhnikov

_______________________________________________
Libunwind-devel mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/libunwind-devel
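
For illustration, option 4 above could be sketched in C roughly as follows. This is only a sketch under assumptions, not libunwind code: libunwind_per_thread_init() is the name suggested in the proposal, while trace_cache_t, trace_sketch() (a stand-in for tdep_trace()) and the TRACE_SLOW/TRACE_FAST return values are hypothetical names made up for the example. It also inherits the caveat already noted in the thread: POSIX does not promise that pthread_getspecific is async-signal-safe.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical per-thread trace cache; the real layout is not shown here. */
typedef struct trace_cache {
    unsigned long hits;                 /* placeholder payload */
} trace_cache_t;

static pthread_key_t  trace_cache_key;
static pthread_once_t trace_cache_once = PTHREAD_ONCE_INIT;
static atomic_int     trace_cache_key_ok;   /* 1 once the key exists */

/* TSD destructor: called at thread exit for a non-NULL cache pointer. */
static void trace_cache_free(void *p)
{
    free(p);
}

static void trace_cache_key_init(void)
{
    if (pthread_key_create(&trace_cache_key, trace_cache_free) == 0)
        atomic_store(&trace_cache_key_ok, 1);
}

/* Name from the proposal. To be called once per thread from a safe
   context (not a signal handler, not from inside malloc).
   Returns 0 on success, -1 on failure. */
int libunwind_per_thread_init(void)
{
    trace_cache_t *cache;

    pthread_once(&trace_cache_once, trace_cache_key_init);
    if (!atomic_load(&trace_cache_key_ok))
        return -1;
    if (pthread_getspecific(trace_cache_key) != NULL)
        return 0;                       /* this thread is already set up */

    cache = calloc(1, sizeof *cache);
    if (cache == NULL)
        return -1;
    if (pthread_setspecific(trace_cache_key, cache) != 0) {
        free(cache);
        return -1;
    }
    return 0;
}

enum trace_path { TRACE_SLOW = 0, TRACE_FAST = 1 };

/* Hypothetical stand-in for tdep_trace(): it never allocates and never
   creates keys; it only reports which unwind path would be taken. */
enum trace_path trace_sketch(void)
{
    trace_cache_t *cache;

    if (!atomic_load(&trace_cache_key_ok))
        return TRACE_SLOW;              /* no thread has initialised yet */
    cache = pthread_getspecific(trace_cache_key);
    if (cache == NULL)
        return TRACE_SLOW;              /* this thread did not opt in */
    cache->hits++;
    return TRACE_FAST;
}
```

A thread that never calls libunwind_per_thread_init() always takes the slow path, so all allocation and pthread_setspecific work stays in the explicit init call. The pthread_exit re-entry case described earlier would still need care: once the destructor has freed the cache, a later unwind in that thread sees NULL and falls back to the slow path rather than reallocating.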
