On Wed, 2014-11-26 at 09:13 +0100, Uros Bizjak wrote: > Hello! > > > cgraph*.c and ipa-*.c use xstrdup on strings when dumping them via > > fprintf, leaking all of the duplicated buffers. > > > > Is/was there a reason for doing this? > > Yes, please see [1] and PR 53136 [2]. As said in [1]: > > "There is a problem with multiple calls of cgraph_node_name in fprintf > dumps. Please note that C++ uses caching in > cxx_printable_name_internal (aka LANG_HOOKS_DECL_PRINTABLE_NAME), so > when cxx_printable_name_internal is called multiple times from printf > (i.e. fprintf "%s/%i -> %s/%i"), it can happen that the first string > gets evicted by the second call, before fprintf is fully evaluated." > > > Taking them out fixes these leaks (seen when dumping is enabled): > > But you will get "Invalid read of size X" instead. > > The patch at [1] fixed these, but introduced memory leaks, which were > tolerable at the time: > > "I think that small memory leak is tolerable here (the changes are > exclusively in the dump code), and follows the same approach as in > java frontend." > > It seems that these assumptions are not valid anymore. > > [1] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01904.html > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53136
Aha! Thanks. I was only running under valgrind when testing the jit code, and the jit code looks like a frontend to the rest of GCC, it doesn't provide an implementation of LANG_HOOKS_DECL_PRINTABLE_NAME. I ran the rest of the gcc testsuite outside of gcc. So perhaps the duplications of the transient decl strings is still needed, and my patch would undo that fix. Oops. I'm about to disappear for a US holiday, but maybe a good solution here is to invent something like a "xstrdup_for_dump" function like this: /* When using fprintf (or similar), problems can arise with transient generated strings. Many string-generation APIs only support one result being alive at once (e.g. by returning a pointer to a statically-allocated buffer). If there is more than one generated string within one fprintf call: the first string gets evicted or overwritten by the second, before fprintf is fully evaluated. See e.g. PR/53136. This function provides a workaround for this, by providing a simple way to create copies of these transient strings, without the need to have explicit cleanup: fprintf (dumpfile, "string 1: %s string 2:%s\n", xstrdup_for_dump (EXPR_1), xstrdup_for_dump (EXPR_2)); The copied string is eventually freed, from code within toplev::finalize. */ extern const char * xstrdup_for_dump (const char *transient_str); We could then use this instead of xstrdup at these callsites, which would document what's going on. (I'm not in love with the name, but hopefully the idea is clear). This could be implemented either: (A) in terms of the "long-term allocator" idea here: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02420.html which uses a obstack on the gcc::context, eventually freeing the allocations from toplev::finalize (although that's currently only called from libgccjit.so, but we could call it from cc1 etc when running under valgrind). (B) or in terms of ggc_xstrdup, with the assumption that no GC can occur in anything within a fprintf call. That would document what's going on, ensure that we don't have use-after-free issues, and stop memory leak warnings from valgrind concerning these duplicates. Thoughts? Dave