On 2021-04-23 15:31, Daniel Shahaf wrote:
> Rick van der Zwet wrote on Fri, Apr 23, 2021 at 02:57:02 +0200:
>> FYI: I did some futher investigation on the specific problem within Trac
>> I was trying to fix and I am getting more hints by experimenting with
>> flags and combinations. If I compile APR (1.7.0) without debugging it
>> produces a coredump with some endless(?) recursion, see attached
>> gdb-output.txt for details. If I compile apr with
>> '--enable-pool-debug=yes', the program executes without errors,
>> to-be-continued.
I find this very confusing, when compiling with '--enable-pool-debug'
the code seems behaves differently with regards to thread locking, which
might explain the differences in behaviour:
[[[
apr-1.7.x/memory/unix/apr_pools.c:
1656 /* No matter what the creation flags say, always create
1657 * a lock. Without it integrity_check and apr_pool_num_bytes
1658 * blow up (because they traverse pools child lists that
1659 * possibly belong to another thread, in combination with
1660 * the pool having no lock). However, this might actually
1661 * hide problems like creating a child pool of a pool
1662 * belonging to another thread.
1663 */
]]]
>> Since this is most likely not subversion related I
>> think is best discussed on different mailinglist.
>
> Ack, but the backtrace does go through Subversion's swig-py bindings and
> libsvn_fs_fs, so we might be involved nevertheless. Does the problem
> reproduce in «svnlook help»? In «svnlook youngest /path/to/repo»? In
> «./fs-test» (in subversion/tests/*/)?
>
> I'm guessing frame 87035 is subversion/libsvn_fs_fs/fs.c:fs_open().
> That function uses a subpool, but I don't see why. A subpool would have
> made sense if «subpool» had been passed to a function that was
> documented to install pool cleanup handlers on its provided pool, for
> example, but none of the uses of «subpool» are as anything other than
> a called function's ordinary scratch_pool. So, could you try the
> following patch?
>
> [[[
> Index: subversion/libsvn_fs_fs/fs.c
> ===================================================================
> --- subversion/libsvn_fs_fs/fs.c (revision 1889073)
> +++ subversion/libsvn_fs_fs/fs.c (working copy)
> @@ -427,19 +427,15 @@ fs_open(svn_fs_t *fs,
> apr_pool_t *scratch_pool,
> apr_pool_t *common_pool)
> {
> - apr_pool_t *subpool = svn_pool_create(scratch_pool);
> -
> SVN_ERR(svn_fs__check_fs(fs, FALSE));
>
> SVN_ERR(initialize_fs_struct(fs));
>
> - SVN_ERR(svn_fs_fs__open(fs, path, subpool));
> + SVN_ERR(svn_fs_fs__open(fs, path, scratch_pool));
>
> - SVN_ERR(svn_fs_fs__initialize_caches(fs, subpool));
> + SVN_ERR(svn_fs_fs__initialize_caches(fs, scratch_pool));
> SVN_MUTEX__WITH_LOCK(common_pool_lock,
> - fs_serialized_init(fs, common_pool, subpool));
> -
> - svn_pool_destroy(subpool);
> + fs_serialized_init(fs, common_pool, scratch_pool));
>
> return SVN_NO_ERROR;
> }
> ]]]
>
> This passes fs-test and basic_tests.py (I didn't do a full «make
> check» run).
>
> To be clear, the scratch_pool usage in that function in HEAD is
> non-idiomatic without an obvious reason, but I don't see why it would
> result in an infinite loop.
I have tried the patch, it did not fix it.
> About which, the pool address, 0x0804542028 (note I added a leading zero
> to keep the number of nibbles even), looks suspiciously like ASCII: it's
> "\t\004T (". That _could_ be a coincidence, especially since there's no
> obvious mechanism by which the value of «subpool» could be corrupted
> between the svn_pool_create() and the svn_pool_destroy(), but still,
> a valgrind run on a pool-debug-enabled build might be a good idea.
I did try an valgrind run, how-ever this showed no issues. I found
compiling with '--enable-pool-concurrency-check' triggered a concurrency
error which seems to be related to the Trac code, I filled it over
there: https://trac.edgewall.org/ticket/13401
Kind regards,
Rick