Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:
> I have had a developer here create a build of the latest SVN code
> with your changes you mentioned in r1294470 for the svnadmin verify
Okay, that's great news, for two reasons:
1. It means building svn on Windows isn't as painful as it used to be :)
2. It means I can ask you to build a custom server with the 'inprocess'
cache disabled, or (if all else fails) to bisect, per my previous email.
One of the things you could try is to disable caching: simply modify
the function create_cache() in libsvn_fs_fs/caching.c to always return
NULL in *CACHE_P. See below for another suggestion.
> command. We have run 'svnadmin verify' against every revision of our
> hotcopy of our repository taken when we first brought this issue to
> the forums and are now tracking down each of the revisions to see
> what actions were being done at those times.
>
Thanks! I do hope this work enables us to pinpoint and fix the bug.
> From the results, we see 25 error messages for predecessor count is
> wrong and the first one appeared on January 26, 2011. Near that time
> the following events occurred:
> Jan. 14, 2011 - svn upgraded from 1.6.6 to 1.6.15
> Jan. 14, 2011 - Apache HTTP server upgraded from 2.2.15 to 2.2.17
> Jan. 21, 2011 - repository was pruned to delete some binary files.
>
> Between January and our upgrade in Dec. to 1.7.1, we have had about
> 14,000 revisions and seen only 25 instances of this node revision
> issue. During the times we had these errors, we were using svn
> versions 1.6.15 and 1.6.16.
>
Thanks, very valuable information.
I've reviewed the 1.6.6->1.6.15 diff, and I have the following
suggestions:
- Change subversion/libsvn_fs_fs/fs.h such that
SVN_FS_FS__USE_LOCK_MUTEX is set to 1. It was set to 1 in 1.6.6
but to 0 in 1.6.15.
(This wouldn't explain why ASF saw it, but it might explain why you're
seeing it.)
> Fail2ban from what I could find does not look like it has a Windows
> port which I currently have my production environment hosted on.
>
Yeah, sorry. But you can write a cron job -- I mean, a Scheduled Task
-- that greps your error logs for "160004" every night and mails you if
it found anything, right?
That's the error code to watch for; it covers many FS error conditions:
% ./tools/dev/which-error.py E160004
00160004 SVN_ERR_FS_CORRUPT
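Since plain grep isn't available on Windows out of the box, here is a minimal sketch of that nightly check in Python. It assumes Python is installed on the server. The names find_matches and scan_file are made up for illustration, and the path to your error log is something you would supply yourself. It only reports matching lines; mailing the result is left to whatever the Scheduled Task runs afterwards.

```python
ERROR_CODE = "160004"  # SVN_ERR_FS_CORRUPT, per which-error.py


def find_matches(lines, code=ERROR_CODE):
    """Return (line_number, text) pairs for every line containing the code."""
    return [
        (number, line.rstrip("\r\n"))
        for number, line in enumerate(lines, 1)
        if code in line
    ]


def scan_file(path, code=ERROR_CODE):
    """Scan one log file (hypothetical path) and return its matching lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as log:
        return find_matches(log, code)
```

A wrapper script run by the Scheduled Task could call scan_file() on the server's error log and send mail (or just write a report file) whenever the returned list is non-empty.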
> Thanks.
>
> Jason
For convenience I'm attaching a patch that implements both of my
suggestions. Please let us know if it has any effect.
Cheers,
Daniel
Index: subversion/libsvn_fs_fs/fs.h
===================================================================
--- subversion/libsvn_fs_fs/fs.h (revision 1295418)
+++ subversion/libsvn_fs_fs/fs.h (working copy)
@@ -161,7 +161,7 @@ typedef struct fs_fs_shared_txn_data_t
per file. On Windows apr implements the locking as per file handle
locks, so we don't have to add our own mutex for just in-process
synchronization. */
-#if APR_HAS_THREADS && !defined(WIN32)
+#if APR_HAS_THREADS /* disabled: and !defined(WIN32) */
#define SVN_FS_FS__USE_LOCK_MUTEX 1
#else
#define SVN_FS_FS__USE_LOCK_MUTEX 0
Index: subversion/libsvn_fs_fs/caching.c
===================================================================
--- subversion/libsvn_fs_fs/caching.c (revision 1295418)
+++ subversion/libsvn_fs_fs/caching.c (working copy)
@@ -209,6 +209,9 @@ create_cache(svn_cache__t **cache_p,
const char *prefix,
apr_pool_t *pool)
{
+ *cache_p = NULL;
+ return SVN_NO_ERROR;
+
if (memcache)
{
SVN_ERR(svn_cache__create_memcache(cache_p, memcache,