Stefan Sperling wrote on Tue, Feb 28, 2012 at 03:18:35 +0100:
> On Mon, Feb 27, 2012 at 03:25:00PM -0800, Jason Wong wrote:
> > I guess I am wondering that if this is the case, then why is it that
> > if the check-in fails, and then we manually check it in again using
> > tortoisesvn, that it works the second time?
> 
> As far as I know, we don't know what the underlying problem is yet.
> So I'm afraid I cannot answer all of your questions in a satisfactory
> manner because I don't know the answers.

I'm not really surprised by this situation.  I guess Jason means "the
same set of tree/text/prop/revprop changes" when he says "it", but by
the nature of the bug it may well have to do with some concurrency
issues on the server --- i.e., it depends on more than just the contents
being committed.

> Daniel might know more -- he has been following the problem more closely
> than me. And I hope he will correct me if I'm making any wrong or
> misleading statements in this post :)

Yeah, I'm following this thread :)

> > We have been using svn for a while now and I am
> > wondering what this means that for 1.6, that this issue has been
> > occurring from communications between 1.6 client and 1.6 server.
> 
> It has probably been happening since before the 1.7 release.
> The problem was first discovered in the ASF repository. The first
> occurrence there has been traced to a time before 1.7.0 was released.
> At the time the ASF server was still running 1.6, and the first commit
> that introduced the problem in the ASF repository was very likely
> a 1.6 client.
> 

AI We should check whether it _was_ a 1.6 client.

> > Also, is this bug something that svnadmin verify will not detect?
> > The last time we ran svnadmin verify, it said all was good.
> 
> Apparently, svnadmin verify won't find it.
> This should probably be fixed. But the priority right now is to
> understand why it is happening in the first place.
> 

AI The 'verify' part is a relatively low-hanging fruit, though.

AI File an issue for the pred-count bug
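
To make the 'verify' idea concrete, here is a toy sketch in Python (not
svn code; the function name and the input list are made up for
illustration).  It assumes the root node-rev gets a new predecessor in
every revision, so its predecessor count should grow by exactly one per
revision.  It flags only the revisions where that step is broken, since
later revisions merely inherit the offset.

```python
def find_pred_count_breaks(root_pred_counts):
    """Return the revisions whose root pred-count is not 'previous + 1'.

    root_pred_counts[r] is the (hypothetical) predecessor count stored on
    the root node-rev of revision r.
    """
    breaks = []
    for rev in range(1, len(root_pred_counts)):
        expected = root_pred_counts[rev - 1] + 1
        if root_pred_counts[rev] != expected:
            breaks.append(rev)
    return breaks


# r3 was linked to the wrong predecessor; r4 and r5 only propagate the offset.
suspect_counts = [0, 1, 2, 1, 2, 3]
broken_revs = find_pred_count_breaks(suspect_counts)
```

With that input only r3 is reported, which is the revision one would
want to look at first.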

> > If it is the case that this bug has been occurring for a long time,
> > what are the implications of the history links for affected
> > revisions?
> 
> 'svn log' skips some revisions where it shouldn't. This is the
> only harmful effect of this bug, as far as I know.
> 

As I'd mentioned upthread, the ASF repository triggering this bug also
has bogus minfo-cnt headers.  It's not known yet whether that is the
same bug, a separate bug, or a bug in something other than svn.

(that's also why I'm not filing an issue for the minfo-cnt bug yet --
it's not clear that the bug is in svn.)

> > When you say the history links are incorrect, does it
> > just put in a random value or is it actually unreadable values?
> 
> I don't know if it is random or somehow predetermined. That depends
> on how the wrong predecessor value actually comes about.
> 

The links are not garbage values -- they point to an existing location
(node-rev) in history.  They don't point to the right location.

> > Does this mean subsequent revisions that occur after these bad
> > revisions will propagate this bad information?
> 
> Yes, the predecessor count of subsequent revisions is off by
> some constant value.
> 

Indeed; but that's just error propagation, rather than a bug.
Practically it implies fixing the effects of the bug will involve
changing all the historical metadata.  (This probably means dump/load.
There is a way to implement this by doing in-place surgery (see l1488)
but it is of the "Don't try this at home" variety.)

The propagation of wrong counts, and the skipping of revisions during
backwards history walk, are both expected behaviours from the DAG layer
given the DAG in question (which includes a replacement of the root
node).  In English: the lower level APIs are performing correctly given
the violating-the-invariants-of-higher-layers state of the filesystem.
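
A toy model of the above, in Python (a sketch only: NodeRev, commit(),
walk_back() and build_history() are invented names, not the DAG layer's
API).  Each node-rev stores a predecessor link and a count derived from
that link.  One commit with a wrong link is enough to make the backwards
walk skip revisions and to leave every later count off by the same
constant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeRev:
    rev: int
    pred: Optional["NodeRev"]
    pred_count: int


def commit(rev: int, pred: Optional[NodeRev]) -> NodeRev:
    # The count is always derived from whatever predecessor is linked.
    count = 0 if pred is None else pred.pred_count + 1
    return NodeRev(rev, pred, count)


def walk_back(node: Optional[NodeRev]) -> List[int]:
    # Backwards history walk: just follow the predecessor links.
    revs = []
    while node is not None:
        revs.append(node.rev)
        node = node.pred
    return revs


def build_history(n_revs, bad_rev=None, bad_target=None) -> List[NodeRev]:
    nodes: List[NodeRev] = []
    for rev in range(n_revs):
        pred = nodes[-1] if nodes else None
        if rev == bad_rev:
            pred = nodes[bad_target]  # the one wrong link
        nodes.append(commit(rev, pred))
    return nodes


good = build_history(6)
bad = build_history(6, bad_rev=3, bad_target=0)

good_walk = walk_back(good[-1])
bad_walk = walk_back(bad[-1])
offsets = [g.pred_count - b.pred_count for g, b in zip(good, bad)]
```

Here bad_walk never sees r1 and r2, and offsets is zero up to r2 and a
constant from r3 onwards, even though commit() and walk_back() did
nothing wrong themselves.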

> > A developer asked me to pose the following question. If he was to
> > open a bad revision, would the client fail and give an error prompt
> > or would it display history information which could belong to other
> > files?
> 
> So far this hasn't triggered any noticeable errors apart from missing
> items when viewing the log history of a path. Once the problem was

Indeed.  And that's about all the direct effects it can have.  (Perhaps
some code is doing that history walk internally as part of another
higher-level operation, though.)

I guess the minfo-cnt bug mentioned earlier should normally be very obvious,
as the values are off by an order of magnitude.  If they are too high
then the tree walk to find svn:mergeinfo-ful nodes will not stop at the
root of trees that lack those nodes (so, inefficiency but not
incorrectness); if it's too low then it will ignore the svn:mergeinfo on
some nodes, so I expect merges will visibly misperform (but Stefan will
know better).
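
The two failure modes can be shown with a toy tree walk in Python (again
a sketch under assumptions: Node, stamp() and find_mergeinfo() are
hypothetical, and the real code differs).  Each node stores how many
nodes in its subtree carry svn:mergeinfo, and the walk descends only
into children whose stored count is nonzero.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    has_mergeinfo: bool = False
    children: List["Node"] = field(default_factory=list)
    minfo_cnt: int = 0  # stored count, may be wrong


def true_count(node: Node) -> int:
    return int(node.has_mergeinfo) + sum(true_count(c) for c in node.children)


def stamp(node: Node, override=None) -> None:
    # Store the correct count everywhere, except where 'override' lies.
    override = override or {}
    for child in node.children:
        stamp(child, override)
    node.minfo_cnt = override.get(node.name, true_count(node))


def find_mergeinfo(node: Node, visited: List[str]) -> List[str]:
    visited.append(node.name)
    found = [node.name] if node.has_mergeinfo else []
    for child in node.children:
        if child.minfo_cnt > 0:  # prune subtrees that claim to have none
            found += find_mergeinfo(child, visited)
    return found


def make_tree() -> Node:
    return Node("/", children=[
        Node("trunk", has_mergeinfo=True, children=[Node("trunk/src")]),
        Node("tags", children=[Node("tags/1.0")]),
    ])


def run(override=None):
    tree = make_tree()
    stamp(tree, override)
    visited: List[str] = []
    found = find_mergeinfo(tree, visited)
    return found, visited


ok_found, ok_visited = run()
high_found, high_visited = run({"tags": 10})   # count too high
low_found, low_visited = run({"trunk": 0})     # count too low
```

With the count too high the walk visits "tags" for nothing but still
finds "trunk"; with the count too low it never reaches "trunk" and
finds nothing.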

> discovered by inspection of 'svn log' output, a sanity check was put
> in place to prevent the problem from happening in new revisions.
> This check is what you are hitting. What we need to do now is to figure
> out why it's happening, and then fix that problem.

Daniel
(will read Jason's mail later -- time for breakfast.)
