How to fix corrupt revision in repo?
Greetings,

I have an SVN repo that is failing svnadmin verify on revision 192. For some reason the verify output says:

[...various successful revisions...]
* Verified revision 191.
svnadmin: Can't read file 'E:\Repositories\Client_Name\db\revs\0\162': End of file found

I'm not sure why the verification command for revision 192 would throw an error description for revision 162. Revision 192 affected a completely different part of the repository to revision 162, so there is no obvious relationship between them. All the revisions from 193 to 332 (HEAD) are ok. It might be a one-off.

This looks like an FSFS-backed repository (I am very new to SVN and inherited the server from someone else!). The server is VisualSVN 2.1.4, which is based on SVN 1.6.13. The clients are mostly TortoiseSVN 1.6.16, which uses SVN 1.6.17.

What steps should I take to fix the corrupted revision? Is there more information that I should provide (e.g. a copy of the rev 192 file)? This problem is causing checkouts and updates to fail for files that were last modified in that revision.

Regards,
David Hopkins
Serck Controls

= PRIVACY AND CONFIDENTIALITY NOTICE =
The information contained in this message is intended for the named recipient only. It may contain privileged and confidential information. If you are not the intended recipient, you must not copy, distribute, take any action in reliance on it, or disclose any details of the message to any person, firm or corporation. If you have received this message in error, please notify the sender immediately by reply e-mail and delete all copies of this transmission together with any attachments. The views or opinions expressed in this e-mail or any attachment are not necessarily those of Serck Controls Pty Ltd. NOTE - You should carry out your own virus checks before opening any attachment.
RE: How to fix corrupt revision in repo?
Thanks Daniel. Responses inline.

>> David Hopkins wrote on Thu, Sep 15, 2011 at 10:30:46 +0800:
>> I'm not sure why the verification command for revision 192 would throw
>> an error description for revision 162. Revision 192 affected a
>> completely different part of the repository to revision 162, so there
>> is no obvious relationship between them.
>
> Possibly due to rep-sharing? Does db/revs/0/192 contain the number "162"
> in ASCII decimal delimited by whitespace? You can check that with the
> following command:
>
>     grep -a "^text:" db/revs/0/192

Yes it does. Here's the context in the rev file:

id: y-31.0-32.r192/673830
type: file
pred: y-31.0.r31/264323
count: 1
text: 162 670867 6111 52486 5117fb0964ca1a78dd97447d23452e73 609f4745460d6e14860daff0803ee7024c54898c 191-5r/_m
cpath: [redacted]
copyroot: 32 [redacted]

Looking at the other nearby entries, they have "text: 192 [...]" instead of "text: 162 [...]". Is that likely to be the problem?

> Does 'svnadmin dump -r 162 >NUL' work ?

Yes it does.

> To answer your question: yes, most definitely a copy of the r192 (and/or
> r162) rev file would allow to pinpoint the problem, however you might
> not want to share those files on a public list as they may contain
> sensitive data (versioned file contents).

I'll find out if I can release the broken revisions in their entirety.

The corrupted revision doesn't actually contain anything particularly important (almost all the modified files in it have since been replaced by newer versions anyway). Can I fix the repository by dumping every revision except 192, and then reloading the good revisions into a new repo? Or will that cause problems for the revisions after 192, since one of the revisions no longer exists?

Regards,
David Hopkins
Serck Controls
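For readers following along, the "text:" line quoted in this message can be split into its fields with a few lines of Python. This is only a sketch: the field names (revision, offset, length, size, md5, sha1, uniquifier) reflect one reading of the FSFS structure notes and of Daniel's later explanation, and the names `TextRep` and `parse_text_line` are invented here for illustration.

```python
from typing import NamedTuple, Optional


class TextRep(NamedTuple):
    """Fields of a node-rev 'text:' line (names are assumptions)."""
    revision: int               # revision whose rev file holds the representation
    offset: int                 # byte offset of the representation in that rev file
    length: int                 # on-disk length of the representation, in bytes
    size: int                   # size of the expanded fulltext, in bytes
    md5: str
    sha1: Optional[str]         # may be absent in older repository formats
    uniquifier: Optional[str]   # may be absent in older repository formats


def parse_text_line(line: str) -> TextRep:
    """Parse a 'text: <rev> <offset> <length> <size> <md5> [<sha1> <uniq>]' line."""
    key, _, value = line.partition(":")
    if key.strip() != "text":
        raise ValueError("not a text: line: %r" % line)
    fields = value.split()
    if len(fields) < 5:
        raise ValueError("too few fields in text: line: %r" % line)
    rev, offset, length, size = (int(f) for f in fields[:4])
    sha1 = fields[5] if len(fields) > 5 else None
    uniquifier = fields[6] if len(fields) > 6 else None
    return TextRep(rev, offset, length, size, fields[4], sha1, uniquifier)


# The line quoted from db/revs/0/192 in this thread.
rep = parse_text_line(
    "text: 162 670867 6111 52486 5117fb0964ca1a78dd97447d23452e73 "
    "609f4745460d6e14860daff0803ee7024c54898c 191-5r/_m"
)
print(rep.revision, rep.offset, rep.length)  # prints "162 670867 6111"
```

Parsing the line this way makes it easy to compare the referenced revision against the revision of the file the node-rev lives in, which is the oddity being discussed here.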
RE: How to fix corrupt revision in repo?
David Hopkins wrote on Fri, Sep 16, 2011 at 08:30:14 +0800:
> > Here's the context in the rev file:
> > id: y-31.0-32.r192/673830
> > type: file
> > pred: y-31.0.r31/264323
> > count: 1
> > text: 162 670867 6111 52486 5117fb0964ca1a78dd97447d23452e73
> > 609f4745460d6e14860daff0803ee7024c54898c 191-5r/_m
>
> That tells you that the 6111 bytes starting at offset 670867(bytes) into
> the r162 rev file are a representation generating a file whose checksums
> and uniquifier are given later. See subversion/libsvn_fs_fs/structure
> for details --- basically, it's DELTA\n or PLAIN\n up through ENDREP\n.

That's interesting. It certainly explains the "end of file" error message that is getting thrown, because rev 162 is only 1,506 bytes long. Rev 162 was a deletion of a single file from a different folder in the repo, so I'd be surprised if it contained any file representations at all.

Rev 192 is 683,471 bytes long, so it *is* long enough for a 670867 byte offset to make sense.

> > Looking at the other nearby entries, they have "text: 192 [...]"
> > instead of "text: 162 [...]". Is that likely to be the problem?
>
> It's normal for r192 to contain "text: 162" if rep-sharing is enabled or
> if you did a copy-without-textmods from r162.

Ok. I think rep-sharing is probably enabled because this server was installed using SVN 1.6, and we haven't altered the setting. (It's on by default, yes?)

But I can see from the CPATH which file in r192 is referencing r162 (EDGE.CSV), and that reference doesn't make sense. The history of EDGE.CSV is as follows:

R31: EDGE.CSV added to repo
R32: one of the directories in EDGE.CSV's parent path was renamed
(R162: a single file in a completely different part of the repo was deleted. Literally the only part of their file path in common was the repo root folder. EDGE.CSV and the deleted file have no shared history, relationships, or even data in common - one is a CSV file and the deleted file was a binary archive!)
R192: EDGE.CSV was modified, along with several other files in the same folder. I've now checked, and every single other text: field in R192 references R192. There are no other revisions referenced.
R335: EDGE.CSV was deleted. This is because that file wasn't very important, and all the other files which changed in r192 were updated in later revisions and apparently can be successfully checked out/updated.

> > > To answer your question: yes, most definitely a copy of the r192
> > > (and/or r162) rev file would allow to pinpoint the problem, however
> > > you might not want to share those files on a public list as they may
> > > contain sensitive data (versioned file contents).
> >
> > I'll find out if I can release the broken revisions in their entirety.
>
> Perhaps someone would be willing to have a look at those two revision
> files privately.
>
> (In fact, I might be able to do this too. But I'm reluctant to make
> a promise or commitment about this.)
>
> > The corrupted revision doesn't actually contain anything particularly
> > important (almost all the modified files in it have since been replaced
> > by newer versions anyway). Can I fix the repository by dumping every
> > revision except 192, and then reloading the good revisions into a new
> > repo? Or will cause problems for the revisions after 192 since one of
> > the revisions no longer exists?
>
> That won't work if files after r192 are stored as deltas against the
> fulltext of r192.

Hmm, ok. I'm thinking about making a copy of the repository folder, and seeing what happens if I replace "text: 162" with "text: 192" in revs\0\192, since the offsets appear to pass the "smell test" for file size. Is there _any_ chance that that will work? Or are there other references I would also need to patch inside the revs\0\192 file?

I thought I'd try doing an svnadmin dump and then using svndumpfilter to exclude EDGE.CSV, since it seems to be the only thing with an invalid rev reference, but the svnadmin dump operation fails when it gets to r192, since it can't process the reference to r162 either.

Regards,
David Hopkins
Serck Controls
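The "smell test" described in this message is just a bounds check: a representation of a given length starting at a given offset can only exist if it ends at or before the end of the rev file. A minimal sketch using the sizes quoted in the thread (the function name `rep_fits` is made up for illustration):

```python
def rep_fits(rev_file_size: int, offset: int, length: int) -> bool:
    """Return True if [offset, offset + length) lies inside the rev file."""
    return offset >= 0 and length >= 0 and offset + length <= rev_file_size


OFFSET, LENGTH = 670867, 6111  # from the "text:" line in r192

# r162 is 1,506 bytes long, so the reference cannot be satisfied there.
print(rep_fits(1506, OFFSET, LENGTH))    # prints "False"

# r192 is 683,471 bytes long, so the same offset and length would fit.
print(rep_fits(683471, OFFSET, LENGTH))  # prints "True"
```

Passing this check only shows that the numbers are plausible for r192; it does not show that a representation actually starts at that offset.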
RE: How to fix corrupt revision in repo?
> Daniel Shahaf wrote:
> One more thing. The fact that in r162 one file was deleted *and no
> files were added or changed* implies that the only new representations
> in r162 would be directory representations --- it wouldn't add any
> *file* representations --- so the reference to r162 in the node-rev
> header (the sequence of ASCII lines of which the "text:" line is part)
> is almost certainly bogus.
>
> I'm curious to hear whether the problem was indeed that the noderev
> referred to r162 instead of r192.

Sadly, it wasn't. I've now experimented with that. The offset supposedly within r162 is listed as 670867 bytes, which is well outside the total length of r162 as we've already discussed. But it isn't a valid pointer within r192 either; offset 670867 points to the middle of one of the other rep blocks within the r192 file.

I've had a look at the other node-rev headers and it appears that all the rep blocks in the r192 file are fully accounted for by the node-revs which have text: 192. (That is, there are no representations in the r192 file which don't already have a valid node-rev header).

I've had a look through all the revs between 162 and 192 which are at least 600 KiB in size. But I can't find *any* rev files in the whole repository history leading up to 192 where an offset of 670867 points to the beginning of a DELTA or PLAIN representation.

So, I'm now assuming that both the reference to r162, and the offset of 670867 bytes, are bogus. But there aren't any obvious candidates for a non-bogus representation of that particular file update.

Given that the file with the bogus node-rev is unimportant, and has since been deleted from the repo, is there any way to patch the r192 rev-file so that the repository has enough internal consistency to produce a valid dump file?

At the moment it looks like the "nuclear option" is to check out the current version of everything and start a new repository with it. This *should* work because the corrupted file isn't included in recent revisions, so SVN won't need to de-reference the invalid reference in r192 when performing the check out. But if I can purge the broken-ness from the repo and keep the rest of the history, that would obviously be better. I certainly don't want to keep using a repo that doesn't validate and can't be dumped, though.

> > Daniel Shahaf wrote on Fri, Sep 16, 2011 at 15:37:11 +0300:
> > Quick reply, more verbose one might follow up later.
> >
> > Your reply breaks the nested quoting levels, please try to avoid it,
> > are you sending mail as text/plain?

Sorry about breaking the nested quoting. I'm using Outlook which is pretty mediocre as a plain-text email client. I was already using text/plain, but Outlook's quoting style wasn't right, so I was trying to manually fix the text-wrapping and quote marks. Clearly I wasn't getting it right. I've now found a couple more Outlook settings which will hopefully address the problem.

Unfortunately, it doesn't look like I'll be able to send you the actual rev file(s), at least not without a lot of inconvenience that I don't want to subject you to (ie an NDA, since we don't actually own the IP to any of the code which may be included in the rev file). Sorry about that.

Regards,
David Hopkins
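The manual search described in this message (looking for any rev file in which the offset lands on the start of a DELTA or PLAIN representation) could be automated roughly as follows. This is a sketch under assumptions: it treats the revs shard directory as a folder of files named by revision number, as in the db\revs\0 path shown earlier, and the function names are invented for illustration. It only checks the five bytes at the offset, not whether the representation that follows is valid.

```python
import os

# A representation starts with "DELTA" or "PLAIN", per the description in this thread.
REP_HEADERS = (b"DELTA", b"PLAIN")


def starts_representation(rev_path: str, offset: int) -> bool:
    """Return True if the bytes at `offset` in the rev file begin a representation."""
    if offset < 0 or offset >= os.path.getsize(rev_path):
        return False
    with open(rev_path, "rb") as rev_file:
        rev_file.seek(offset)
        return rev_file.read(5) in REP_HEADERS


def candidate_revs(revs_dir: str, offset: int, first_rev: int, last_rev: int) -> list:
    """List revisions in [first_rev, last_rev] whose rev file has a
    representation header at `offset`. Missing rev files are skipped."""
    hits = []
    for rev in range(first_rev, last_rev + 1):
        path = os.path.join(revs_dir, str(rev))
        if os.path.isfile(path) and starts_representation(path, offset):
            hits.append(rev)
    return hits
```

With a copy of the repository at hand, something like `candidate_revs(r"E:\Repositories\Client_Name\db\revs\0", 670867, 0, 192)` would be the equivalent of the search reported above, which found no candidates. Run it against a copy rather than the live repository.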
RE: How to fix corrupt revision in repo?
> Daniel Shahaf wrote on Monday, 19 September 2011 9:27 PM:
> You ought to be able to keep the rest of the history even without fixing
> the brokenness in r192. (as the file is deleted in HEAD, a checkout
> should work; and you also have the option of dumping the history while
> excluding the problematic file from it (via authz+svnsync/svnrdump or
> svndumpfilter).)

I'll look into the authz+svnsync/svnrdump option. Svndumpfilter doesn't work for me because the 'svnadmin dump' operation fails when it tries to process 192 (before I get a chance to use svndumpfilter to eliminate the bogus file). As far as I can tell svndumpfilter operates on dumpfiles that already exist, and can't actually stop svnadmin from trying to resolve the bogus node-rev header during the dump process. The authz+svnsync solution will hopefully allow me to effectively do that filtering at an earlier stage in the pipeline.

Thank you very much for all your help,
David Hopkins
Serck Controls
RE: How to fix corrupt revision in repo?
> > Daniel Shahaf wrote on Monday, 19 September 2011 9:27 PM:
> > You ought to be able to keep the rest of the history even without
> > fixing the brokenness in r192. (as the file is deleted in HEAD, a
> > checkout should work; and you also have the option of dumping the
> > history while excluding the problematic file from it (via
> > authz+svnsync/svnrdump or svndumpfilter).)
>
> I'll look into the authz+svnsync/svnrdump option. Svndumpfilter doesn't
> work for me because the 'svnadmin dump' operation fails when it tries to
> process 192 (before I get a chance to use svndumpfilter to eliminate the
> bogus file). As far as I can tell svndumpfilter operates on dumpfiles
> that already exist, and can't actually stop svnadmin from trying to
> resolve the bogus node-rev header during the dump process. The
> authz+svnsync solution will hopefully allow me to effectively do that
> filtering at an earlier stage in the pipeline.

For the benefit of anyone else who comes across this message thread in the future, I thought I'd post a final follow-up message with my results.

The authz+svnrdump solution *did* work for creating a dumpfile without references to the corrupted file revision. I ended up setting up a temporary server where I could set custom authz permissions, and downloaded a beta SVN 1.7 client so that I could use svnrdump rather than svnsync (which was much simpler to set up).

I've successfully loaded the purged dumpfile into a new repository which now works with svnadmin verify, svnadmin dump, svnadmin hotcopy etc.

Thanks once again for all your help (especially the authz+svnrdump suggestion).

Regards,
David Hopkins
Serck Controls
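The thread does not show the authz rules that were actually used, so the following is only a guess at what such a file might look like: read access to the whole repository for everyone, with access removed for the one problem path. The helper name `build_authz` and the path `/trunk/data/EDGE.CSV` are hypothetical (the real path was redacted), and a real setup may also need rules for the path the file had before the r32 rename.

```python
def build_authz(excluded_paths, repo_name=None):
    """Return authz file text that lets everyone read the repository
    except the given paths. `repo_name` is optional; when given, rules
    are written in the per-repository "[name:/path]" form."""
    prefix = "%s:" % repo_name if repo_name else ""
    lines = ["[%s/]" % prefix, "* = r", ""]
    for path in excluded_paths:
        # An empty right-hand side means "no access" in path-based authz.
        lines += ["[%s%s]" % (prefix, path), "* =", ""]
    return "\n".join(lines)


# Hypothetical path, standing in for the redacted location of EDGE.CSV.
print(build_authz(["/trunk/data/EDGE.CSV"]))
```

With rules like these on a temporary server, running svnrdump dump against the repository URL (redirecting its output to a file) should produce a dumpfile in which the hidden path never appears, which is the outcome reported in the final message above.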