Re: 350GB SVN repo creates around 1MB revision for simplest task
On Wed, Oct 13, 2010 at 00:45, FAISAL YAQOOB, BLOOMBERG/ 731 LEXIN wrote: > > > This all started when I noticed that my repository size is increasing at a > daily rate of 1GB. I did a simple test. Created a branch/tag of an existing > folder that had a size of 35KB. I took note of revision number and went to > $REPO/db/revs//rev-number/ and checked the size of the revision. It > was 1 mega byte. That sounds fishy. Any ideas on what might be wrong here. My > repo is about 350GB in size with about 600,000 revisions. > > P.S. I have already started a rebuild of the whole repository to see if that > makes any difference but it will probably take days to complete. > > > Question: Is there anything in svn that when it reaches a certain size every > commit will be that large? Does your repository contain a directory with very many entries? Are the changes that produce the large commits being made in or below such a directory? Let's assume you commit a single change to a single file in your repository. Let's further assume the file is located here: /project/trunk/some-really-large-directory/notes/blah.txt When you commit the change to blah.txt, the new revision will rewrite the directory nodes between 'blah.txt' and the root of the repository: /project/trunk/some-really-large-directory/notes, /project/trunk/some-really-large-directory, /project/trunk, /project, /. When rewriting a directory node, FSFS always stores the new version in its entirety. (This is different from the way changes to files are stored, which are generally as differences to some previous version of the same file.) If /project/trunk/some-really-large-directory/ contains, say, 10'000 files, then each commit to blah.txt will store a full copy of this directory (with its 10'000 names) in your repository. I noticed this when I started keeping a personal wiki under version control a few years ago. It was a flat directory of over 10'000 text files. I quickly noticed that commits were pretty big. 
(I've since switched to git for that task, for this and other reasons.) see also http://svn.apache.org/repos/asf/subversion/trunk/notes/subversion-design.html#server.fs.struct.bubble-up // Ben
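To put rough numbers on the effect described in this reply, here is a small estimate in Python. It is only a sketch: the 60 bytes per directory entry and the entry counts of the other directories on the path are assumptions made up for illustration, not measurements of the FSFS on-disk format. Only the 10'000 entries come from the message.

```python
# Rough estimate of the directory data rewritten in full by one commit.
# Assumption: each directory entry costs a fixed number of bytes; the real
# cost depends on name length and node-id size.
BYTES_PER_ENTRY = 60  # hypothetical average, not a measured value


def rewritten_directory_bytes(entry_counts, bytes_per_entry=BYTES_PER_ENTRY):
    """Sum the full size of every directory node between the file and the root."""
    return sum(count * bytes_per_entry for count in entry_counts)


# Entry counts for /, /project, /project/trunk,
# /project/trunk/some-really-large-directory and .../notes.
# All hypothetical except the 10'000 taken from the message.
path_to_blah = [2, 5, 3, 10_000, 4]

overhead = rewritten_directory_bytes(path_to_blah)
print(f"about {overhead / 1024:.0f} KiB of directory data per commit")
```

Under these assumptions the one large directory accounts for nearly all of the overhead, which fits the observation that even a trivial change produces a large revision file.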
Re: Periodically merge between trunk->branch and branch->trunk
On Thu, Dec 9, 2010 at 20:27, Kris Deugau wrote: > Daniel Albuschat wrote: >> >> I'd like to create a branch from trunk and periodically merge trunk >> into my branch to stay up to date with what happens in trunk. >> At some point, the feature in my branch reaches a kind of stability >> that is OK for trunk, so I merge it back to trunk. >> The difference to the standard situation is that I want to continue >> working on the branch, because the feature is not completely finished, >> yet, or it needs further enhancement. >> Currently the only solution I see is to reintegrate the branch to >> trunk and then re-create the branch. This has the shortcoming that all >> developers working on the branch have to switch to the new branch >> (although it is the same URL) to be able to work with it, right? This >> is ok when I'm working alone on my branch, but with a development >> team, it becomes tricky to make sure that everyone properly switch to >> the new branch. > > This is covered in the book: > > http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.reintegratetwice Before going down that route, be sure to have understood the implications of issue 3650: http://subversion.tigris.org/issues/show_bug.cgi?id=3650 short summary: this technique renders `svn log --use-merge-history` useless. // Ben
Re: svnadmin: Can't write to stream: File too large
On Mon, Jan 3, 2011 at 14:16, Torsten Krah wrote: > Hi, > > ive got a large repository (using svn version 1.5.1 r32289) and want to dump > the repository. > Dumping with svnadmin dump $repo > dumpfile does result in: > > svnadmin: Can't write to stream: File too large > > The file at this time is around 17 GiB in size. > The only workaround found at the moment is to bzip or gzip things via a pipe. > > But why i got this error and what is the solution - thought svn should support > such big dump files, shouldn't it? I don't think the error condition is originating in Subversion. Subversion just sees that it's writing to standard output, which your shell has arranged to write into a file. It seems more likely that you are running into a limitation of the underlying file system. What *nix flavor are you running? What's the file system you're trying to write the dump file to? "Around 17 GiB" seems like a strange maximum size. It isn't 17179869184 bytes, perchance? // Ben > thx > > Torsten > > -- > Bitte senden Sie mir keine Word- oder PowerPoint-Anhänge. > Siehe http://www.gnu.org/philosophy/no-word-attachments.de.html > > Really, I'm not out to destroy Microsoft. That will just be a > completely unintentional side effect." > -- Linus Torvalds >
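The number asked about in this reply is a power of two, which is why it would point at a file system or tool limit rather than at Subversion. A quick check in Python (the helper name is made up for this sketch):

```python
GIB = 2 ** 30


def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


suspect_size = 17179869184  # the byte count mentioned in the reply

print(is_power_of_two(suspect_size))  # whether the size is an exact power of two
print(suspect_size / GIB)             # the same size expressed in GiB
```

A dump file that stops at exactly this size would suggest a hard limit in the layer that writes the file.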
Re: examining repository for files that only I have changed
On Fri, Jan 21, 2011 at 17:18, Woodworth, James wrote: > Hello, > > Over the past three weeks I have changed and committed some files. I need > to somehow recurse through the directories and show me only the files I have > changed. How do I do this? > > I suppose I could write a script to do this, but I suspect Subversion is > sophisticated enough, and I am just missing something.

Subversion will give you the log as XML, if you ask for it:

svn log --verbose --xml svn://server/repository

Here's an XSLT that will extract the names of *files* added, deleted or modified by a given *author*:

-- files-modified-by-author.xsl --
http://www.w3.org/1999/XSL/Transform"; version="1.0"> smithma
-- end file --

Here's a BASH script which will call svn for the log, pull out the files using xsltproc and the above XSL transformation and then toss out duplicates using sort:

-- files-modified-by-author.bash --
REPOSITORY=svn://server/repository
AUTHOR=username

function get_log () {
    svn log --xml --verbose "$REPOSITORY"
}

function select_files () {
    xsltproc --stringparam author "$AUTHOR" files-modified-by-author.xsl -
}

get_log | select_files | sort -u
-- end file --

Edit the values of REPOSITORY and AUTHOR as necessary for your repository and user name. Then, you can call the script as follows and record the resulting list of file names in a file for perusal at your leisure:

bash files-modified-by-author.bash > files-modified-by-author.txt

HTH // Ben
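The same extraction can be sketched in Python with only the standard library. This is an illustration, not the stylesheet from the message: it assumes the log XML carries a kind="file" attribute on each path element (newer servers send it; older ones may leave it empty), and the sample log below is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample in the shape produced by `svn log --verbose --xml`.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
  <logentry revision="3">
    <author>smithma</author>
    <paths>
      <path kind="file" action="M">/trunk/a.txt</path>
      <path kind="dir" action="A">/trunk/docs</path>
    </paths>
  </logentry>
  <logentry revision="2">
    <author>jonesb</author>
    <paths>
      <path kind="file" action="A">/trunk/b.txt</path>
    </paths>
  </logentry>
  <logentry revision="1">
    <author>smithma</author>
    <paths>
      <path kind="file" action="A">/trunk/a.txt</path>
    </paths>
  </logentry>
</log>
"""


def files_changed_by(log_xml, author):
    """Return the sorted, de-duplicated file paths touched by one author."""
    files = set()
    for entry in ET.fromstring(log_xml).iter("logentry"):
        if entry.findtext("author") != author:
            continue
        for path in entry.iter("path"):
            # Assumption: the server reports kind="file" for files.
            if path.get("kind") == "file":
                files.add(path.text)
    return sorted(files)


print("\n".join(files_changed_by(SAMPLE_LOG, "smithma")))
```

To run it against a real repository, the XML would come from the svn log command shown in the message instead of SAMPLE_LOG.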
Re: where is the missing versions
On Sun, Apr 10, 2011 at 19:28, Gu Shiyuan wrote: > Hi, > svn log gives me a list of the history, but I notice that some versions > are skipped. In the list, r44 is followed by r49 while r45-r48 is missing. > How can I find out those versions? I remember I made some important comments > when I commited those versions, but I don't not why they are gone. Thanks. > http://svnbook.red-bean.com/en/1.5/svn.forcvs.revnums.html Revision numbers are not per-file in Subversion. They apply to the whole repository. If the svn log of some particular file or directory does not show an entry for, say, r5, that just means that the file or subtree in question was not changed in that revision. // Ben
Re: Evil UTF-8 Character in filename in repo causing issues on my wc
On Thu, Jun 16, 2011 at 18:24, Geoff Hoffman wrote: > > > On Wed, Jun 15, 2011 at 11:19 PM, Markus Schaber > wrote: >> >> Hi, Geoff, >> >> Von: Geoff Hoffman [mailto:ghoff...@cardinalpath.com] >> >>> I have a file with some (I believe) Portuguese characters in the >> >>> filename that someone managed to store in the repo without any >> >>> problem, >> >>> and I checked it out without issues, too. However, now on my working >> >>> copy, it thinks that file is locally new. >> >> Maybe it helps if you use a repo browser to rename the file to an >> >> ASCII-Only name directly in the repository? >> >> > That's all I ever really wanted to do, but I cannot, at least, I don't >> > know how to type the characters in the >> > filename of the file in svn without copy-paste from the svn ls terminal >> > output on Mac OS X, which I think has >> > already converted the filename it just printed, so I get a file not >> > found error when I try to rename or delete >> > it. It may have worked if I had ssh'd into the RHEL server, not sure. >> > It's a bit unclear. >> >> I thought of some graphical repository browser (like the one built into >> TortoiseSVN for example, I guess such things also exist for MacOS), it lets >> you browse the repository and select the file to rename directly in the >> repository, without the need of a local checkout / working copy. >> > > > Yeah, if I had more time I probably should fiddle with it. Our one guy here > on Windows using Tortoise has no issues with the same file, so it is indeed > a problem specific to Mac, as Stefan pointed out. Given that the issue > presents itself in Terminal and NetBeans IDE, it's safe to say any other > graphical SVN client on Mac would complain, too, but I didn't test it. IIRC > the graphical clients are using the command line under the hood. Yes, any graphical client working on a *working copy* on the mac would complain too. 
But a hypothetical graphical repo browser that operates directly on the repository isn't affected by HFS+'s Unicode normalization. // ben
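The normalization issue can be seen without a Mac. The sketch below uses Python's unicodedata module and a made-up Portuguese file name; it assumes the repository stores the composed (NFC) form while the file system hands back a decomposed form. HFS+ uses its own variant of decomposition, so plain NFD is only an approximation here.

```python
import unicodedata

# Hypothetical file name with Portuguese characters.
name = "a\u00e7\u00e3o.txt"  # "ação.txt" written with composed characters

composed = unicodedata.normalize("NFC", name)    # form assumed to be in the repository
decomposed = unicodedata.normalize("NFD", name)  # roughly what a decomposing file system returns

# Both render identically, but they are different strings with different
# lengths, so a byte-for-byte comparison treats them as two different files.
print(composed == decomposed)
print(len(composed), len(decomposed))

# Normalizing both sides to the same form makes them compare equal again.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

A client that compares names from the working copy against names from the repository without such a step would see the file on disk as unversioned and the versioned one as missing.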
how to get a list of unmerged revisions?
I've got a trunk and a maintenance branch. I periodically merge changes from the maintenance branch to trunk. Merge tracking is a help here. Commits on the maintenance branch begin with an ID identifying the issue, backlog item or user story. When I merge changes to trunk, I'd like to merge those belonging to the same ID together, but I don't want to mix different IDs in the same merge. To do this, I explicitly name the revisions I want to merge on the command line. Now, how do I get Subversion to tell me which revisions of the maintenance branch have not been merged to trunk? What I've been doing is either (1) firing up TortoiseSVN, which shows already merged revisions in grey when you start a log viewer from the merge dialog box or (2) actually doing the merge and then meditating over the diff of the svn:mergeinfo property before reverting the merge again and doing it properly. Both of these approaches are awkward in the extreme. Surely I'm overlooking something obvious. // Ben
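One command that may be what is being asked for is `svn mergeinfo --show-revs eligible SOURCE TARGET`, which lists the revisions of SOURCE not yet merged to TARGET. The per-ID grouping described above could then be sketched in Python. Everything below is hypothetical: the "ABC-123" ID format, the revision numbers and the log messages are invented, and the code works on data already fetched rather than calling svn itself.

```python
import re
from collections import defaultdict

# Assumption: commit messages start with an ID such as "ABC-123".
ISSUE_ID = re.compile(r"\s*([A-Za-z]+-\d+)")


def group_by_issue(messages):
    """Map each issue ID to the ascending list of revisions that mention it.

    `messages` maps revision number to log message, for the revisions that
    are still eligible for merging.
    """
    groups = defaultdict(list)
    for revision in sorted(messages):
        match = ISSUE_ID.match(messages[revision])
        key = match.group(1) if match else "(no id)"
        groups[key].append(revision)
    return dict(groups)


# Invented example data.
eligible = {
    1207: "BUG-17 fix null check",
    1201: "BUG-17 add regression test",
    1204: "STORY-4 new export dialog",
    1210: "tidy whitespace",
}

for issue, revisions in group_by_issue(eligible).items():
    print(issue, " ".join(f"-c {r}" for r in revisions))
```

Each printed line gives the revisions for one ID in a form that can be pasted after `svn merge`, so one ID is merged per commit to trunk.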
Re: SVN Dump Question
On Tue, Feb 16, 2010 at 18:39, Justin Connell wrote: > The reason, I'm asking such strange questions is that I have a very abnormal > situation on my hands here. I previously took a full dump of the repo (for > the reason you implied) where the original size of the repo on disk was 150 > GB, and the resulting dump file ended up at 46 GB. This was quite unexpected > (the dump is usually larger than the repos on smaller repos that I have > worked on). Such a large difference would make me suspicious, even if you're using --deltas when generating the dump. A back-of-the-envelope calculation says that FSFS is wasting some space due to internal fragmentation (partly filled file system blocks). I'd estimate less than 10GB, so that's not enough to explain this discrepancy. (Estimate: about 5.5 GB wasted in storing revision properties, assuming 4KB file system block sizes, 1.5M revisions and circa 140 bytes of actual revprop content per revision; another 2.8 GB wasted on the revs themselves, unless they've been packed, assuming the last file system block of each file averages half full.) > Just as a sanity check, this is what I was trying to accompliesh: > > Scenario - The repo needs to get trimmed down from 150 GB to a more > maintainable size. We have a collection of users who access the repository > as a document revision control system. Many files have been added and > deleted over the past 2 years and all these transactions have caused such an > astronomical growth in the physical size of the repo. My mission is to solve > this issue preferably using subversion best practices. There are certain > locations in the repo that do not have to retain version history and others > that must retain their version history. 
> > Proposed solution - > > Take a full dump of the repo > run a svnadmin dumpfilter including the paths that need to have version > history preserved into a single filtered.dump file > export the top revision of the files that do not require version history to > be preserved > create a new repo and load the filtered repo > import the content form the svn export to complete the process > > Is this a sane approach to solving the problem? and what about the size > difference between the dump file and the original repo - am I loosing > revisions (the dump shows all revision numbers being written to the dump > file and this looks correct). Any files that were created outside of the repository portion being selected by dumpfilter, but later copied into said portion, will cause you problems. I ran into that when trying to split up one of my larger repositories. At the time, I gave up and moved on to more productive endeavors. Looking back on it: perhaps I could have made a series of incremental dumps, the divisions being at the points in history where the problematic renames took place. I then could have filtered each of the dumps separately and loaded them into a fresh repository. Sounds fiddly. > Another aspect could also be that there are unused log files occupying disk > space (we are not using Berkley DB though) is this a valid assumption to > make when using the FS configuration. FSFS does not write logs. > Thanks so much to all who have responded to this mail, and all of you who > take the time and read these messages > > Justin >
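The fragmentation estimate given earlier in this reply can be reproduced with a few lines of Python. The inputs are the ones stated in the message (4KB blocks, 1.5M revisions, about 140 bytes of revprop content, last block of each rev file half full on average); treating "GB" as GiB is an assumption of this sketch.

```python
GIB = 2 ** 30

REVISIONS = 1_500_000   # from the message
BLOCK_SIZE = 4096       # 4KB file system blocks, from the message
REVPROP_BYTES = 140     # approximate revprop content per revision

# Each revprop file occupies one whole block but uses only ~140 bytes of it.
revprop_waste = REVISIONS * (BLOCK_SIZE - REVPROP_BYTES)

# Each unpacked rev file wastes, on average, half of its last block.
rev_waste = REVISIONS * (BLOCK_SIZE // 2)

print(f"revprops: {revprop_waste / GIB:.1f} GiB wasted")
print(f"revs:     {rev_waste / GIB:.1f} GiB wasted")
print(f"total:    {(revprop_waste + rev_waste) / GIB:.1f} GiB wasted")
```

The total stays below 10 GiB, so under these assumptions block-level waste alone does not account for a 150 GB repository producing a 46 GB dump.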
Re: Distributed Subversion Repositories
On Wed, Feb 17, 2010 at 11:08, Mark wrote: > I have the following problem. Repository A is used by a lab of developers. 1 > developer needs to work off site against the code base held in A, for an > extended period of time. He requires version control, but cannot gain access > to Repository A. To solve this we can dump/mirror A into repository B. > During this period A and B will independently updated. When the off site > developer returns we need to combine B back into A. Any advice on whether > this is possible under Subversion, should we be dumping, how to combine, > pitfalls and options/hints much appreciated. I use git-svn to work in this fashion, even when I'm in the office. (It allows me to clean up my local commits before really publishing them to the central repository.) I've been happy with this solution, though it does come with the downside of needing to understand two version control systems (git and svn) in order to use this competently. In my case, that proved profitable because what I've learned about version control from broadening my horizons to include git has made me a more competent svn user as well. More recently, I've been experimenting with bzr's built-in support for making local branches from svn repositories. I haven't really banged on it very hard, so I'm unsure about how robust it is, but it looks pretty nice. I've read that mercurial (hg) also has some kind of svn bridging analogous to git-svn. All three of these approaches would give your developer the ability to make commits locally (into bzr, git or hg) and then replay those commits into the central (svn) repository when your developer is on site again. (This is different and more powerful than just using one of these tools to put an svn working copy under local version control.) For the record, I was a heavy user of svk before discovering git-svn, but I was never really very happy with svk. I found it slow, poorly documented and prone to fail in confusing ways. // Ben
Re: Distributed Subversion Repositories
On Thu, Feb 18, 2010 at 03:59, Vincent Lefevre wrote: > On 2010-02-17 11:18:18 +, Julian Phillips wrote: >> If using a different tool is an option, then there are tools that let you >> interact directly with Subversion repositories from various other SCM >> tools, e.g. >> >> http://mercurial.selenic.com/wiki/WorkingWithSubversion >> http://flavio.castelli.name/howto_use_git_with_svn >> >> Then you don't have to worry about manually commiting back to Subversion >> ... > > But do they support properties? git-svn doesn't fully support properties. I mean you can proplist and propget, but not propset. git-svn itself ignores the various svn:* when checking out, except for svn:executable. So no eol-style, needs-lock, externals, etc. When I've needed control over properties, I've just used svn directly. git-svn works well for me in the way I use it, but really I'm using it as a glorified patch-queue-manager. // Ben
Re: dumpfilter exclude question
On Fri, Feb 19, 2010 at 02:46, Justin Connell wrote: > Hi, > If I use a dumpfilter exclude /parentpath/folder_to_drop will the content of > the parentpath be preserved? parentpath and everything in parentpath (except folder_to_drop and its contents) will be preserved. // Ben
Re: SCM, Content-Management and cherry-picking in big project
On Sun, Feb 28, 2010 at 21:33, Pacco wrote: > Hi, > > I'm responsible for the content of a single package within a bigger software > (several million lines of code). I'm experienced with ClearCase, Dimensions, > CVS, git, mercurial, etc. > Now, management decision was made to use Subversion over one of the big > players. > But on opposite side I'm confronted with customer's request of stable > releases, quick and immediate rollbacks, multiple release branches. > > So, I want to get the very best out of Subversion from the Content > Management point of view. I want individually select changes for my release > version. > My intentional idea looks like that: > > 1) Multiple repositories for release-results (binaries, libraries), > development of different package-branches, etc. - If you're going to put your build products under *source* control, then by all means keep them in a separate repository. - It doesn't make sense to spread branches belonging to one project/package/product/whatever across more than one repository. History does not cross repository boundaries. I'm not sure if that's what you are suggesting... > 2) Developers are working on one repository. After having made their > implementations or changes, they commit and copy/create a tag-version, i.e. > BUG_1234_TAG. Within the same repository that they were just working in, I trust. > 3) Official release planning: > Based on last release-branch, i.e. REL_0001, I want to compose the next > release by individually taking the different tags. For example: > > REL_ (old release) > BUG_1010_TAG > BUG_1020_TAG > > but not BUG_1030_TAG and BUG_1040_TAG. > Unfortunately I'm not aware of the commiting order! BUG_1010_TAG could have > been commited after BUG_1020_TAG. I don't think I'm following, but it sounds like you intend to merge ("compose") the BUG_***_TAG tags into your release branch for your next release. That can be done, provided it's all in the same repository. 
> 4) After populating this release-working-directory, the official builds and > tests are done. A tag or branch will be copied to mark this release-version, > the release outcome are commited to the release-repository. The "release repository": is this the one that gets the build products? > So my questions are: > > - Is this scenario possible? Especially the cherry-picking of tags, not > file-versions? As far as Subversion is concerned, tags and branches are the same thing. But I can't really answer your question because I don't fully understand what you're proposing. > - Is there a better way of selecting individual packets (tags) for composing > new working-directory? I don't know. How are you going about selecting them? // Ben
Re: SVN Server Best Practices?
2010/3/3 Mariusz Droździel : > Hello, > > After spending almost 2 days on recovering broken FSFS, which turned > out to be broken for over year already I have a question. What are > suggested best practices for SVN Repo? Some kind of cronbot which > would do a verify and a full checkout dialy comes to mind. What else? > Are there any scripts to do dialy checks on repo out there already? > What do you suggest? Thanks in advance. > One idea: I have read-only (svnsync) clones of all my important repositories in a separate building from the real svn server. This happened more by accident than by design: my continuous build checks out from the clones instead of the original repository. That way, I don't have to worry about my build jobs flooding the central server with polling to see if anything has changed yet. (There is some of this, however, since there's a cron job to keep the clones up-to-date.) This has helped me diagnose and repair repository corruption in one case, and it gives me a warm fuzzy feeling (despite the fact that I've been told that the central svn server is backed up). // ben
need heuristic to order the application of path elements a'la svn log --xml --verbose
svn log --xml --verbose emits <logentry> elements, each of which contains a number of <path> elements describing the changes made in that revision:

...
/trunk/src/com/example/courts/model
/trunk/src/com/example
/trunk/src/com/example/courts/model/CourtsObject.java
...

My difficulty is that these path entries seem to come in arbitrary order, and it's not clear to me how to efficiently reorder them such that they may be applied to a graph representing the previous revision's state to produce the expected state of this revision.

- Top to bottom can't be right because that has us adding ".../example/courts/model" before we have added ".../example".
- Bottom to top can't be right because that has us adding ".../model/CourtsObject.java" before we've added ".../example/courts/model".
- Start at the leaves and work up is wrong for Add.
- Start at the root and work down is wrong for Delete. (Not shown here, but I've seen examples that boil down to a delete of /foo/bar listed before a delete of /foo/bar/baz.)

I've run into this question because I'm hacking on a little project in Clojure which sucks in an svn-log.xml, attempting to replay the structural changes described by the log into a history of the repository represented as a vector (indexed by revision number) of hash-maps. Directories become nested hash maps. Files just map a name to 'true' (for lack of anything better).

[ ;;; r0
  {}
  ;;; r1
  {"trunk" {"src" {"com" {"example" {"Foo.java" true}}}}, "tags" {}, "branches" {}}
  ;;; ...
]

My aim, such as it is, is to have a more efficient way than svn ls -R to know which file names are currently in use in the repository. It's also just a nice example to hack on. I've tried a few different approaches, none satisfactory. My most recent attempt is brute-force recursive backtracking, but this blows the stack as soon as a revision with a few thousand changes shows up. Before I invest the time to rewrite things to remove the recursion and manage the stack explicitly as a list, I'd like to make sure I'm not Doing It Wrong.
Maybe there's some obvious heuristic I'm missing? // Ben
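One candidate heuristic, sketched in Python rather than Clojure: sort the changes of a revision by path components, so that every parent is handled before its children, and make deletes tolerant of paths that are already gone. This is an untested-in-the-wild sketch under stated assumptions: each change is an (action, kind, path) tuple, copies are treated as plain adds (children brought in by a copy are not reconstructed), and "M" never changes structure.

```python
import copy


def components(path):
    """Split '/trunk/src' into ('trunk', 'src')."""
    return tuple(part for part in path.split("/") if part)


def apply_changes(previous, changes):
    """Return the tree after applying one revision's changes to `previous`.

    Directories are nested dicts, files map to True, as in the message.
    """
    tree = copy.deepcopy(previous)
    # Parents sort before their children because a tuple sorts before any
    # longer tuple that starts with it.
    for action, kind, path in sorted(changes, key=lambda c: components(c[2])):
        parts = components(path)
        if not parts:
            continue  # a change on the root itself has no structural effect
        *parents, name = parts
        node = tree
        for part in parents:
            node = node.get(part) if isinstance(node, dict) else None
        if not isinstance(node, dict):
            continue  # parent is missing, e.g. it was deleted just before
        if action == "D":
            node.pop(name, None)  # tolerant: the child may already be gone
        elif action in ("A", "R"):
            node[name] = {} if kind == "dir" else True
        # "M" changes content or properties only, never the structure.
    return tree


r1 = {"trunk": {"src": {"com": {}}}}
r2 = apply_changes(r1, [
    ("A", "dir", "/trunk/src/com/example/courts/model"),
    ("A", "dir", "/trunk/src/com/example"),
    ("A", "file", "/trunk/src/com/example/courts/model/CourtsObject.java"),
    ("A", "dir", "/trunk/src/com/example/courts"),
])
r3 = apply_changes(r2, [
    ("D", "file", "/trunk/src/com/example/courts/model/CourtsObject.java"),
    ("D", "dir", "/trunk/src/com/example/courts/model"),
])
print(r2)
print(r3)
```

Because the loop is a single sorted pass, there is no recursion to blow the stack, whatever the number of changes in a revision.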
Re: svn+ssh: Expected FS format '2'; found format '4'
On Sat, Apr 17, 2010 at 16:58, Rainer Dorsch wrote: > and a more recent version in a non-standard dir (which is added to my path > in .bash_profile) Add it to your path in .bashrc instead. .bash_profile only gets loaded by interactive login shells. > I expect that .bash_profile is not executed when I do svn+ssh, but it is when > I do ssh into the host regularly. Yup. But .bashrc should be sourced in both cases. > I have now root access to the server containing the repository, I cannot use > keypair authentification. This isn't relevant. Besides, you *are* using keypair authentication: > r...@blackbox:~/Managed/Daten$ svn checkout > svn+ssh://rdor...@hdcl037/var/svn/OSSM/svn-repository/trunk OSSM > > Enter passphrase for key '/home/rd/.ssh/id_rsa': > rdor...@hdcl037's password: > svn: Expected FS format '2'; found format '4' // ben
Re: svn+ssh: Expected FS format '2'; found format '4'
On Mon, Apr 19, 2010 at 08:08, B Smith-Mannschott wrote: > On Sat, Apr 17, 2010 at 16:58, Rainer Dorsch wrote: > ... >> I have now root access to the server containing the repository, I cannot use >> keypair authentification. > > This isn't relevant. Besides, you *are* using keypair authentification: > >> r...@blackbox:~/Managed/Daten$ svn checkout >> svn+ssh://rdor...@hdcl037/var/svn/OSSM/svn-repository/trunk OSSM >> >> Enter passphrase for key '/home/rd/.ssh/id_rsa': >> rdor...@hdcl037's password: >> svn: Expected FS format '2'; found format '4' On second thought, I stand corrected: you're *trying* to use keypair authentication, but it's not working, just as you said. // Ben
Re: Size of SVN Transaction vs Actual commit size
On Fri, May 7, 2010 at 06:48, Ravi Roy wrote: > > On Thu, May 6, 2010 at 3:37 PM, Ravi Roy wrote: >> >> Hi >> >> General question about Transaction size versus actual file commit size. I >> am getting strange results when I am trying to commit 1.28 file and >> transaction size I am getting is 5538 bytes. Does somebody knows the mystery >> ? > > > Sorry, Realised a mistake, please read 1.28 MB file. >> >> >> I expect transaction size to be the same as actual commit size ? or some >> compression in between ? >> >> Can somebody throw some light on this please? This question is both confused and confusing. You'll have more success if you use established Subversion terminology correctly and avoid making up your own terminology without defining it. (I know it's a bit of a chicken-and-egg problem for a beginner. See: http://svnbook.red-bean.com/en/1.5/index.html) My guess: Subversion is working as designed and storing only the compressed difference between the newest version of your 1.28 MB file and some previous version of the same file. But, really, I can only guess: - Is the "1.28 MB file" new to the repository, or did you commit changes to an existing 1.28 MB file? - What is this "transaction size" of which you speak? The size of a file like $REPO/db/revs/12/12345? - Why wouldn't you say "revision" in this case? - Or are you really using a hook script to peek at the actual transaction during the commit before it becomes a revision? - What is this "actual commit size" of which you speak? - I assume you can successfully retrieve the "1.28 MB" file from the repository. If so, the information must be there somewhere even if you can't quite explain it to yourself. - On that last point: http://svnbook.red-bean.com/en/1.5/index.html will help. // Ben >> Thanks >> >> -RR >> >> >
Re: compact repository (many files)
On Wed, May 26, 2010 at 19:07, Paul Ebermann wrote: > Mark Phippard wrote: >> On Wed, May 26, 2010 at 11:53 AM, Paul Ebermann wrote: > [...] >>> Is there any way to reduce the file number of the repository without >>> throwing away >>> information? As in, throw the changes in revisions 0 ... 999 together in >>> one file (and >>> this way even safe some space for better compression)? (I don't care for >>> worse >>> performance, as those old revisions are used only very seldom.) >>> >>> I found nothing about this by extensive googling so I assume such a >>> function is not yet >>> implemented. Is this really a new idea, or was it discussed before and >>> rejected? Or is >>> there a simple workaround? >> >> Subversion 1.6 includes the svnadmin pack command which will convert >> every 1000 revisions into a single revision file. > > Ah, nice to now, so I now only have to wait (or lobby) for the next software > update on one > accessible computer. (The server works with older clients, IIRC.) > > (Thanks also to the other answerers.) > > (It shows I don't find the right search words.) > >> It does not pack >> the revprops though, so it will only save you about 2000 files instead >> of 4000. > > I think packing the revprops would (from a space viewpoint) be even more > useful than > packing the revisions, since most of these are quite small (similar) text > files. > > Statistics: only 91 of those have a size of more than 150 bytes, the combined > size > (including "file: size" like lines for each) is 356 KB, while they now use > 2.56 MB due to > block size of 1K (= 7 times the space), even without any compression. And I > would save > those 2000 files. > True, but revprops are mutable, while revs are immutable. That said, packing of revprops has been implemented on trunk and will be coming with 1.7. You might consider the option using bdb instead of fsfs for your repository. I wouldn't normally prefer bdb, but it does create far fewer files than fsfs. // ben
Re: compact repository (many files)
On Thu, May 27, 2010 at 16:31, Paul Ebermann wrote: > B Smith-Mannschott wrote: >> On Wed, May 26, 2010 at 19:07, Paul Ebermann wrote: > >>> I think packing the revprops would (from a space viewpoint) be even more >>> useful than >>> packing the revisions, since most of these are quite small (similar) text >>> files. >>> >>> Statistics: [...] >> >> True, but revprops are mutable, while revs are immutable. > > OK, this explains it a bit. > >> That said, >> packing of revprops has been implemented on trunk and will be coming >> with 1.7. > > All my wishes come true :-) > >> You might consider the option using bdb instead of fsfs for your >> repository. I wouldn't normally prefer bdb, but it does create far >> fewer files than fsfs. > > I read[1] bdb does not work reliable on NFS, and my repository as on such a > file system > (and I can't really change this). > > Or does this reliability problem only concerns access from multiple hosts? > (I in all cases access the repository via svn+ssh with the same server host, > if this matters.) No, don't host a BDB on NFS. That just seems like trouble. // ben
svnadmin load to fsfs 9 times faster than bdb!?
I've never really been much of a user of the BDB back end. I started with Subversion in the 1.3 time-frame and have always used FSFS. Nevertheless, I recently was playing around with BDB and found that it takes a *lot* longer to load from a dump file than FSFS.

* repository of 41818 revisions
* dump file (--deltas) is 2.6 GB

I'm running Subversion on my (pretty ancient) home server:

* PowerPC G4 @ 1GHz (Titanium Powerbook)
* 2.5" 120GB Notebook hard drive
* 1GB RAM
* Subversion 1.6.0 (yeah, I know, ancient)

I loaded the dump file into FSFS and BDB repositories:

* FSFS took 5 hours and consumed 1.9 GB (1.8 GB after packing) of disk.
* BDB took 44 hours and consumed 2.5 GB of disk.

BDB does require far fewer "inodes", though I don't think that's relevant on HFS+, which stores everything in one giant B-Tree. BDB data isn't portable across architectures and BDB versions. It seems slower. It's trickier to back up. I'm sure there are operations where it performs faster than FSFS (svn log, for example). But, I guess I know now why I've always preferred FSFS.

Does anyone actually use the BDB Subversion backend?

// Ben
Re: svnadmin load to fsfs 9 times faster than bdb!?
On Mon, May 31, 2010 at 02:52, Mark Phippard wrote:
> You just need to specify --bdb-txn-nosync when running svnadmin
> create. although it is possible that svnadmin load also accepts this
> option. This should make the load times about the same.
>
> There is something you are supposed to change after loading the
> dumpfile to turn this back on.

Yup, by creating the repository with --bdb-txn-nosync the time to load the dump was reduced to about 9 hours (less than double the time of FSFS).

db/DB_CONFIG is the place to go to get the safer behavior back once the load has completed. This is the line in question:

    set_flags DB_TXN_NOSYNC

// Ben
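For anyone who would rather script that DB_CONFIG edit than do it by hand, here is a rough Python sketch. It rests on two assumptions that are not spelled out in the thread: that restoring the safer behavior means disabling the `set_flags DB_TXN_NOSYNC` line, and that `#` starts a comment in DB_CONFIG. The function name `disable_txn_nosync` is invented for this example, and it only transforms text; reading and writing the file is left to the caller.

```python
def disable_txn_nosync(db_config_text):
    """Return DB_CONFIG text with 'set_flags DB_TXN_NOSYNC' commented out.

    Assumption: commenting the line out is what turns syncing back on.
    Lines that are already commented are left untouched.
    """
    result = []
    for line in db_config_text.splitlines():
        if line.split() == ["set_flags", "DB_TXN_NOSYNC"]:
            result.append("# " + line)
        else:
            result.append(line)
    return "\n".join(result) + "\n"
```

Berkeley DB may only pick up DB_CONFIG changes when the environment is re-created, so check the comments in your own db/DB_CONFIG for what else (if anything) needs to be run afterwards.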
merge, --reintegrate, --record-only, near infinite regress in --use-merge-history
Bah!! (I'm just a wee bit frustrated as I write this, please forgive any roughness in tone.)

After waiting patiently for $JOB to finally move our server to Subversion 1.6, I'm finding Subversion's merge tracking to be rather less help than I'd hoped. (I guess git/hg/bzr have spoiled me during my two year wait.)

What's really, really, really biting me at this point is a modification of the --reintegrate workflow that I've seen described on this list a few times:

usual way:

- create topic branch from trunk
- svn merge trunk topic
  regularly during development to keep topic branch up-to-date with trunk
- svn merge --reintegrate topic trunk
  merges the topic back to trunk
- svn rm topic
  get rid of the topic branch (recreate if necessary), as the next sync merge would cause it to try to pull in its own reintegration, leading to much gnashing of teeth.

A few clever people suggested a way around the delete-and-recreate the branch problem. Just use a record-only merge to make the topic branch aware of the reintegration revision on trunk without actually enduring the resulting conflicts.

It sounded like a great idea when I read it. Very clever. Unfortunately, it's also *very* *broken*. This technique interacts badly with "svn log -g". After a few repetitions of merge, merge --record-only, merge --reintegrate, I'm finding the same revisions showing up over and over again in my trunk when using svn log -g.

    $ svn log -q | wc -l
    2299
    $ svn log -q -g | wc -l
    14167
    $ svn log -q -g | egrep -o -e "^r[0-9]+" | sort | uniq -c | sort -n | tail
    131 r42278
    135 r42171
    135 r42251
    135 r42252
    136 r42196
    136 r42205
    136 r42219
    136 r42223
    191 r42282 <-- these two revisions appear 191 times
    191 r42292 <-- in the log -g of trunk!

This has made svn log -g useless for me. ("Include merged revisions" in TortoiseSVN is similarly useless.)

This is unfortunate, because I had hoped that this feature would free me from having to painstakingly record which revisions were merged in the log message as I used to do in 1.4 days.

What's worse is that log -g uses historical information from the repository (not just HEAD), so it's impossible to undo this error by just deleting svn:mergeinfo properties and starting over. It appears only dump, remove svn:mergeinfo properties, load of the whole repository could fix things.

Is that my only way to correct this idiocy?

Does the Subversion Book include a big red warning label that I overlooked?

As a work-around, is there a way to limit the recursive depth of log -g?

// Ben
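The egrep/sort/uniq pipeline used for the duplicate count can also be written as a small Python sketch, which may be handy on machines without those tools. The names `count_revisions` and `repeated` are made up here, and the code assumes only that each log entry starts with a line of the form `rNNNN | ...`, the same thing the egrep pattern relies on.

```python
import re
from collections import Counter

# Same pattern as the egrep above: a revision number at the start of a line.
REV_LINE = re.compile(r"^r(\d+)\b")

def count_revisions(log_output):
    """Count how often each revision appears in 'svn log -q -g' output."""
    counts = Counter()
    for line in log_output.splitlines():
        match = REV_LINE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

def repeated(counts, threshold=2):
    """Revisions seen at least 'threshold' times, most frequent first."""
    hits = [(rev, n) for rev, n in counts.items() if n >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))
```

Feed it the captured output of `svn log -q -g` as a string; `repeated(count_revisions(text))` then lists the revisions that show up more than once.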
Re: merge, --reintegrate, --record-only, near infinite regress in --use-merge-history
On Wed, Jun 2, 2010 at 12:46, Stefan Sperling wrote: > On Wed, Jun 02, 2010 at 12:37:13PM +0200, Stefan Sperling wrote: >> We should figure out how log -g should behave in this case (the behaviour >> you're seeing clearly isn't desirable) and then fix it. >> Please file an issue. > > Oh, and if you can, please write a small script (attached to the issue) > or test case (patch for our test suite) that shows the problem, > by triggering a single revision to appear too many times in log -g output. > That would help people who would like to investigate get started. > > Thanks, > Stefan > http://subversion.tigris.org/issues/show_bug.cgi?id=3650
Re: Huge Problem
On Sun, Jun 6, 2010 at 11:31, Abius X wrote: > Hello > I'm unable to commit or update, since i've restored a previous revision of my > subversion repository on the server. > > What should I do? Describe "unable to commit or update" and "restored a previous revision of my subversion repository" more precisely. Are you seeing error messages? What are they? When you say "a previous revision", do you mean a backup? How old was it? Older than your working copy? Or have you checked out a fresh working copy from the restored repository? // ben
Working copy 'from the future' after restoring server from old backup (was "Huge Problem")
(I'm sending this Cc back to the list, since others may have input too.)

On Sun, Jun 6, 2010 at 19:36, Abius X wrote:
> Hi
> Sorry for brief descriptions,
> Yep there are different error messages that describe server revision is
> older than working copy.
>
> Yes I mean a backup, the server crashed and we restored a backup
> It was like 3 weeks older than the working copy (revision 270 vs 291)
>
> We have many developments done in working copy that are not in the
> repository, and we can't commit or update.

You'll need to check out new working copies in order to be able to interact with the repository, but as you say you have changes in your current working copies you don't want to lose. Here's what I would do:

First I need to define a few terms. I'm going to assume you're on Windows since most people are and most unix people are smart enough to replace \ with /. I'm also not going to assume familiarity with diff and patch. I'm also going to assume you haven't moved or renamed anything between r270 and r291.

C:\project-old
    This is your working copy (r291 + local changes) from before your repository died.

C:\project-new
    This is your working copy (r270) from the restored repository.

http://svn.example.com/repo/project/trunk
    This represents the URL your working copy was checked out from.

0. Make a backup copy of C:\project-old (You can never be too paranoid)

1. 'Export' a copy of C:\project-old. This will give you a copy free of the .svn administrative directories.

    svn export C:\project-old C:\project-old.export

2. Check out a new working copy of http://svn.example.com/repo/project/trunk

    svn co http://svn.example.com/repo/project/trunk C:\project-new

3. Copy the contents of C:\project-old.export into C:\project-new

There are various ways to do this. I would just use the Windows file manager (confusingly called "Windows Explorer"). Select everything inside C:\project-old.export and drop it into C:\project-new. Windows will offer to merge folders that exist on both sides. Let it do so.

4. Use svn status to see the local changes in C:\project-new. Convince yourself that they make sense.

    svn status C:\project-new

also helpful:

    svn diff C:\project-new

5. Think up a good commit comment to summarize the last 3 weeks (!) worth of work. Put it in this file: C:\BACKUPS_SHOULD_BE_AUTOMATED.txt.

6. Check in your local changes using your hopefully thoughtful commit message

    svn commit -F C:\BACKUPS_SHOULD_BE_AUTOMATED.txt C:\project-new

As solutions go, this isn't beautiful and doesn't cover all cases (see my assumptions above.) It should leave you with the changes between r270 and r291 of your working copy safely in your repository so work can continue. Once you are happy with the results, you can throw away C:\project-old.

// Ben
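If dragging folders around in a file manager feels too error-prone, step 3 (copying the exported tree over the fresh working copy) could be scripted instead. The following is only a sketch in Python (3.8 or later, for `dirs_exist_ok`); the function name `overlay_export` is invented for this example. Like the manual merge, it overwrites and adds files but never deletes anything from the target.

```python
import shutil
from pathlib import Path

def overlay_export(export_dir, working_copy):
    """Copy every file from export_dir over working_copy.

    Existing directories (including .svn) in working_copy are kept;
    files with the same relative path are overwritten.
    Returns the relative paths that were copied, for review.
    """
    export_dir = Path(export_dir)
    shutil.copytree(export_dir, working_copy, dirs_exist_ok=True)
    return sorted(
        p.relative_to(export_dir).as_posix()
        for p in export_dir.rglob("*")
        if p.is_file()
    )
```

Files that were deleted between the two revisions will still be present in the new working copy afterwards, so reviewing the result with svn status remains necessary.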
Re: Check out problem because of alleged problematic URL
On Sun, Jun 6, 2010 at 15:49, Hirschberg, Benyamin wrote: > Hi > > > > I’m stuck with an annoying problem. > > > > I have an SVN server set up on a LAN server. I’m accessing it with > http://ada-srp/kr/svn/trunk, it is working from browsers and windows svn > clients (both Tortoise and command line client 1.6.4). > > > > The problem is when I’m trying to do a checkout on the server itself > (ada-srp) with the very same command line command as on the windows machine, > I’m getting the following error: > > [benya...@ada-srp ~]$ svn checkout http://ada-srp/kr/svn/trunk/ kr_repos > > svn: URL 'http://ada-srp/kr/svn/trunk' is malformed or the scheme or host or > path is missing > > > > The version of the client is 1.6.5. > > > > This is not likely to be a networking problem, I can do wget with no problem > on http://ada-srp/kr/svn/trunk from the same shell. In case of this error I > don’t see any change in apache log. > > > > Can you tell me what the problem is? Have you seen this?: http://subversion.apache.org/faq.html#unrecognized-url-error Subversion uses a plugin system to allow access to repositories. Currently there are three of these plugins: ra_local allows access to a local repository, ra_neon or ra_serf which allow access to a repository via WebDAV, and ra_svn allows local or remote access via the svnserve server. When you attempt to perform an operation in Subversion, the program tries to dynamically load a plugin based on the URL scheme. A `file://' URL will try to load ra_local, and an `http://' URL will try to load ra_neon or ra_serf. The error you are seeing means that the dynamic linker/loader can't find the plugins to load. For `http://' access, this normally means that you have not linked Subversion to neon or serf when compiling it (check the configure script output and the config.log file for information about this). It also happens when you build Subversion with shared libraries, then attempt to run it without first running 'make install'. 
Another possible cause is that you ran make install, but the libraries were installed in a location that the dynamic linker/loader doesn't recognize. Under Linux, you can allow the linker/loader to find the libraries by adding the library directory to /etc/ld.so.conf and running ldconfig. If you don't wish to do this, or you don't have root access, you can also specify the library directory in the LD_LIBRARY_PATH environment variable. //Ben
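To make the FAQ's description a bit more concrete, the scheme-to-plugin lookup can be pictured roughly like this. It is a toy model in Python, not Subversion's actual loader code; the function name `plugin_for` is made up, and the handling of tunnel schemes such as svn+ssh is an addition that the FAQ excerpt above does not mention.

```python
from urllib.parse import urlsplit

def plugin_for(url):
    """Name the repository-access plugin a URL scheme would need.

    Illustration only; the real client loads shared libraries dynamically.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "file":
        return "ra_local"
    if scheme in ("http", "https"):
        return "ra_neon or ra_serf"
    # Assumption beyond the FAQ text: svn+ssh and similar tunnels use ra_svn.
    if scheme == "svn" or scheme.startswith("svn+"):
        return "ra_svn"
    raise ValueError("unrecognized URL scheme: %r" % scheme)
```

In this picture, the error discussed in the FAQ corresponds to the lookup succeeding but the named plugin not being built or not being found by the loader.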
Re: Working copy 'from the future' after restoring server from old backup (was "Huge Problem")
On Sun, Jun 6, 2010 at 21:43, Abius X wrote:
> Hi
> thanks for the quick response,
> Actually I'm on OS X Snow Leopard (10.6.3) and
> I'm using Eclipse and Subversive (or Subclipse whichever is bundled with
> Eclipse)
>
> I'm partially familiar with diff, but my project folder is quite huge!
>
> So I just create another Working Copy, apply the changes to it and commit?
>
> what if there are many folders added? I even have many branches added
> since then!
>
> I'd appreciate your help again
> Kind Regards

Does your working copy include branches?

But, you are on OS X, which helps things a bit. It's past my bedtime, however, which doesn't help things. I'll sketch what I would do:

0. make a backup copy of your old working copy before continuing

    cp -R project-old project-old.backup

1. check out a working copy from your restored repository. I'm going to assume it's of at least a whole project, so it contains branches, tags and a trunk.

    svn co http://svn.example.com/repo/project project-new

2. if there are branches (or tags) in project-old that don't exist in project-new, recreate those in project-new by (for example) copying from trunk:

    svn copy project-new/trunk project-new/branches/some-branch-name
    ...

3. check that stuff in. you'll at least have the right branches, even if they have the wrong content.

    svn commit -m "restored missing branches (with wrong content)" project-new

4. incorporate the changes from your old working copy into the new working copy.

    rsync -r --del -c --exclude=.svn project-old/ project-new
    ## the trailing slash in "project-old/" is significant!

5. svn status may reveal missing (!) files or unknown (?) files. We need to tell svn that the missing ones are deleted and the unknown ones are to be added. I'd do it like this:

    cd project-new
    svn status | egrep "^[!]" | cut -c9- | tr '\n' '\0' | xargs -0 svn rm
    svn status | egrep "^[?]" | cut -c9- | tr '\n' '\0' | xargs -0 svn add

6. examine the results with svn status and svn diff. svn commit when you are happy with the results.

// Ben
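The egrep/cut part of step 5 can be sketched in Python as well, in case you want to look at the two lists before handing them to svn rm and svn add. The function name `classify_status` is invented, and the default `path_column=8` mirrors the `cut -c9-` above; it is an assumption about the status layout of the client in use, so check it against your own svn status output.

```python
def classify_status(status_output, path_column=8):
    """Split 'svn status' output into missing (!) and unversioned (?) paths.

    path_column is the 0-based index where the path starts
    (cut -c9- in the shell version); treat it as an assumption.
    """
    missing, unversioned = [], []
    for line in status_output.splitlines():
        if len(line) <= path_column:
            continue
        path = line[path_column:]
        if line[0] == "!":
            missing.append(path)
        elif line[0] == "?":
            unversioned.append(path)
    return missing, unversioned
```

Because the paths are kept as whole strings, names containing spaces survive intact, which is the same reason the shell version goes through tr and xargs -0.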
Re: Repository shrinkage on conversion 1.5 -> 1.6?
On Wed, Jul 28, 2010 at 11:01, John Beranek wrote:
>
> I maintain a 76000 revision Subversion 1.5 repository that adds up to a
> 101GiB FSFS db.
>
> I'm looking to upgrade to 1.6, so did a dump and load cycle. The
> resulting dump file was 173GiB (I use --deltas). What surprised me is
> that the restored repository came out as 71GB - 30GiB smaller!
>
> Is this unusual? Does it suggest some data has gone missing, or what?
> Nothing I saw in the release notes suggest repository efficiencies of
> this order - I have _not_ packed the repository.

Hi John,

It looks like representation sharing works particularly well for your repository.

http://subversion.apache.org/docs/release-notes/1.6.html#rep-sharing

The 30% reduction you're seeing is a pretty nice space savings. Do you do a lot of branching? Do you have identical files committed separately to different parts of the repository?

// Ben
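The general idea behind representation sharing, storing identical content only once, can be illustrated with a toy content-addressed store. This Python sketch is not how FSFS implements it on disk; the class name `RepStore` and the choice of SHA-1 as the key are assumptions made for the example.

```python
import hashlib

class RepStore:
    """Toy store that keeps one copy of each distinct content blob."""

    def __init__(self):
        self._reps = {}          # checksum -> content
        self.logical_bytes = 0   # bytes committed, duplicates included

    def put(self, content):
        """Record a committed blob and return its checksum key."""
        key = hashlib.sha1(content).hexdigest()
        self.logical_bytes += len(content)
        self._reps.setdefault(key, content)  # identical content is stored once
        return key

    @property
    def stored_bytes(self):
        return sum(len(blob) for blob in self._reps.values())
```

Committing the same file under two different paths calls `put` twice but stores the bytes once, which is the kind of saving the two questions above are probing for.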
Re: problem with SVN filename encoding on MacOSX
On Mon, Aug 16, 2010 at 05:06, Albert Zeyer wrote: > Hi, > > I have some problems with filename encodings under MacOSX. For example, I am > seeing always status messages like this: > > a...@ip212 1057 (Integration) %svn status > ? Verbesserungsvorschläge_Applets.odt > ! Verbesserungsvorschläge_Applets.odt > > And now I also cannot delete such a file anymore because it says it cannot > find it. > > More details here: > http://superuser.com/questions/176243/problem-with-svn-filename-encoding-on-macosx > > What do I need to do? This is a known issue. http://subversion.tigris.org/issues/show_bug.cgi?id=2464 One poster on the comment thread to that issue suggested as follows: Additional comments from Julian Mehnle Thu Aug 6 07:40:30 -0700 2009: > There *is* a work-around: install the "unicode_path" variant > of the subversion MacPorts package: > > $ sudo port install subversion +unicode_path I haven't tried this myself. // ben
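For the curious, the reason the same name shows up once as unknown (?) and once as missing (!) is that a character like "ä" can be stored either composed or decomposed, and the two byte sequences differ. A short Python sketch shows the effect; the helper name `same_name` is made up, and which side uses which form is not something this example tries to establish.

```python
import unicodedata

def same_name(a, b):
    """Compare two file names after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "Verbesserungsvorschl\u00e4ge_Applets.odt"   # "ä" as one code point
decomposed = unicodedata.normalize("NFD", composed)     # "a" + combining diaeresis

differ_as_strings = composed != decomposed    # what a naive comparison sees
equal_after_nfc = same_name(composed, decomposed)
```

Both strings look identical on screen, but a byte-wise comparison treats them as two different files.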
Re: How to tell Subversion what is the encoding of a file?
On Sat, Aug 21, 2010 at 17:33, JWalker wrote:
> Hi,
>
> I use ViewVC for viewing SVN repositories. I have files (C source
> files) with comments written in cyrillic (Windows-1251). ViewVC
> display these comments as ? characters. I was told in another forum
> that I have to tell Subversion what the encoding of these text files
> is, so as Subversion to be able to tell ViewVC this information. So
> my question is: How to tell Subversion what the encoding is? May I
> have to set some property on the files? What property?

    svn propset svn:mime-type "text/plain;charset=windows-1251" SOMEFILE.c

// ben
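How a tool might make use of that property value can be sketched in a few lines of Python. This is not ViewVC's code, just an illustration using the standard library; the function name `charset_from_mime_type` is made up for the example.

```python
from email.message import Message

def charset_from_mime_type(value):
    """Extract the charset parameter from a MIME type string, if any."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()

prop = "text/plain;charset=windows-1251"
charset = charset_from_mime_type(prop)

# Hypothetical file content: a Cyrillic comment stored as Windows-1251 bytes.
raw = "Привет".encode("cp1251")
text = raw.decode(charset)
```

Without the charset parameter the function returns None, and a viewer has to guess, which is presumably where the ? characters come from.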
Re: character encoding of file content
On Mon, Aug 23, 2010 at 16:23, Mark Phippard wrote:
> On Mon, Aug 23, 2010 at 10:21 AM, Schroeder, Hartmut wrote:
> > I have a question concerning character encoding of file content.
> >
> > Let's say, we have two text files containing german umlauts and with
> > different file encodings, one file has a 8859-1 encoding and the other
> > UTF-8. Both files are committed to Subversion (on different machines).
> >
> > On a checkout both text files shall have the local encoding (UTF-8,
> > e.g.)
> >
> > Is it possible with Subversion?
>
> Subversion does not record the encoding of files, nor does it try to
> change or convert it. The only thing that Subversion does this for is
> the filename. It stores all path names in UTF-8 and locally they are
> determined by your filesystem and locale.

Well, there's svn:mime-type, but tools have to know to look for it. The only one I'm aware of doing so is Apache when one is browsing the contents of a repository via HTTP.

But, you won't find any automagical character set conversion in Subversion -- nor (IMHO) should you want to.

// ben