On Aug 18, 2010, at 11:24 AM, Ryan Schmidt wrote:
>
> On Aug 18, 2010, at 07:47, Lezz Giles wrote:
>
>> Imagine the following scenario:
>>
>> - trunk has several large files (> 20M) which are updated regularly. These
>> files are +only+ changed on trunk.
>> - there are several branches, each of which updates from trunk at least once
>> a week.
>>
>> The merge of the large files from trunk takes an excessive amount of time
>> and creates new very large versions of files
>> that previously took up effectively no space at all, since they were cheap
>> copies.
>
> Are you sure these are now taking up a lot of space? Are you using the latest
> Subversion and is your repository thoroughly upgraded? I thought the new "rep
> sharing" feature was supposed to make this a non-issue now. But I have not
> actually used it myself so maybe I misunderstand its purpose.
We aren't running the latest SVN on our server, though we will be moving to an upgraded system in the next day or so. With rep sharing, it looks like disk space won't be an issue going forward. However, there is the secondary question of performance.

Running 1.6.5, creating a brand-new repository with a 20MB text file on trunk, then creating a branch, committing a new version of the file on trunk, and merging out to the branch takes a very long time: I gave up waiting after around 30 minutes on a six-month-old MacBook Pro, with top showing svn using around 100% of one CPU. And this is for files where a quick review of the histories shows that the merge could be done as a simple copy.

So my question remains: is it worthwhile to analyze the history of the files involved in a merge and to do a plain copy when the analysis shows that a copy is all that is required? Are there pitfalls I'm going to run into? I think the only difficult step is determining that the history of one file is a prefix of the complete history of the other file - in git terms, that fast-forwarding is possible.

Lezz
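
P.S. For concreteness, this is roughly the kind of sequence I'm describing. The repository path, file names, branch name, and the random-data trick for making a ~20MB text file are just placeholders:

    # throwaway repository with the usual layout
    svnadmin create /tmp/bigrepo
    svn mkdir -m "layout" file:///tmp/bigrepo/trunk file:///tmp/bigrepo/branches

    # commit a ~20MB text file on trunk (any large text file will do)
    svn checkout file:///tmp/bigrepo/trunk /tmp/trunk-wc
    base64 < /dev/urandom | head -c 20000000 > /tmp/trunk-wc/big.txt
    svn add /tmp/trunk-wc/big.txt
    svn commit -m "add large file" /tmp/trunk-wc

    # branch (a cheap copy), then change the file on trunk only
    svn copy -m "create branch" file:///tmp/bigrepo/trunk file:///tmp/bigrepo/branches/b1
    base64 < /dev/urandom | head -c 20000000 > /tmp/trunk-wc/big.txt
    svn commit -m "new version of large file" /tmp/trunk-wc

    # merge trunk out to the branch - this is the step that crawls
    svn checkout file:///tmp/bigrepo/branches/b1 /tmp/b1-wc
    cd /tmp/b1-wc
    svn merge file:///tmp/bigrepo/trunk
    svn commit -m "sync merge from trunk"

And something like the following is a rough version of the "prefix of the history" check I have in mind: if the only revision reported is the copy that created the branch, the branch's copy of the file hasn't diverged from trunk, so the merge result for that file is just trunk's latest version.

    svn log --stop-on-copy file:///tmp/bigrepo/branches/b1/big.txt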