On Aug 18, 2010, at 11:24 AM, Ryan Schmidt wrote:

> 
> On Aug 18, 2010, at 07:47, Lezz Giles wrote:
> 
>> Imagine the following scenario:
>> 
>> - trunk has several large files (> 20M) which are updated regularly.  These 
>> files are +only+ changed on trunk.
>> - there are several branches, each of which updates from trunk at least once 
>> a week.
>> 
>> The merge of the large files from trunk takes an excessive amount of time 
>> and creates new very large versions of files
>> that previously took up effectively no space at all, since they were cheap 
>> copies.
> 
> Are you sure these are now taking up a lot of space? Are you using the latest 
> Subversion and is your repository thoroughly upgraded? I thought the new "rep 
> sharing" feature was supposed to make this a non-issue now. But I have not 
> actually used it myself so maybe I misunderstand its purpose.

We aren't running the latest SVN on our server, though we will be moving to an 
upgraded system in the next day or so.  With this feature, it looks like disk 
space won't be an issue moving forward.
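For reference, as I understand it the rep-sharing feature Ryan mentions is
controlled per repository in db/fsfs.conf (FSFS repositories, 1.6 and
later); the relevant section looks something like this:

  [rep-sharing]
  enable-rep-sharing = true

Since it only applies to data written while it is in effect, I'm assuming
the existing large files won't be shared until the repository is dumped and
reloaded as part of the move.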

However, there is the secondary question of performance.  Running 1.6.5,
creating a brand-new repository with a 20MB text file on trunk, then
creating a branch, committing a new version of the file on trunk, and
merging out to the branch takes a very long time - I gave up waiting for it
after around 30 minutes on a six-month-old MacBook Pro, with top showing
svn using around 100% of a CPU.  And this is for a file where a quick
review of the histories would show that the merge could be done as a
simple copy.
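
For reference, here is roughly what I was doing; the repository path and
file name are just placeholders for the test, and the file contents are
random text rather than our real data:

  REPO=file:///tmp/bigmerge-repo              # throw-away test repository
  svnadmin create /tmp/bigmerge-repo
  svn mkdir -m "initial layout" $REPO/trunk $REPO/branches
  svn checkout $REPO/trunk /tmp/trunk-wc && cd /tmp/trunk-wc
  head -c 15000000 /dev/urandom | base64 > big.txt   # roughly 20MB of text
  svn add big.txt
  svn commit -m "large file on trunk"
  svn copy -m "create branch" $REPO/trunk $REPO/branches/b1
  head -c 15000000 /dev/urandom | base64 > big.txt   # new version on trunk
  svn commit -m "new version of the large file on trunk"
  svn checkout $REPO/branches/b1 /tmp/b1-wc && cd /tmp/b1-wc
  svn merge $REPO/trunk .                     # this is the step that pegs the CPU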

So my question remains - is it worthwhile analyzing the history of the
files involved in a merge and doing a plain copy where the analysis shows
that a copy is all that is required?  Are there pitfalls I'm going to run
into?  I think the only difficult step is going to be determining that the
history of one file is a prefix of the complete history of the other file
- in git terms, that fast-forwarding is possible.
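
For what it's worth, the check I have in mind is roughly the following (the
URL is a placeholder, and I'm treating any branch-local commit to the file,
including earlier sync merges, as ruling out the shortcut):

  BRANCH_FILE=file:///tmp/bigmerge-repo/branches/b1/big.txt  # placeholder
  # --stop-on-copy stops at the branch point, so a single log entry means
  # the only event in the file's branch history is the copy itself, i.e.
  # its history is a pure prefix of trunk's and a plain copy would do.
  if [ "$(svn log -q --stop-on-copy "$BRANCH_FILE" | grep -c '^r')" -eq 1 ]
  then
      echo "fast-forward possible: copy the trunk version over"
  else
      echo "file has branch-local changes: a real merge is needed"
  fi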

Lezz
