On Sun, Feb 14, 2010 at 2:34 AM, Glenn Maynard <gl...@zewt.org> wrote:
> Based on looking through [1] some more, it looks like "cp -a wc1 wc2"
> and renaming working copies should work fine, since the database is
> inside the working copy, and will just get copied along with the rest.

In SVN 1.7 there will be a single .svn folder at the root of a working copy. Beyond 1.7 there are plans to make this configurable so that you could have it in ~/.subversion and shared across all your working copies. The default, of course, will remain the same as in 1.7.

> Hopefully there'll still be a way to slice out a piece of a repository
> ("mv wc1/trunk .; rm -rf wc1"), which wouldn't work if it's dependent
> on a global db at the top.

There has been talk of adding an svn detach command to do this. Not sure if it will be done as part of 1.7. AFAIK, the plan is to add it later.

> I have a few gigs of ~5 meg files in Subversion, and the idea of
> storing large blocks of data in SQLite is a bit scary; I don't think
> it's designed for blobs that size. Anything that lumps files together
> like this is effectively subjected to two layers of fragmentation
> instead of one (filesystem + db).

There has never been any plan or discussion to store the pristine files in SQLite. As you point out, it is not well suited for that and would work poorly.

SQLite is being used to store the SVN metadata and properties, which are arguably just stored in a custom DB today. Once the WC data is centralized, the current code, which has to read all the metadata, parse it, and write it back out, would be less efficient than using a database and just updating or inserting rows as needed. Plus we get some benefits from being able to use SQL indexes.

The storage format for the pristine files will still be files, but it is being changed to be based on the SHA-1 hash of each file. I'd imagine the structure will be sharded based on the first two characters of the hash.
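To make the layout concrete, here is a minimal Python sketch of a store keyed by SHA-1 and sharded on the first two hex characters of the hash. The function names `pristine_path` and `store_pristine`, and the exact directory layout, are assumptions made for illustration; this is not Subversion's code or its actual on-disk format.

```python
import hashlib
import os
import tempfile


def pristine_path(store_root, sha1_hex):
    """Hypothetical layout: <store_root>/<first two hex chars>/<full hash>."""
    return os.path.join(store_root, sha1_hex[:2], sha1_hex)


def store_pristine(store_root, data):
    """Write data under its SHA-1 and return the hex digest.

    If a file with that hash is already present, nothing is written.
    """
    sha1_hex = hashlib.sha1(data).hexdigest()
    path = pristine_path(store_root, sha1_hex)
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    return sha1_hex


if __name__ == "__main__":
    root = tempfile.mkdtemp()  # throwaway directory standing in for the store
    digest = store_pristine(root, b"contents of some versioned file\n")
    print(pristine_path(root, digest))
```

Because the path depends only on the file's contents, storing the same bytes a second time resolves to the same path and leaves the store unchanged.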
This will bring several benefits:

1) On case-insensitive file systems like Windows and OS X it will allow files to be renamed only by case. Today that fails because of the way the pristine copy is stored. Once the pristine copy is stored under its SHA-1, it will not matter.

2) Space savings. When you have files in a working copy with the same hash, there will only be a single pristine copy stored. This will likely be a minor benefit in 1.7, but imagine when you can have all your working copies centralized in a single location. If you have multiple copies of trunk checked out, or even multiple branches, it is likely there would be a lot of sharing of pristine copies, which would save a significant amount of disk space.

3) Performance. This will be a future benefit. But again, imagine you have a single centralized working copy area. When you do a checkout, we can enhance the client/server protocol so that when the server returns the list of items for the client to fetch, it also includes the SHA-1 of each. Now the client can be made smart enough to fetch only the items it does not already have. So imagine you have trunk checked out and you want to check out a branch. Maybe 90% of the files would already be on your disk, and the client could just fetch the other 10% and construct the working copy from what it already has available.

--
Thanks

Mark Phippard
http://markphip.blogspot.com/