Hi All,

I've just joined this group. I've been using subversion for a few years now, with most of my day-to-day work done via TortoiseSvn. A few days ago I once again came across a requirement where I said "subversion is what I need here", only to once again hit the issue that starting a new project in subversion means losing all the file time-stamps. I don't want to restart arguments on that front (I see from googling and the archives that it is a VERY old discussion). I simply have some questions about my own chosen work-around to the problem.
It seemed to me that for most of my requirements I did not need extra features in subversion. All I really needed was some way to create the new repository so that it looked as if every file I imported had been committed at the time shown on the file in the original source. If I could get that, then I could use the "use-commit-times" option to keep things very close to the way I wanted them. [And I could keep using TortoiseSvn and I would be a happy man.]

That all led me to trying to create my own dump files. I ended up choosing powershell scripting because I wanted to learn about it and this seemed like an interesting project to try with it.

I have a working script now. Put simply, it is executed as:

    powershell .\Import-from-Source D:\SourceFolder D:\Temp\DumpFile.dat

It takes the entire contents of D:\SourceFolder and creates a subversion dump file in D:\Temp\DumpFile.dat. It replicates the structure inside D:\SourceFolder, so if you want a "trunk" folder etc. you have to have created them first.

Objects (the full tree) from D:\SourceFolder are first sorted by their last-write-time property, and I then create a revision entry for each date that appears (the revision resolution is adjustable in the script). Each file therefore appears to have been committed on the same date as the original source file, so checking out the files with the use-commit-times option gives them the same date as the original file (if not necessarily exactly the same time).

Yippee, it works. Now to some gritty details, which is why I am here.

Q1: If, in the dump file, I sometimes give a file the property svn:eol-style = native, but the file content has been copied directly into the dump file (i.e. it still contains CRLF end-of-lines), is that going to matter to svnadmin load? [Will the load process take care of things for me, or do I need to parse such files and convert them all to LF, which is what svn says it uses internally for "native" files?]

My experiments seemed to show that svnadmin dump also produced the CRLF end-of-lines, but it all gets quite confusing so I thought I would ask here. Since I mostly work under Windows it's probably not a big deal for me ... but I'd rather the script was correct in case it gets used by others who may have other requirements.

Q2: When writing the code to try to identify text versus binary files I decided to look at what subversion did ... but now I am confused. In libsvn_subr\io.c, function svn_io_detect_mimetype2, a comment says:

    going to examine the first block of data, and make sure that 85%
    of the bytes are such that their value is in the ranges 0x07-0x0D
    or 0x20-0x7F, and that 100% of those bytes is not 0x00.

but my reading of this code:

    if (((binary_count * 1000) / amt_read) > 850)
      {
        *mimetype = generic_binary;
        return SVN_NO_ERROR;
      }

suggests that it actually sets the type to binary only if it finds that more than 85% of the bytes are binary (in earlier code a file is forced to binary if any null byte is found). Can anyone explain this? Is it a bug, or am I missing something?

Q3: If there are already other scripts around that do this, then feel free to tell me that I have wasted my time. I could not find any similar solutions in my searching.

Q4: If there are any powershell people here who would like to review and test the code, I am quite happy to share it ... but I would not recommend it to a scripting novice until it has been checked over and tested by more than just me.

Q5: I found a description of the dump file in the source, but that description says "Properties are stored in the same human-readable hashdump format used by working copy property files". Any pointers to a description of that format? (Obviously I've gotten by just by visually checking dump files produced by svnadmin, but it would be good to know what I was doing. ;-)

Hmmm... big post for my first post. Hope that's okay.

--
Geoff Worboys
Telesis Computing
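
P.S. To make Q2 concrete, here is a rough Python sketch of the two behaviours as I read them. It is only an illustration, not the real libsvn_subr code: the function names are made up for this example, and the 1024-byte block size is an assumption on my part. One function applies the quoted condition literally; the other does what the comment seems to promise.

```python
# Illustrative sketch only; names and BLOCK_SIZE are assumptions,
# not taken from the subversion source.

BLOCK_SIZE = 1024  # assumed size of the "first block" that gets examined


def first_block(path: str) -> bytes:
    """Read the leading block of a file for inspection."""
    with open(path, "rb") as handle:
        return handle.read(BLOCK_SIZE)


def is_binary_byte(value: int) -> bool:
    """True if value lies outside the ranges 0x07-0x0D and 0x20-0x7F."""
    return not (0x07 <= value <= 0x0D or 0x20 <= value <= 0x7F)


def count_binary(block: bytes) -> int:
    """Count binary-looking bytes; any NUL byte forces the maximum count."""
    if 0 in block:
        return len(block)
    return sum(1 for value in block if is_binary_byte(value))


def looks_binary_as_coded(block: bytes) -> bool:
    """The quoted condition: binary only if more than 85% look binary."""
    if not block:
        return False
    return (count_binary(block) * 1000) // len(block) > 850


def looks_binary_as_commented(block: bytes) -> bool:
    """The comment's rule: binary unless at least 85% look like text."""
    if not block:
        return False
    text_count = len(block) - count_binary(block)
    return (text_count * 1000) // len(block) < 850


if __name__ == "__main__":
    # Half text, half control bytes, no NUL: the two readings disagree.
    sample = b"abcd" + bytes([0x01, 0x02, 0x03, 0x04])
    print("as coded:    ", looks_binary_as_coded(sample))
    print("as commented:", looks_binary_as_commented(sample))
```

With a block that is half text and half control bytes (and no NUL), the literal condition calls it text while the comment's rule calls it binary, which is exactly the discrepancy I am asking about.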