Geoff Worboys wrote on Tue, 22 Jun 2010 at 17:36 -0000:
> powershell .\Import-from-Source D:\SourceFolder D:\Temp\DumpFile.dat
>
> It takes the entire contents of D:\SourceFolder and creates
> a subversion dump file in D:\Temp\DumpFile.dat. It replicates
> the structure inside D:\SourceFolder so if you want a "trunk"
> folder etc you have to have created them first.
>
> Objects (the full tree) from D:\SourceFolder are first sorted
> by their last-write-time property and I then create a revision
> entry for each date that appears (the revision resolution is
> adjustable in the script). This makes it so that each file
> ends up appearing to have been committed on the same date that
> it had on the original source file, so checking out the files
> with the use-commit-times option gives them same date as the
> original file (if not, necessarily, exactly the same time).
>
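
For anyone following along, the ordering described in the quoted text can be sketched roughly as below. This is a Python illustration only, not the PowerShell script itself: the function name group_into_revisions, the (path, mtime) input shape, and the one-day default resolution are all assumptions made for the example.

```python
from itertools import groupby


def group_into_revisions(entries, resolution_seconds=86400):
    """Group (path, mtime) pairs into revisions, oldest first.

    entries: iterable of (path, mtime), mtime in seconds since the epoch.
    resolution_seconds: hypothetical "revision resolution"; one day here.
    Returns a list of (bucket_start, [paths]) tuples.
    """
    def bucket(entry):
        # Round the timestamp down to the start of its resolution bucket.
        return int(entry[1] // resolution_seconds) * resolution_seconds

    # Sort by timestamp (path breaks ties) so buckets come out contiguous
    # and the resulting revision dates are globally ordered.
    ordered = sorted(entries, key=lambda e: (e[1], e[0]))
    return [(start, [path for path, _ in group])
            for start, group in groupby(ordered, key=bucket)]


if __name__ == "__main__":
    sample = [("trunk/b.txt", 90000.0),
              ("trunk/a.txt", 100.0),
              ("trunk/c.txt", 86399.0)]
    for start, paths in group_into_revisions(sample):
        print(start, paths)
```

Each returned tuple would correspond to one revision record in the dump file, with the bucket start (or any timestamp inside the bucket) used for that revision's svn:date.
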
i.e., you import the files in order of their timestamps, so that the
svn:date values remain globally sorted? Nice!

> Q1: If, in the dump file, I sometimes give a file a property
> svn:eol-style = native, but the file itself has been copied
> directly into the dump file (ie. contains CRLF end-of-lines)
> is that going to matter to svnadmin load?
>
> [Will the load process take care of things for me or do I
> need to parse such files and make them all LF - which is what
> svn says it uses internally for "native" files? ]
>
> My experiments seemed to show that svnadmin dump also produced
> the the CRLF end-of-lines but it all gets quite confusing so
> thought I would ask here.
>

i.e., 'svnadmin dump' produces CRLF for svn:eol-style=native files?

That surprises me; I'd expect such files to be written with LF in dump
files. (My testing agrees with my expectation.) Can you double-check?

In any case, it probably *should* use LF, since dumpfiles are supposed
to be a portable binary format.

> Since I mostly work under Windows it's probably not a big deal
> for me ... but I'd rather the script was correct in case it
> gets used by others that may have other requirements.
>
>
> Q2: When writing the code to try and identify text versus
> binary files I decided to look at what subversion did ... but
> now I am confused. In libsvn_subr\io.c function
> svn_io_detect_mimetype2 a comment says:
>   going to examine the first block of data, and make sure that 85%
>   of the bytes are such that their value is in the ranges 0x07-0x0D
>   or 0x20-0x7F, and that 100% of those bytes is not 0x00.
> but my reading of this code
>   if (((binary_count * 1000) / amt_read) > 850)
>     {
>       *mimetype = generic_binary;
>       return SVN_NO_ERROR;
>     }
> suggests that it is actually setting the type to binary only
> if it finds more than 85% are binary bytes (in earlier code a
> file binary if forced if any null byte is found).
>
> Can anyone explain this? A bug or am I missing something?
>

What's the question?
Are you saying the code/comment disagree?

> Q5: I found a description of the dump file in the source but
> that description says "Properties are stored in the same
> human-readable hashdump format used by working copy property
> files," Any pointers to a description for that?
>

You're quoting
<http://svn.apache.org/repos/asf/subversion/trunk/notes/dump-load-format.txt>.

Internally the function it uses is svn_hash_write2(), and there's a
small documentation comment at the top of hash.c. But, as you say,

> (Obviously I've gotten by just by visually checking dump files
> produced by svnadmin, but it would be good to know what I was
> doing. ;-)
>

the format isn't hard to reverse-engineer, right?

>
> Hmmm... big post for my first post. Hope that's okay.
>
>

Yeah. For next time, you could consider adding a one-paragraph summary
at the top, and/or make it clear what kind of responses you're looking
for (e.g., "Hey, I'm looking for people to try my script", or "Hey, I'm
looking for answers to questions I ran into developing a script", or ...)
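
P.S. On Q5, in case a concrete picture of the hashdump format helps: as far as I can tell from dump files written by svnadmin, each property is a "K <length>" line, the key, a "V <length>" line, and the value, with lengths counted in bytes, and the block in a dump file ends with a PROPS-END line. Below is a rough Python sketch of a writer. The name write_props is hypothetical (it is not a Subversion API), and sorting the keys is my own choice for reproducible output, not something I'm claiming svn_hash_write2() does.

```python
def write_props(props, terminator=b"PROPS-END"):
    """Serialize a dict of properties in the K/V hashdump layout.

    props: dict mapping str keys to str or bytes values.
    terminator: the line that closes the block (PROPS-END in dump files).
    Returns the serialized block as bytes.
    """
    out = bytearray()
    for key in sorted(props):  # sorted only to make the output deterministic
        k = key.encode("utf-8")
        v = props[key]
        if isinstance(v, str):
            v = v.encode("utf-8")
        # Lengths are byte counts, not character counts.
        out += b"K %d\n%s\nV %d\n%s\n" % (len(k), k, len(v), v)
    out += terminator + b"\n"
    return bytes(out)


if __name__ == "__main__":
    block = write_props({"svn:eol-style": "native"})
    print(block.decode("utf-8"), end="")
    # The byte length of the whole block is what I'd expect to go in the
    # node's Prop-content-length header.
    print(len(block))
```

Comparing the output of something like this against a node record from a real 'svnadmin dump' is probably the quickest sanity check.
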