On Fri, Dec 3, 2010 at 10:44 AM, Jim Jenkins <j...@homrichberg.com> wrote:
> I’m planning to use Hooks to add OCR scanning for select documents going into
> a SVN repo. I’m not really sure where to start so I’m hoping someone here can tell
> me if it’s possible and even suggest how best to proceed.

I'm going to take a slightly different approach. Pre-commit hooks are not what you want.

1. A pre-commit hook should only be used if the developer has some way of fixing the issue it reports. A good pre-commit hook is one that makes sure all files that end in *.sh have the property svn:eol-style set to "LF". If a developer doesn't set this and the pre-commit hook fails, the developer can easily fix the problem and recommit the file.

2. The user is left twiddling their thumbs while a hook runs, even a post-commit hook. If you have a hook that takes a few minutes to run, users will get impatient. They may simply not bother committing changes when they should, and instead save everything up for one big horking commit which they'll do at the end of the day before they leave.

3. Changing committed files during a commit is very difficult. You, after all, don't have access to the client's workspace, so you'll have to emulate their checkout in order to make your changes and do a commit of your own. Of course, that means your pre-commit hook will fire off once more, so you'll need some mechanism in place that tells your pre-commit hook not to do whatever it was supposed to do in the first place.

4. Also, it's a bad idea to change a commit on a user. As Ulrich Eckhardt pointed out, your user's client doesn't know that the files they just committed were changed. Besides, what if your pre-commit hook introduced an error as a side effect? I once wrote a pre-commit hook in ClearCase to automatically expand RCS keywords. On occasion, the pre-commit hook expanded something inside a sprintf statement or the like, and the developer was furious because their program had worked, and I botched it up.

I would instead think of your committed files as "source" code and your OCR scans as "compiled" code.
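
For what it's worth, the kind of check described in point 1 could look roughly like the sketch below. It is only an illustration: the function names are made up for this example, and it assumes Python 3 and the svnlook command are available on the server. It relies on `svnlook changed` and `svnlook propget` run against the pending transaction, and it only inspects the commit. It never modifies it.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook: added or changed *.sh files need svn:eol-style=LF."""
import subprocess
import sys


def changed_shell_scripts(changed_output):
    """Return paths of added/modified *.sh files from `svnlook changed` output."""
    paths = []
    for line in changed_output.splitlines():
        # The first columns hold the status flags; the path starts at column 5.
        status, path = line[:4].strip(), line[4:]
        if status.startswith("D") or not path.endswith(".sh"):
            continue  # deleted paths and non-shell files are not checked
        paths.append(path)
    return paths


def find_violations(paths, get_eol_style):
    """Return the paths whose svn:eol-style is not exactly 'LF'."""
    return [p for p in paths if get_eol_style(p) != "LF"]


def svnlook(repos, txn, subcommand, *args):
    """Run svnlook against the pending transaction; return (exit code, stdout)."""
    cmd = ["svnlook", subcommand, "-t", txn, repos, *args]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout


def main(argv):
    repos, txn = argv[1], argv[2]  # Subversion passes these to pre-commit
    _, changed = svnlook(repos, txn, "changed")

    def get_eol_style(path):
        code, out = svnlook(repos, txn, "propget", "svn:eol-style", path)
        return out.strip() if code == 0 else None  # non-zero: property not set

    bad = find_violations(changed_shell_scripts(changed), get_eol_style)
    for path in bad:
        sys.stderr.write(f"{path}: set svn:eol-style to LF and recommit\n")
    return 1 if bad else 0  # a non-zero exit rejects the commit


# Only act when invoked as a hook, i.e. with the REPOS and TXN arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

The point of the example is that the hook only says "no" and tells the developer what to fix. The message written to stderr is what the user sees when the commit is rejected.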

What you probably want, although you don't really compile anything, is a continuous build server that takes the committed files, creates the needed OCR scans of these files, and stores them where they can be referenced. The storage area does not have to be Subversion (and in fact, I would argue that Subversion is not your ideal storage area).

Take a look at Hudson. It's a powerful continuous build server and is very flexible in its setup. With Hudson, you could automatically do the scans after a commit, and then email the user if the scan failed for some reason. It is possible to have Hudson scan only the files that were changed (since Hudson knows which files were committed). And it is possible to have Hudson FTP or otherwise store the changed OCR files onto another server (or to simply keep the scanned archive on Hudson itself).

It'll take a bit of tweaking, but so would trying this in Subversion. And you and your users would be much happier with this arrangement.

--
David Weintraub
qazw...@gmail.com