Good suggestion, Daniel. While this does markedly improve performance, it
does so at the expense of changing the underlying protocol.
Unfortunately, I'm not at liberty to change the protocol: my customers
define it, not me. So my "program" needs to access their repos using
their protocols.
But here are the results:
ssh port forwarding to an active svnserve takes about 2.5s.
pure svnserve takes roughly 2s.
The setup I used:
    svnserve -d --listen-port 8000
    ssh epe...@localhost -L 3690:localhost:8000
    ...then run my svn update commands...
--eric
On 07/06/2010 12:52 PM, Daniel Shahaf wrote:
Have you tried using SSH port forwarding instead of svn+ssh://?
Daniel
(perhaps one of the other devs will address the points you made; I'm
not familiar with that part of the code myself)
Eric Peers wrote on Tue, 6 Jul 2010 at 21:17 -0000:
Howdy,
I've got a program that needs to check out specific files at specific versions.
In this particular case a branch does not make sense. I have found that the
performance of svn+ssh in this case is very bad.
I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10
Overall I have about 100 such files and 2 svn update calls. I've accomplished
this with an xargs frontend to svn so as not to overrun the command line.
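As an illustration of the batching described above, here is a minimal sketch of an xargs-style frontend in Python. It is not the actual frontend from this message: the function name `build_update_commands` and the `MAX_CMD_CHARS` budget are hypothetical, and the real command-line limit depends on the operating system.

```python
from collections import defaultdict

# Hypothetical budget; the real limit depends on the OS (see `getconf ARG_MAX`).
MAX_CMD_CHARS = 4096


def build_update_commands(targets, max_chars=MAX_CMD_CHARS):
    """Group (path, revision) pairs by revision and split each group into
    `svn update` argument lists that stay under max_chars characters."""
    by_rev = defaultdict(list)
    for path, rev in targets:
        by_rev[rev].append(path)

    commands = []
    for rev in sorted(by_rev):
        prefix = ["svn", "update", "-r", str(rev)]
        current = list(prefix)
        length = sum(len(arg) + 1 for arg in current)
        for path in by_rev[rev]:
            cost = len(path) + 1  # the path plus a separating space
            # Start a new command when the next path would exceed the budget,
            # but never emit a command that has no paths at all.
            if len(current) > len(prefix) and length + cost > max_chars:
                commands.append(current)
                current = list(prefix)
                length = sum(len(arg) + 1 for arg in current)
            current.append(path)
            length += cost
        commands.append(current)
    return commands


if __name__ == "__main__":
    demo = [("file1", 2), ("file2", 2), ("file6", 3)]
    for cmd in build_update_commands(demo):
        # To actually run a command, pass the list to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

Each returned list is one `svn update` invocation, so the number of separate svn processes stays at one per revision unless the path list is too long for a single command line.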
If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run an svn update -r 3 with no file arguments, it takes about 2s.
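A rough back-of-the-envelope reading of these timings, as a sketch only. It assumes exactly 100 files (the message says "about 100") and that the file:/// run approximates the local working-copy cost, so the difference is attributed to the svn+ssh transport.

```python
# Timings quoted in the message, in seconds.
FILE_SECONDS = 3.0
SVN_SSH_SECONDS = 53.0
N_FILES = 100  # assumption: "about 100" treated as exactly 100


def per_file_overhead(slow_total, fast_total, n_files):
    """Extra seconds per file attributable to the slower transport."""
    return (slow_total - fast_total) / n_files


overhead = per_file_overhead(SVN_SSH_SECONDS, FILE_SECONDS, N_FILES)
print(f"extra cost per file over svn+ssh: {overhead:.2f}s")
```

Under those assumptions the extra cost works out to roughly half a second per file, which is why a fixed per-file cost is a natural thing to suspect.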
I wrote a program directly against the svn API that accepts the file lists,
authenticates a single time, and then calls svn_client_update3. This is still
very slow: around 53s.
I suspect the problem is that each individual file is called out, locked,
etc. Is there a way to batch these locks together or otherwise improve
performance, for example by causing the ssh channel/RA session to be reused?
Perusing the source code suggests that svn_client__update_internal will be
called for each element in my paths. Since an individual file lock/svn
directory write does not seem to be particularly costly, I suspect the
problem is in the svn_client__open_ra_session_internal + svn_ra_do_update2
calls made from svn_client__update_internal. Is the subversion code opening a
new ra_session for each of these files, at the cost of a fresh ssh+svnserve on
the remote end? Is there a way to force a single RA session across all the
files at an API level without writing my own svn_client__update_internal?
thoughts here?
thanks!
--eric