> -----Original Message-----
> From: Johan Corveleyn [mailto:jcor...@gmail.com]
> Sent: Tuesday 31 March 2015 14:13
> To: users@subversion.apache.org
> Cc: Bert Huijben; Philip Martin; Ben Reser
> Subject: Re: Branching slow 1.8.11 https
>
> On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn <jcor...@gmail.com> wrote:
> > On Sun, Mar 29, 2015 at 7:57 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
> >> On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben <b...@qqmail.nl> wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Johan Corveleyn [mailto:jcor...@gmail.com]
> >>>> Sent: Friday 27 March 2015 22:03
> >>>> To: users@subversion.apache.org
> >>>> Subject: Branching slow 1.8.11 https
> >>>>
> >>>> Does the following ring a bell for someone?
> >>>>
> >>>> Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
> >>>> 1.8.11 (CollabNet package). Some time after that, we discovered that
> >>>> branching was very slow. I'm talking about pure server-side branching
> >>>> ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
> >>>> client (tried both from same machine as the server, and from another
> >>>> machine on the LAN (100 Mbit)).
> >>>>
> >>>> - Branching trunk (containing many directories and files): 6-8 minutes
> >>>> - Branching a subfolder of trunk: 20-30 seconds (still very slow)
> >>>> - Branching a single file is fast (< 0.5s or so).
> >>>>
> >>>> So it seems the performance degrades depending on the depth or size of
> >>>> the tree.
> >>>>
> >>>> Now, it gets more interesting:
> >>>> - The resulting rev file on the server is always very small (as it
> >>>> should be, it contains only a lightweight 'copy' of the trunk node).
> >>>> - Our repos is currently served via https (Apache 2.2.29).
> >>>> - Branching with file:/// urls is fast (branching trunk takes 0.6s).
> >>>> - When starting an svnserve instance serving the same repository, and
> >>>> branching with svn:// urls, it's fast as well (also 0.6s).
> >>>> - We reproduced it on a copy of the production repo.
> >>>> - Experimenting with the test copy, we found that
> >>>> $repos/dav/activities.d contains ~2000 files. When we clear that
> >>>> directory, the branching times go down by more than half (~2 minutes
> >>>> for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
> >>>> definitely has an impact).
> >>>> - With a 1.7 client connecting with neon, the problem is the same.
> >>>> - During the 'svn copy', an httpd child consumes a lot of cpu (around
> >>>> half a core).
> >>>> - There is no authz configured for this repo (SVNPathAuthz off).
> >>>> - Backend is still in 1.5 format (we have not run svnadmin upgrade
> >>>> yet, a dump+load is planned in a couple of weeks).
> >>>>
> >>>> So it seems clearly mod_dav_svn related (and not for instance related
> >>>> to the FSFS backend).
> >>>>
> >>>> I don't think we have anything special in our httpd config:
> >>>> [[[
> >>>> <Location /test_svn>
> >>>>   SVNInMemoryCacheSize 131072
> >>>>   SVNCacheFullTexts on
> >>>>   SVNCacheTextDeltas on
> >>>>   SSLRequireSSL
> >>>>   AuthName "TEST Subversion Repository"
> >>>>   AuthType Basic
> >>>>   AuthBasicProvider ldap
> >>>>   AuthBasicAuthoritative off
> >>>>   AuthLDAPURL "ldap://redacted:389"
> >>>>   AuthLDAPBindDN "redacted"
> >>>>   AuthLDAPBindPassword redacted
> >>>>   Require ldap-group redacted
> >>>>   DAV svn
> >>>>   SVNPath /path/to/test_repos
> >>>>   SVNPathAuthz off
> >>>> </Location>
> >>>> ]]]
> >>>>
> >>>> Any ideas?
> >>>> Why the cpu usage by the server, what's it doing?
> >>>> What is the dav/activities.d directory for? How come it contains so
> >>>> many files? Is it ok to purge the old files from that directory?
> >>>
> >>> Httpd's mod_dav was updated in some recent version to do a full lock
> >>> traversal on copies and moves. I think we already applied some
> >>> optimizations, but the real fix would be that mod_dav shouldn't do this
> >>> work (which our repos layer already does).
> >>>
> >>> I'm not sure which release we applied the first set of optimizations.
> >>>
> >>
> >> Thanks for refreshing my memory.
> >>
> >> So the problem is known as issue #4531 (server-side copy (over dav)
> >> uses too much memory) [1]. The memory usage issue has been fixed in
> >> SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
> >> (copy is no longer O(1), but depends on the size of the tree being
> >> copied). That's a direct violation of one of Subversion's "old selling
> >> points" vs. CVS: that branching / tagging is O(1). Branching / tagging
> >> taking several minutes brings back "fond memories" from CVS' days.
> >>
> >> As Philip pointed out in his last comment on #4531 [2]: "This issue is
> >> related to a change in mod_dav in 2.2.25 to fix PR54610 which
> >> added a walk over the copy source looking for lock tokens." (also
> >> released in 2.4.5; so both httpd 2.2.25+ and 2.4.5+ are affected --
> >> older httpd's won't have this problem I guess).
> >>
> >> Again quoting Philip: "Apache knows in advance that the walk is
> >> redundant in cases such as Subversion's URL-to-URL copy but Subversion
> >> cannot avoid the read access. We should attempt to fix mod_dav to
> >> avoid the walk where possible."
> >>
> >> So my hope rests with Philip and others who might have the necessary
> >> knowledge to fix this in mod_dav. It's really not acceptable that
> >> branching / tagging (or I'm guessing also: moving a large tree with a
> >> server-side move) takes several minutes.
> >>
> >> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
> >> [2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12
> >
> > I think I've found a workaround: it seems the tree walk by mod_dav is
> > avoided when the request has a header Depth with value 0. I've tried
> > adding
> >
> > <If "%{REQUEST_METHOD} == 'COPY'">
> >   RequestHeader set Depth 0
> > </If>
> >
> > to the Location block of SVN, and the copy is fast again!
> > And the good thing is: it's still a fully recursive copy :-) (otherwise
> > it wouldn't be much of a workaround).
> >
> > 'svn copy' time for a very large tree (artificially generated with
> > ~50000 folders and ~250000 files) is now down to 1.5 seconds (still
> > three times slower than the same via file:/// or svn://, but good
> > enough, and not O(sizeof(tree)) anymore).
> >
> > Is this workaround safe? Thoughts?
> > It might even be something that can be exploited by our client, when
> > 'svn copy'ing ... (though a "normal" server-side fix for this problem,
> > within the normal workings of mod_dav, would of course be better
> > still).
>
> Seems this workaround is pretty OK for now (apparently the subversion
> code on the server ignores the Depth:0 for COPY requests, so the copy
> is handled like a normal recursive copy).
>
> Bert suggested on irc to make the setting of the header also dependent
> on the useragent string.
>
> For completeness: I'm now no longer seeing the 1.5 seconds time for
> copying over dav. Today it's more like 0.5 - 0.7 seconds, i.e. the
> same as with file:// and svn://. Maybe something was slowing down my
> network temporarily yesterday evening.
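
For reference, the user-agent variant of the config workaround discussed
above might look roughly like the fragment below. This is an untested
sketch, not something from the thread. It assumes an httpd that supports
<If> with expression syntax (2.4), and it assumes that matching a
User-Agent starting with "SVN/" is a good enough test for a Subversion
client.

[[[
# Goes inside the <Location> block that serves the repository.
# Assumption: Subversion clients are recognised by a User-Agent that
# starts with "SVN/"; other WebDAV clients keep the normal Depth handling.
<If "%{REQUEST_METHOD} == 'COPY' && %{HTTP_USER_AGENT} =~ m#^SVN/#">
  RequestHeader set Depth 0
</If>
]]]

Restricting the header rewrite this way should leave COPY requests from
generic WebDAV clients untouched, so only requests from Subversion clients
skip the tree walk.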

[[
Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c	(revision 1670075)
+++ subversion/mod_dav_svn/repos.c	(working copy)
@@ -4447,6 +4447,14 @@
       return NULL;
     }
 
+  if (params->root->info->r->method_number == M_COPY
+      && params->root->info->repos->is_svn_client)
+    {
+      /* We don't need to check if there are locks on the MOD_DAV level,
+         as we can handle that far more efficient on the FS level */
+      depth = 0;
+    }
+
   ctx.params = params;
   ctx.wres.walk_ctx = params->walk_ctx;
]]

Implements the same hack on the mod_dav_svn level without requiring a
config file change.

	Bert