On 31.03.2015 14:43, Mark Phippard wrote:
>> On Mar 31, 2015, at 8:13 AM, Johan Corveleyn <jcor...@gmail.com> wrote:
>>
>>> On Tue, Mar 31, 2015 at 2:19 AM, Johan Corveleyn <jcor...@gmail.com> wrote:
>>>> On Sun, Mar 29, 2015 at 7:57 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
>>>>> On Sat, Mar 28, 2015 at 5:09 PM, Bert Huijben <b...@qqmail.nl> wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Johan Corveleyn [mailto:jcor...@gmail.com]
>>>>>> Sent: Friday, 27 March 2015 22:03
>>>>>> To: users@subversion.apache.org
>>>>>> Subject: Branching slow 1.8.11 https
>>>>>>
>>>>>> Does the following ring a bell for someone?
>>>>>>
>>>>>> Recently upgraded our server (on Solaris 10 SPARC) from 1.5.4 to
>>>>>> 1.8.11 (CollabNet package). Some time after that, we discovered that
>>>>>> branching was very slow. I'm talking about pure server-side branching
>>>>>> ('svn copy $URL/trunk $URL/branches/br1'). I'm testing with a 1.8.11
>>>>>> client (tried both from same machine as the server, and from another
>>>>>> machine on the LAN (100 Mbit)).
>>>>>>
>>>>>> - Branching trunk (containing many directories and files): 6-8 minutes
>>>>>> - Branching a subfolder of trunk: 20-30 seconds (still very slow)
>>>>>> - Branching a single file is fast (< 0.5s or so).
>>>>>>
>>>>>> So it seems the performance degrades depending on the depth or size of
>>>>>> the tree.
>>>>>>
>>>>>> Now, it gets more interesting:
>>>>>> - The resulting rev file on the server is always very small (as it
>>>>>> should be, it contains only a lightweight 'copy' of the trunk node).
>>>>>> - Our repos is currently served via https (Apache 2.2.29).
>>>>>> - Branching with file:/// urls is fast (branching trunk takes 0.6s).
>>>>>> - When starting an svnserve instance serving the same repository, and
>>>>>> branching with svn:// urls, it's fast as well (also 0.6s).
>>>>>> - We reproduced it on a copy of the production repo.
>>>>>> - Experimenting with the test copy, we found that
>>>>>> $repos/dav/activities.d contains ~2000 files. When we clear that
>>>>>> directory, the branching times go down by more than half (~2 minutes
>>>>>> for trunk, ~10s for subdir of trunk --- i.e. still slow, but it
>>>>>> definitely has an impact).
>>>>>> - With a 1.7 client connecting with neon, the problem is the same.
>>>>>> - During the 'svn copy', an httpd child consumes a lot of cpu (around
>>>>>> half a core).
>>>>>> - There is no authz configured for this repo (SVNPathAuthz off).
>>>>>> - Backend is still in 1.5 format (we have not run svnadmin upgrade
>>>>>> yet, a dump+load is planned in a couple of weeks).
>>>>>>
>>>>>> So it seems clearly mod_dav_svn related (and not for instance related
>>>>>> to the FSFS backend).
>>>>>>
>>>>>> I don't think we have anything special in our httpd config:
>>>>>> [[[
>>>>>> <Location /test_svn>
>>>>>> SVNInMemoryCacheSize 131072
>>>>>> SVNCacheFullTexts on
>>>>>> SVNCacheTextDeltas on
>>>>>> SSLRequireSSL
>>>>>> AuthName "TEST Subversion Repository"
>>>>>> AuthType Basic
>>>>>> AuthBasicProvider ldap
>>>>>> AuthBasicAuthoritative off
>>>>>> AuthLDAPURL "ldap://redacted:389"
>>>>>> AuthLDAPBindDN "redacted"
>>>>>> AuthLDAPBindPassword redacted
>>>>>> Require ldap-group redacted
>>>>>> DAV svn
>>>>>> SVNPath /path/to/test_repos
>>>>>> SVNPathAuthz off
>>>>>> </Location>
>>>>>> ]]]
>>>>>>
>>>>>> Any ideas?
>>>>>> Why the cpu usage by the server, what's it doing?
>>>>>> What is the dav/activities.d directory for? How come it contains so
>>>>>> many files? Is it ok to purge the old files from that directory?
>>>>> Httpd's mod_dav was updated in some recent version to do a full lock
>>>>> traversal on copies and moves. I think we already applied some
>>>>> optimizations, but the real fix would be that mod_dav shouldn't do this
>>>>> work (which our repos layer already does).
>>>>>
>>>>> I'm not sure which release we applied the first set of optimizations.
>>>> Thanks for refreshing my memory.
>>>>
>>>> So the problem is known as issue #4531 (server-side copy (over dav)
>>>> uses too much memory) [1]. The memory usage issue has been fixed in
>>>> SVN 1.8.11 and 1.7.19 (see CHANGES), but a performance problem remains
>>>> (copy is no longer O(1), but depends on the size of the tree being
>>>> copied). That's a direct violation of one of Subversion's "old selling
>>>> points" vs. CVS: that branching / tagging is O(1). Branching / tagging
>>>> taking several minutes brings back "fond memories" from CVS' days.
>>>>
>>>> As Philip pointed out in his last comment on #4531 [2]: "This issue is
>>>> related to a change in mod_dav in 2.2.25 to fix PR54610 which
>>>> added a walk over the copy source looking for lock tokens." (also
>>>> released in 2.4.5; so both httpd 2.2.25+ and 2.4.5+ are affected --
>>>> older httpd's won't have this problem I guess).
>>>>
>>>> Again quoting Philip: "Apache knows in advance that the walk is
>>>> redundant in cases such as Subversion's URL-to-URL copy but Subversion
>>>> cannot avoid the read access. We should attempt to fix mod_dav to
>>>> avoid the walk where possible."
>>>>
>>>> So my hope rests with Philip and others who might have the necessary
>>>> knowledge to fix this in mod_dav. It's really not acceptable that
>>>> branching / tagging (or I'm guessing also: moving a large tree with a
>>>> server-side move) takes several minutes.
>>>>
>>>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4531
>>>> [2] http://subversion.tigris.org/issues/show_bug.cgi?id=4531#desc12
>>> I think I've found a workaround: it seems the tree walk by mod_dav is
>>> avoided when the request has a header Depth with value 0. I've tried
>>> adding
>>>
>>> <If "%{REQUEST_METHOD} == 'COPY'">
>>> RequestHeader set Depth 0
>>> </If>
>>>
>>> to the Location block of SVN, and the copy is fast again! And the good
>>> thing is: it's still a fully recursive copy :-) (otherwise it wouldn't
>>> be much of a workaround).
>>>
>>> 'svn copy' time for a very large tree (artificially generated with
>>> ~50000 folders and ~250000 files) is now down to 1,5 seconds (still
>>> three times slower than the same via file:/// or svn://, but good
>>> enough, and not O(sizeof(tree)) anymore).
>>>
>>> Is this workaround safe? Thoughts?
>>> It might even be something that can be exploited by our client, when
>>> 'svn copy'ing ... (though a "normal" server-side fix for this problem,
>>> within the normal workings of mod_dav, would of course be better
>>> still).
>> Seems this workaround is pretty OK for now (apparently the subversion
>> code on the server ignores the Depth:0 for COPY requests, so the copy
>> is handled like a normal recursive copy).
>>
>> Bert suggested on irc to make the setting of the header also dependent
>> on the useragent string.
>>
>> For completeness: I'm now no longer seeing the 1,5 seconds time for
>> copying over dav. Today it's more like 0,5 - 0,7 seconds, i.e. the
>> same as with file:// and svn://. Maybe something was slowing down my
>> network temporarily yesterday evening.
>>
>> --
>> Johan
> Are we going to change the client to send this header? This seems like a very
> significant regression in our primary "promises" to allow it to wait for a
> mod_dav fix that might never even happen.

The problem is that there are other Subversion DAV server implementations
out there that could break if we tried to do a non-recursive copy of a
directory (whatever that means).

-- Brane