On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
> On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <kyle.le...@gmail.com> wrote:
>>
>> On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn <jcor...@gmail.com> wrote:
>>>
>>> [ Again: please don't top-post on this list. I'm moving your reply to
>>> the bottom. More below ... ]
>>>
>>> On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber <kyle.le...@gmail.com> wrote:
>>> > On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d...@daniel.shahaf.name>
>>> > wrote:
>>> >>
>>> >> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
>>> >> > Johan,
>>> >> >
>>> >> > I did a little more digging. There were a few different places where
>>> >> > svn seems to get hung up so I ran the gprof report on just the first
>>> >> > one (the merge takes hours otherwise). In this particular case, svn
>>> >> > prints out that it is merging from a small text file while it is
>>> >> > hanging for more than a minute @ 100% CPU. When I examine "lsof",
>>> >> > however, it see it actually has a different file open. This one is a
>>> >> > large (15 MB) "binary" file. It turns out this binary file did not
>>> >> > have a property in the trunk (which I think means it's treated as
>>> >> > text, right?). But in the branch it was marked as octet stream. So
>>> >> > perhaps svn is doing a text-based diff on this binary file because
>>> >> > it used to be incorrectly marked as text?
>>> >> >
>>> >>
>>> >> If either side is marked as binary then svn will defer to the "Use
>>> >> merge-right if merge-left == base, else conflict" algorithm.
>>> >>
>>> >> Could you share the value of 'svn proplist --verbose' on both files?
>>> >>
>>> > Yup, trunk version has empty properties
>>> > branch version has:
>>> >
>>> > svn:mime-type
>>> > application/octet-stream
>>> >
>>>
>>> What is the merge target? Is it a trunk working copy (the one without
>>> mime-type), or a branch working copy (with
>>> svn:mime-type=application/octet-stream)?
>>>
>>> I think it's the mime-type of the merge target that determines if
>>> merge will take the "binary" route, or the "text" route. See this
>>> snippet from libsvn_wc/merge.c [1] (in the function
>>> svn_wc__internal_merge):
>>>
>>> [[[
>>>   /* Decide if the merge target is a text or binary file. */
>>>   if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
>>>       && mimeprop->value)
>>>     is_binary = svn_mime_type_is_binary(mimeprop->value->data);
>>>   else
>>>     {
>>>       const char *value = svn_prop_get_value(mt.actual_props,
>>>                                              SVN_PROP_MIME_TYPE);
>>>
>>>       is_binary = value && svn_mime_type_is_binary(value);
>>>     }
>>> ]]]
>>>
>>> (mt is the merge target)
>>>
>>> I'm not terribly familiar with this part of the codebase. But on first
>>> sight, this seems to say:
>>>
>>> (1) Look at the mime-type of the "base version" of the merge target.
>>> If that's binary, then we'll go binary.
>>>
>>> (2) If the "base" of the merge target doesn't have a mime-type, look
>>> if it has one on the "actual" node (the uncommitted local
>>> modifications). If that's binary, then we'll go binary.
>>>
>>> (3) Else: text merge
>>>
>>> So I'm guessing that you're merging to trunk, the target without
>>> mime-type property, which makes svn take the "text" route for merging.
>>> Is that correct?
>>>
>>> If that's the case, maybe you can simply set the mime-type on that
>>> binary file in your merge target, as a local modification (I don't
>>> think you need to even commit it). Can you try that?
>>>
>>> --
>>> Johan
>>>
>>> [1]
>>> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c
>>
>> Johan,
>>
>> Sorry for the top-post. Hopefully this is better :)
>
> Much better, thank you :).
>
>> I set the mime-type to "application/octet-stream" in the working copy prior
>> to merge and this fixed the problem. No more heavy CPU usage or excessive
>> time spent on the file.
>
> I'm glad it helped. Apart from the performance, it's important that
> svn does this merge the "binary way", because as you said line-based
> merges are not correct for this file.

It may also interest you (and other readers of this thread) that there
is an open enhancement request for making text-merges take the same
shortcut as binary-merges (if mine == merge-left then set merged :=
merge-right), to avoid expensive diffing [1]. But that hasn't been
addressed yet.

[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
trivial text files merged MUCH slower than binary - pls optimize.

--
Johan
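
P.S. For anyone who wants to see what the "mine == merge-left" shortcut
amounts to, here is a minimal sketch in C. It is not code from libsvn_wc,
and it does not use any Subversion API: blob_t, merge_outcome_t,
blob_equal and trivial_merge are hypothetical names made up for this
illustration, and file contents are assumed to be already loaded in
memory. The point is only that the check is a plain byte comparison, with
no diffing involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a file's contents. */
typedef struct {
    const unsigned char *data;
    size_t len;
} blob_t;

/* Hypothetical result type: either we took merge-right, or we conflict. */
typedef enum { MERGE_TOOK_RIGHT, MERGE_CONFLICT } merge_outcome_t;

/* Byte-for-byte equality; cost is linear in the file size. */
bool blob_equal(const blob_t *a, const blob_t *b)
{
    if (a->len != b->len)
        return false;
    return a->len == 0 || memcmp(a->data, b->data, a->len) == 0;
}

/* The shortcut: if the working file ("mine") is identical to merge-left,
 * the merged result is simply merge-right. Otherwise this sketch reports
 * a conflict, which is what the binary route does; a text merge would
 * fall back to a real line-based merge at that point instead. */
merge_outcome_t trivial_merge(const blob_t *mine,
                              const blob_t *merge_left,
                              const blob_t *merge_right,
                              const blob_t **merged)
{
    if (blob_equal(mine, merge_left)) {
        *merged = merge_right;
        return MERGE_TOOK_RIGHT;
    }
    *merged = NULL;
    return MERGE_CONFLICT;
}
```

With an unmodified working file, trivial_merge() hands back merge-right
after a single pass over the bytes, which is why the binary route is so
much cheaper on a 15 MB file than computing a line-based diff.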