RE: Trivial merge of big text file: Dismal performance, 540x faster if binary.

Andreas Krüger, DV-RATIO Fri, 14 Jan 2011 06:54:40 -0800

Hello, Johan and all,

First, for the record, here is another comparison between
binary and text merge performance, this time with the files
generated by my script (repeated below):


Binary merge took 3.56 seconds; text merge took 3 hours, 45 minutes
and 45.202 seconds (3:45:45.202).  In this particular case, binary
merge was about 3805 times faster than text merge.
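The ratio can be checked with a few lines of arithmetic. This is only a sketch of the calculation, using the two timings quoted above; the variable names are illustrative.

```python
from datetime import timedelta

# Timings as reported in this message.
binary_merge = timedelta(seconds=3.56)
text_merge = timedelta(hours=3, minutes=45, seconds=45.202)

# Dividing one timedelta by another yields a plain float ratio.
speedup = text_merge / binary_merge

print(f"text merge: {text_merge.total_seconds():.3f} s")
print(f"binary merge: {binary_merge.total_seconds():.2f} s")
print(f"speedup: about {speedup:.0f}x")
```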



> Textual merging in svn makes use of a variant of the standard diff
> algorithm, namely diff3.  Just a couple of days ago, I finally
> succeeded in making diff3 take advantage of ... performance
> improvements ... .

Good news! Excellent! Thank you!

But... does this relate to my problem?

The improved diff3 will give a nice performance improvement in the
*general* case.

I certainly want that improvement!


Another nice performance improvement, by a factor of several hundred
(or several thousand), could be obtained for big files in the *trivial*
case if SVN skipped diff3 altogether and simply copied the result.

I also want this other improvement!


Finally:

SVN already contains the intelligence needed to find out whether a
merge is trivial or not.  After all, for binary files the trivial
merges are precisely the ones that SVN knows how to perform.
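To make the idea concrete, here is a minimal sketch of such a shortcut. It is not SVN's actual code; the function name `trivial_merge` and the parameter names `base`, `theirs`, `mine` and `result` are assumptions made for illustration. The point is only that a byte-for-byte comparison decides whether the merge result can be copied, so diff3 is needed only when the function returns False.

```python
import filecmp
import shutil


def trivial_merge(base, theirs, mine, result):
    """Try to resolve a three-way merge by copying a file.

    base   -- path of the common ancestor (hypothetical name)
    theirs -- path of the incoming version
    mine   -- path of the local version
    result -- path the merged file is written to

    Returns True if the merge was trivial and `result` was written,
    False if a real textual merge (e.g. diff3) is still required.
    """
    # Local file is untouched: the merge result is the incoming file.
    if filecmp.cmp(base, mine, shallow=False):
        shutil.copyfile(theirs, result)
        return True
    # Nothing new is coming in, or both sides made the identical change:
    # the local file already is the merge result.
    if (filecmp.cmp(base, theirs, shallow=False)
            or filecmp.cmp(mine, theirs, shallow=False)):
        shutil.copyfile(mine, result)
        return True
    return False
```

The comparisons read each file at most once, so for large files their cost is linear in file size, which is the same order of work as the copy itself.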


Johan (or anyone else), please enlighten me if I am
missing something!

Regards, Andreas
--
Dr. Andreas Krüger, Senior Developer

Tel. (+49) (211) 280 69-1132
andreas.krue...@hp.com

DV-RATIO NORDWEST GmbH, Habsburgerstraße 12, 40547 Düsseldorf, Germany

on behalf of

Hewlett-Packard GmbH, Herrenberger Str. 140, 71034 Böblingen, www.hp.com/de
Managing directors: Volker Smid (chairman), Michael Eberhardt, Thorsten Herrmann,
Martin Kinne, Heiko Meyer, Ernst Reichart, Rainer Sterk
Chairman of the supervisory board: Jörg Menno Harms
Registered office: Böblingen, district court Stuttgart, HRB 244081
WEEE reg. no. DE 30409072

-----Original Message-----
From: krueger, Andreas (Andreas Krüger, DV-RATIO) 
Sent: Thursday, January 13, 2011 4:08 PM
To: users@subversion.apache.org
Subject: RE: Trivial merge of big text file: Dismal performance, 540x faster if binary.

...


#!/usr/bin/perl -w

# Generate stupid files on stdout.

use strict;

# For the overhauled file, set to 1:
my $overhaul = 0;

my $number = 1;

for (1 .. 1000000) {

    # 1073741824 (2**30) and 910111213 (odd) are coprime,
    # so the sequence visits all 2**30 values before it repeats.
    $number = ($number + 910111213) % 1073741824;

    my $printme;
    if($overhaul) {
        $printme = ($number % 4 != 0 ? $number * 13 % 1073741824 : $number);
    } else {
        $printme = $number;
    }
    print $printme,"\n";
}
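For readers who want to check what the script produces without running Perl, the following is a rough Python port. It is a sketch, not part of the original test setup: the names `generate`, `MODULUS`, `STEP` and `LINES` are illustrative. It verifies that the step and the modulus share no common divisor, and counts how many lines differ between the plain file and the overhauled one.

```python
from math import gcd

MODULUS = 1073741824  # 2**30, as in the Perl script
STEP = 910111213      # increment used by the Perl script
LINES = 1_000_000     # number of lines the script prints


def generate(overhaul):
    """Yield the values the Perl script prints, one per output line."""
    number = 1
    for _ in range(LINES):
        number = (number + STEP) % MODULUS
        if overhaul and number % 4 != 0:
            yield number * 13 % MODULUS
        else:
            yield number


# gcd of 1 means the additive sequence has the full period of 2**30.
step_gcd = gcd(STEP, MODULUS)

# Compare the two generated files line by line.
changed = sum(a != b for a, b in zip(generate(False), generate(True)))

print("gcd of step and modulus:", step_gcd)
print("lines that differ:", changed, "of", LINES)
```

With these constants, three out of every four lines are changed in the overhauled file, and the changes are spread evenly through it, which is what makes the textual merge so expensive.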
