Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-27 Thread Shawn Pearce
On Sat, Apr 26, 2014 at 2:39 PM, David Kastrup wrote: > > At least the stuff I fixed with regard to performance would seem to be > done right in JGit to start with. Hah! It's Java. We have to do things right, otherwise it's too slow. :-) >> It's still not as fast as I want it to be. :-) > > Most of

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-27 Thread David Kastrup
Shawn Pearce writes: > On Sat, Apr 26, 2014 at 10:30 AM, David Kastrup wrote: >> David Kastrup writes: >> >> Here's some example: >> >> dak@lola:/usr/local/tmp/wortliste$ time git blame -n -s wortliste >/tmp/wl1 >> >> real 15m47.118s >> user 14m39.928s >> sys 1m1.872s > > Hah, this is

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread Shawn Pearce
On Sat, Apr 26, 2014 at 10:30 AM, David Kastrup wrote: > David Kastrup writes: > >> http://repo.or.cz/r/wortliste.git >> git blame [-M / -C] wortliste >> >> The latter one is _really_ taking a severe hit from the O(n^2) >> algorithms. If your benchmarks for that one still point mostly to the >>

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread David Kastrup
David Kastrup writes: > http://repo.or.cz/r/wortliste.git > git blame [-M / -C] wortliste > > The latter one is _really_ taking a severe hit from the O(n^2) > algorithms. If your benchmarks for that one still point mostly to the > unpacking, your jgit blame should be fine regarding the stuff > I

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread David Kastrup
Shawn Pearce writes: > Right, and JGit blame still is missing the -M and -C options, as I > have not implemented those yet. I got basic blame and reverse blame > working a few years ago and then stopped working on the code for a > while. Now we have interest in improving the latency for $DAY_JOB,

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread Shawn Pearce
On Sat, Apr 26, 2014 at 9:50 AM, David Kastrup wrote: > Shawn Pearce writes: > >> On Sat, Apr 26, 2014 at 12:48 AM, David Kastrup wrote: >>> Shawn Pearce writes: And JGit was already usually slower than git-core. Now it will be even slower! :-) >>> >>> If your statement about JGi

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread David Kastrup
Shawn Pearce writes: > Thanks for doing this. Unfortunately I can't read the patch itself as > I am also trying to improve JGit's blame code for $DAY_JOB, and JGit > is BSD licensed. Actually, I'd have suggested asking $EMPLOYER to buy the rights for looking at the code, but as I wrote previousl

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread David Kastrup
Shawn Pearce writes: > On Sat, Apr 26, 2014 at 12:48 AM, David Kastrup wrote: >> Shawn Pearce writes: >>> >>> And JGit was already usually slower than git-core. Now it will be >>> even slower! :-) >> >> If your statement about JGit is accurate, it should likely have beat >> Git for large use ca

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread Shawn Pearce
On Sat, Apr 26, 2014 at 12:48 AM, David Kastrup wrote: > Shawn Pearce writes: > >> On Fri, Apr 25, 2014 at 4:56 PM, David Kastrup wrote: >>> The previous implementation used a single sorted linear list of blame >>> entries for organizing all partial or completed work. Every subtask had >>> to s

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-26 Thread David Kastrup
Shawn Pearce writes: > On Fri, Apr 25, 2014 at 4:56 PM, David Kastrup wrote: >> The previous implementation used a single sorted linear list of blame >> entries for organizing all partial or completed work. Every subtask had >> to scan the whole list, with most entries not being relevant to the

Re: [PATCH 1/2] blame: large-scale performance rewrite

2014-04-25 Thread Shawn Pearce
On Fri, Apr 25, 2014 at 4:56 PM, David Kastrup wrote: > The previous implementation used a single sorted linear list of blame > entries for organizing all partial or completed work. Every subtask had > to scan the whole list, with most entries not being relevant to the > task. The resulting run-

[PATCH 1/2] blame: large-scale performance rewrite

2014-04-25 Thread David Kastrup
The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This chan
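
The quadratic cost described above can be illustrated with a small sketch. It is not taken from the patch: the preview is cut off before the new data structure is described, so the bucketed variant here is only an assumed alternative for comparison, and the names (scan_single_list, scan_bucketed, the "commitN" suspects) are hypothetical.

```python
from collections import defaultdict


def scan_single_list(entries, suspects):
    """Every subtask walks the one shared list, as in the old scheme described above."""
    visited = 0
    found = {}
    for suspect in suspects:
        mine = []
        for entry_suspect, chunk in entries:
            visited += 1  # most entries are irrelevant to this subtask
            if entry_suspect == suspect:
                mine.append(chunk)
        found[suspect] = mine
    return found, visited


def scan_bucketed(entries, suspects):
    """Assumed alternative: group entries by suspect once, then look them up directly."""
    visited = 0
    buckets = defaultdict(list)
    for entry_suspect, chunk in entries:
        visited += 1
        buckets[entry_suspect].append(chunk)
    found = {suspect: buckets.get(suspect, []) for suspect in suspects}
    return found, visited


# Hypothetical workload: 1000 separate chunks, each attributed to its own suspect.
entries = [(f"commit{i}", (i, i + 1)) for i in range(1000)]
suspects = [suspect for suspect, _ in entries]

found_a, visits_a = scan_single_list(entries, suspects)
found_b, visits_b = scan_bucketed(entries, suspects)

print(visits_a, visits_b)
```

Both functions return the same chunks per suspect; only the number of list entries visited differs, growing with the square of the chunk count in the first case and linearly in the second.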