Re: Multi-threaded 'git clone'

2015-02-17 Thread Junio C Hamano
On Tue, Feb 17, 2015 at 3:32 PM, Junio C Hamano wrote:
A few typofixes and clarifications.
> *4* The scheme in *3* can be extended to bring the fetcher
> step-wise. If the server's state was X when the fetcher last
"bring the fetcher up-to-date step-wise", or "update the fetcher step-wise"

Re: Multi-threaded 'git clone'

2015-02-17 Thread Junio C Hamano
Martin Fick writes:
> Sorry for the long-winded rant. I suspect that some variation of all
> my suggestions have already been suggested, but maybe they will
> rekindle some older, now useful thoughts, or inspire some new ones.
> And maybe some of these are better to pursue than more parallelism?

Re: Multi-threaded 'git clone'

2015-02-16 Thread Martin Fick
There is currently a thread on the Gerrit list about how much faster cloning can be when using Gerrit/jgit GCed packs with bitmaps versus C git GCed packs with bitmaps. One difference outlined is that jgit seems to have more bitmaps: it creates one for every refs/heads. Is C git doing that?

Re: Multi-threaded 'git clone'

2015-02-16 Thread Shawn Pearce
On Mon, Feb 16, 2015 at 10:43 AM, Junio C Hamano wrote:
> Jeff King writes:
>
>> ... And the whole output is checksummed by a single sha1
>> over the whole stream that comes at the end.
>>
>> I think the most feasible thing would be to quickly spool it to a
>> server on the LAN, and then use an e

Re: Multi-threaded 'git clone'

2015-02-16 Thread Jeff King
On Tue, Feb 17, 2015 at 06:16:39AM +0700, Duy Nguyen wrote:
> On Mon, Feb 16, 2015 at 10:47 PM, Jeff King wrote:
> > Each clone generates the pack on the fly
> > based on what's on disk and streams it out. It should _usually_ be the
> > same, but there's nothing to guarantee byte-for-byte equalit

Re: Multi-threaded 'git clone'

2015-02-16 Thread Duy Nguyen
On Mon, Feb 16, 2015 at 10:47 PM, Jeff King wrote:
> Each clone generates the pack on the fly
> based on what's on disk and streams it out. It should _usually_ be the
> same, but there's nothing to guarantee byte-for-byte equality between
> invocations.
It's usually _not_ the same. I tried when I

Re: Multi-threaded 'git clone'

2015-02-16 Thread Junio C Hamano
Jeff King writes:
> ... And the whole output is checksummed by a single sha1
> over the whole stream that comes at the end.
>
> I think the most feasible thing would be to quickly spool it to a
> server on the LAN, and then use an existing fetch-in-parallel tool
> to grab it from there over the W
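The point quoted above, that one SHA-1 at the end covers the whole stream, can be illustrated with a toy sketch. This is not git's code. It assumes only the documented pack layout of a 12-byte "PACK" header followed by data and a 20-byte SHA-1 trailer over everything before it; the payload here is arbitrary filler rather than real objects, and the helper names (build_toy_stream, verify_stream, split_ranges) are made up for the example.

```python
import hashlib
import struct

TRAILER_LEN = 20  # size of a SHA-1 digest in bytes
HEADER_LEN = 12   # "PACK" + 4-byte version + 4-byte object count


def build_toy_stream(payload: bytes) -> bytes:
    """Return header + payload + SHA-1 trailer (toy data, not a valid pack)."""
    # The object count is left at 0 because the payload is filler, not objects.
    header = b"PACK" + struct.pack(">II", 2, 0)
    body = header + payload
    return body + hashlib.sha1(body).digest()


def verify_stream(stream: bytes) -> bool:
    """Check that the trailing 20 bytes are the SHA-1 of everything before them."""
    if len(stream) < HEADER_LEN + TRAILER_LEN or not stream.startswith(b"PACK"):
        return False
    body, trailer = stream[:-TRAILER_LEN], stream[-TRAILER_LEN:]
    return hashlib.sha1(body).digest() == trailer


def split_ranges(stream: bytes, parts: int) -> list:
    """Cut the stream into byte ranges, as a parallel downloader might."""
    size = -(-len(stream) // parts)  # ceiling division
    return [stream[i:i + size] for i in range(0, len(stream), size)]


if __name__ == "__main__":
    stream = build_toy_stream(b"x" * 1000)
    chunks = split_ranges(stream, 4)
    # No single range can be checked on its own; only the reassembled
    # stream can be verified against the trailer.
    print([verify_stream(c) for c in chunks])
    print(verify_stream(b"".join(chunks)))
```

The sketch only shows why byte ranges fetched in parallel have to be reassembled, from a byte-identical source, before the checksum means anything. That is the reason spooling to a fixed file first comes up in the thread.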

Re: Multi-threaded 'git clone'

2015-02-16 Thread Jeff King
On Mon, Feb 16, 2015 at 07:31:33AM -0800, David Lang wrote:
> >Then the server streams the data to the client. It might do some light
> >work transforming the data as it comes off the disk, but most of it is
> >just blitted straight from disk, and the network is the bottleneck.
>
> Depending on h

Re: Multi-threaded 'git clone'

2015-02-16 Thread David Lang
On Mon, 16 Feb 2015, Jeff King wrote: On Mon, Feb 16, 2015 at 05:31:13AM -0800, David Lang wrote: I think it's an interesting question to look at, but before you start looking at changing the architecture of the current code, I would suggest doing a bit more analysis of the problem to see if t

Re: Multi-threaded 'git clone'

2015-02-16 Thread Jeff King
On Mon, Feb 16, 2015 at 05:31:13AM -0800, David Lang wrote:
> I think it's an interesting question to look at, but before you start
> looking at changing the architecture of the current code, I would suggest
> doing a bit more analysis of the problem to see if the bottleneck is really
> where you

Re: Multi-threaded 'git clone'

2015-02-16 Thread David Lang
On Mon, 16 Feb 2015, Koosha Khajehmoogahi wrote: Cloning huge repositories like the Linux kernel takes a considerable amount of time. Is it possible to incorporate a multi-threaded simultaneous connections functionality for cloning? To what extent do we need to change the architecture of the current c

Multi-threaded 'git clone'

2015-02-16 Thread Koosha Khajehmoogahi
Greetings, Cloning huge repositories like the Linux kernel takes a considerable amount of time. Is it possible to incorporate a multi-threaded simultaneous connections functionality for cloning? To what extent do we need to change the architecture of the current code and how large would be the scope of