[PATCH v5 5/5] clean: improve performance when removing lots of directories

2015-04-25 Thread Erik Elfström
"git clean" uses resolve_gitlink_ref() to check for the presence of nested git repositories, but it has the drawback of creating a ref_cache entry for every directory that should potentially be cleaned. The linear search through the ref_cache list causes a massive performance hit for large number o

[PATCH v5 4/5] p7300: add performance tests for clean

2015-04-25 Thread Erik Elfström
The tests are run in dry-run mode to avoid having to restore the test directories for each timed iteration. Using dry-run is an acceptable compromise since we are mostly interested in the initial computation of what to clean and not so much in the cleaning itself. Signed-off-by: Erik Elfström --

[PATCH v5 3/5] t7300: add tests to document behavior of clean and nested git

2015-04-25 Thread Erik Elfström
Signed-off-by: Erik Elfström --- t/t7300-clean.sh | 128 +++ 1 file changed, 128 insertions(+) diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 99be5d9..11f3a6d 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -455,6 +455,134 @@ te

[PATCH v5 2/5] setup: sanity check file size in read_gitfile_gently

2015-04-25 Thread Erik Elfström
read_gitfile_gently will allocate a buffer to fit the entire file that should be read. Add a sanity check of the file size before opening to avoid allocating a potentially huge amount of memory if we come across a large file that someone happened to name ".git". The limit is set to a sufficiently u
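
The size check described in this patch can be sketched roughly as follows. This is a minimal illustration and not the patch's code: the helper name read_small_file and the 1 MiB cap SKETCH_MAX_GITFILE_SIZE are assumptions made for the example, since the excerpt is cut off before it states the actual limit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical cap for this sketch; the limit used by the patch is not shown above. */
#define SKETCH_MAX_GITFILE_SIZE (1 << 20)

/*
 * Read a small regular file into a freshly allocated, NUL-terminated buffer.
 * Returns NULL if the path cannot be stat()ed, is not a regular file,
 * exceeds the cap, or cannot be read in full. The caller frees the result.
 */
static char *read_small_file(const char *path, size_t *len_out)
{
	struct stat st;
	FILE *fp;
	char *buf;
	size_t len;

	assert(path != NULL);
	if (stat(path, &st) != 0)
		return NULL;
	if (!S_ISREG(st.st_mode))
		return NULL;
	/* Reject oversized files before opening or allocating anything. */
	if (st.st_size > SKETCH_MAX_GITFILE_SIZE)
		return NULL;

	len = (size_t)st.st_size;
	buf = malloc(len + 1);
	if (!buf)
		return NULL;
	fp = fopen(path, "rb");
	if (!fp) {
		free(buf);
		return NULL;
	}
	if (fread(buf, 1, len, fp) != len) {
		fclose(fp);
		free(buf);
		return NULL;
	}
	fclose(fp);
	buf[len] = '\0';
	if (len_out)
		*len_out = len;
	return buf;
}
```

Doing the check on the stat() result means a multi-gigabyte file that happens to be named ".git" is rejected without ever being opened.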

[PATCH v5 0/5] Improving performance of git clean

2015-04-25 Thread Erik Elfström
Changes in v5: * Added defines for read_gitfile_gently error codes. This was a silly mistake, sorry about that. Erik Elfström (5): setup: add gentle version of read_gitfile setup: sanity check file size in read_gitfile_gently t7300: add tests to document behavior of clean and nested git

[PATCH v5 1/5] setup: add gentle version of read_gitfile

2015-04-25 Thread Erik Elfström
read_gitfile will die on most error cases. This makes it unsuitable for speculative calls. Extract the core logic and provide a gentle version that returns NULL on failure. The first usecase of the new gentle version will be to probe for submodules during git clean. Helped-by: Junio C Hamano Hel
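
The "gentle" pattern described here (a core function that reports failure to its caller, plus a strict wrapper that aborts) might look roughly like the sketch below. It only parses the "gitdir: " prefix of a gitfile's content; the function names, the error-code names, and the exit status are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes for this sketch only. */
enum sketch_gitfile_error {
	SKETCH_OK = 0,
	SKETCH_ERR_MISSING_PREFIX = 1,
	SKETCH_ERR_EMPTY_PATH = 2
};

/*
 * Gentle variant: returns a pointer to the path that follows "gitdir: "
 * inside `content`, or NULL on failure. When `err` is non-NULL it receives
 * one of the codes above, so a speculative caller can decide what to do.
 */
static const char *parse_gitfile_gently(const char *content, int *err)
{
	static const char prefix[] = "gitdir: ";
	const char *path = NULL;
	int code = SKETCH_OK;

	assert(content != NULL);
	if (strncmp(content, prefix, strlen(prefix)) != 0) {
		code = SKETCH_ERR_MISSING_PREFIX;
	} else {
		path = content + strlen(prefix);
		if (*path == '\0' || *path == '\n') {
			code = SKETCH_ERR_EMPTY_PATH;
			path = NULL;
		}
	}
	if (err)
		*err = code;
	return path;
}

/* Strict variant: same parsing, but any failure terminates the program. */
static const char *parse_gitfile_or_die(const char *content)
{
	int err;
	const char *path = parse_gitfile_gently(content, &err);

	if (!path) {
		fprintf(stderr, "fatal: invalid gitfile content (error %d)\n", err);
		exit(128); /* illustrative exit status */
	}
	return path;
}
```

A caller that is only probing, such as a check for nested repositories, would use the gentle variant and treat NULL as "not a gitfile", while existing callers keep the die-on-error behaviour through the wrapper.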

[PATCH 12/14] merge: decide if we auto-generate the message early in collect_parents()

2015-04-25 Thread Junio C Hamano
Signed-off-by: Junio C Hamano --- builtin/merge.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/builtin/merge.c b/builtin/merge.c index 84ebb22..c7d9d6e 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1098,6 +1098,10 @@ static struct commit_list *coll

[PATCH 14/14] merge: deprecate 'git merge HEAD ' syntax

2015-04-25 Thread Junio C Hamano
We had this in "git merge" manual for eternity: 'git merge' HEAD ... [This] syntax ( `HEAD` ...) is supported for historical reasons. Do not use it from the command line or in new scripts. It is the same as `git merge -m ...`. With the update to "git merge" to make it underst

[PATCH 10/14] merge: extract prepare_merge_message() logic out

2015-04-25 Thread Junio C Hamano
Signed-off-by: Junio C Hamano --- builtin/merge.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/builtin/merge.c b/builtin/merge.c index d853c9d..a972ed6 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1076,6 +1076,20 @@ static struct commi

[PATCH 05/14] merge: do not check argc to determine number of remote heads

2015-04-25 Thread Junio C Hamano
To reject merging multiple commits into an unborn branch, we check argc, thinking that collect_parents() that reads the remaining command line arguments from will give us the same number of commits as its input, i.e. argc. Because what we really care about is the number of commits, let the functi

[PATCH 04/14] merge: clarify "pulling into void" special case

2015-04-25 Thread Junio C Hamano
Instead of having it as one of the three if/elseif/.. case arms, test the condition and handle this special case upfront. This makes it easier to follow the flow of logic. Signed-off-by: Junio C Hamano --- builtin/merge.c | 35 ++- 1 file changed, 18 insertions(+

[PATCH 07/14] merge: clarify collect_parents() logic

2015-04-25 Thread Junio C Hamano
Clarify this small function in three ways. - The function initially collects all commits to be merged into a commit_list "remoteheads"; the "remotes" pointer always points at the tail of this list (either the remoteheads variable itself, or the ->next slot of the element at the end of th

[PATCH 06/14] merge: small leakfix and code simplification

2015-04-25 Thread Junio C Hamano
When parsing a merged object name like "foo~20" to formulate a merge summary "Merge branch foo (early part)", a temporary strbuf is used, but we forgot to deallocate it when we failed to find the named branch. Signed-off-by: Junio C Hamano --- builtin/merge.c | 4 ++-- 1 file changed, 2 insertio

[PATCH 13/14] merge: handle FETCH_HEAD internally

2015-04-25 Thread Junio C Hamano
The collect_parents() function now is responsible for 1. parsing the commits given on the command line into a list of commits to be merged; 2. filtering these parents into independent ones; and 3. optionally calling fmt_merge_msg() via prepare_merge_message() to prepare an auto-genera

[PATCH 09/14] merge: narrow scope of merge_names

2015-04-25 Thread Junio C Hamano
In order to pass the list of parents to fmt_merge_msg(), cmd_merge() uses this strbuf to create something that looks like FETCH_HEAD that describes commits that are being merged. This is necessary only when we are creating the merge commit message ourselves, but was done unconditionally. Move the

[PATCH 11/14] merge: make collect_parents() auto-generate the merge message

2015-04-25 Thread Junio C Hamano
Signed-off-by: Junio C Hamano --- builtin/merge.c | 35 +-- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/builtin/merge.c b/builtin/merge.c index a972ed6..84ebb22 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1092,7 +1092,8 @@ static void

[PATCH 08/14] merge: split reduce_parents() out of collect_parents()

2015-04-25 Thread Junio C Hamano
The latter does two separate things: - Parse the list of commits on the command line, and formulate the list of commits to be merged (including the current HEAD); - Compute the list of parents to be recorded in the resulting merge commit. Split the latter into a separate helper function,

[PATCH 02/14] t5520: style fixes

2015-04-25 Thread Junio C Hamano
Fix style funnies in early part of this test script that checks "git pull" into an unborn branch. The primary change is that 'chdir' to a newly created empty test repository is now protected by being done in a subshell to make it more robust without having to chdir back to the original place. Sig

[PATCH 03/14] t5520: test pulling an octopus into an unborn branch

2015-04-25 Thread Junio C Hamano
In the code comment for "git merge" in builtin/merge.c, we say: If the merged head is a valid one there is no reason to forbid "git merge" into a branch yet to be born. We do the same for "git pull". and t5520 does have an existing test for that behaviour. However, there was no test to m

[PATCH 01/14] merge: simplify code flow

2015-04-25 Thread Junio C Hamano
One of the first things cmd_merge() does is to see if the "--abort" option is given and run "reset --merge" and exit. When the control reaches this point, we know "--abort" was not given. Signed-off-by: Junio C Hamano --- builtin/merge.c | 16 1 file changed, 8 insertions(+), 8

[PATCH 00/14] Teach "git merge FETCH_HEAD" octopus merges

2015-04-25 Thread Junio C Hamano
Junio C Hamano writes: > Jeff King writes: >> On Mon, Apr 20, 2015 at 11:59:04AM -0700, Junio C Hamano wrote: >> >>> Unfortunately, "git merge"'s parsing of FETCH_HEAD forgets that we >>> may be creating an Octopus. Otherwise the above should work well. >> >> That sounds like a bug we should fi

Re: [PATCH v2] t0027: Add repoMIX and LF_nul

2015-04-25 Thread Junio C Hamano
Torsten Bögershausen writes: >> Hmph, would it still make sense to make sure that CRLF will stay CRLF >> so that future changes to break this expectation can be caught? Not >> that such a breakage is likely... > > Thanks for amending. > > We have the file CRLF (and CRLFmixLF), where we check tha

Re: [PATCH v4 2/5] setup: sanity check file size in read_gitfile_gently

2015-04-25 Thread Junio C Hamano
Erik Elfström writes: > On Sat, Apr 25, 2015 at 6:47 PM, Junio C Hamano wrote: >> I do not think it is wrong per-se, but the changes in this patch >> shows why hardcoded values assigned to error_code without #define is >> not a good idea, as these values are now exposed to the callers of >> the

Re: Cloning or pushing only files that have been updated

2015-04-25 Thread Johannes Sixt
On 25.04.2015 at 23:17, c...@qgenuity.com wrote: I have two sets of files. A_Old is a large unversioned directory tree containing many files. A_Updated is a git repository containing the files from A_Old, some of which have been modified. A_Updated also contains new files. I am looking for a

Cleaning projects with submodules

2015-04-25 Thread Simon Richter
Hi, I'm trying to set up a continuous integration build for Boost, which uses massive amounts of submodules, and keep running into problems when running "git clean" in the toplevel project. When I switch to a version where a submodule has been removed (e.g. an earlier version), "git clean -dx" wi

Re: [PATCH v2] t0027: Add repoMIX and LF_nul

2015-04-25 Thread Torsten Bögershausen
On 2015-04-25 18.41, Junio C Hamano wrote: > Torsten Bögershausen writes: > >> "new safer autocrlf handling": >> Check if eols in a file are converted at commit, when the file has >> CR (or CLLF) in the repo (technically speaking in the index). > > s/CLLF/CRLF/? (no need to resend for this;

Re: bug report : 2.3.5 ssh repo not found

2015-04-25 Thread Torsten Bögershausen
On 2015-04-25 07.57, Chris wrote: > Hello, > > Using git version 2.3.5 with kernel 3.19.3-3-ARCH #1 SMP PREEMPT x86_64 I see > the following error message when pulling or cloning a repo over ssh: > > """ > git clone ssh://user@mydomain:/home/user/path/to/project.git > Cloning into 'project'... >

Cloning or pushing only files that have been updated

2015-04-25 Thread cl
Hi, I have two sets of files. A_Old is a large unversioned directory tree containing many files. A_Updated is a git repository containing the files from A_Old, some of which have been modified. A_Updated also contains new files. I am looking for a way of cloning only the new or modified files f

Re: [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()

2015-04-25 Thread Junio C Hamano
Jeff King writes: > Stefan's patch is just in pu at this point, right? I do not think there > is any rushing/release concern. It is too late for either to be in > v2.4.0, so the only decision is whether to aim for "master" or "maint". > To me, they both seem to be in the same ballpark as far as r

Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time

2015-04-25 Thread Junio C Hamano
Junio C Hamano writes: > I am not too worried about "push --atomic", as we can just add a few > words to Release Notes and documentation saying "this is still an > experimental broken code that is unusable; don't use the feature in > production". > > I however am more worried about the other one

Re: [PATCH v4 2/5] setup: sanity check file size in read_gitfile_gently

2015-04-25 Thread Erik Elfström
On Sat, Apr 25, 2015 at 6:47 PM, Junio C Hamano wrote: > I do not think it is wrong per-se, but the changes in this patch > shows why hardcoded values assigned to error_code without #define is > not a good idea, as these values are now exposed to the callers of > the new function. After we gain a

Re: [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option

2015-04-25 Thread Junio C Hamano
karthik nayak writes: >> Is there any other way to make cat-file looser other than accepting >> an unknown type name from the future? If not, then perhaps it may >> make sense to give it a generic name that implies that we would >> trigger such additional looseness in the future. But as the >>

Re: [PATCH v4 1/5] setup: add gentle version of read_gitfile

2015-04-25 Thread Junio C Hamano
Junio C Hamano writes: >> +switch (error_code) { >> +case 1: // failed to stat >> +case 2: // not regular file > > Please do not use C++ // comments. No need to resend for this; I'll locally amend.

Re: [PATCH v4 1/5] setup: add gentle version of read_gitfile

2015-04-25 Thread Junio C Hamano
Erik Elfström writes: > read_gitfile will die on most error cases. This makes it unsuitable > for speculative calls. Extract the core logic and provide a gentle > version that returns NULL on failure. > > The first usecase of the new gentle version will be to probe for > submodules during git cle

Re: [PATCH v4 2/5] setup: sanity check file size in read_gitfile_gently

2015-04-25 Thread Junio C Hamano
Erik Elfström writes: > read_gitfile_gently will allocate a buffer to fit the entire file that > should be read. Add a sanity check of the file size before opening to > avoid allocating a potentially huge amount of memory if we come across > a large file that someone happened to name ".git". The

Re: [PATCH v2] t0027: Add repoMIX and LF_nul

2015-04-25 Thread Junio C Hamano
Torsten Bögershausen writes: > "new safer autocrlf handling": > Check if eols in a file are converted at commit, when the file has > CR (or CLLF) in the repo (technically speaking in the index). s/CLLF/CRLF/? (no need to resend for this; I'll locally amend) > Add a test-file repoMIX with

Re: [PATCH] git-gui: sort entries in tclIndex

2015-04-25 Thread René Scharfe
Looping in Pat (git-gui maintainer). On 15.04.2015 at 09:22, Olaf Hering wrote: Ping? On Tue, Feb 10, Olaf Hering wrote: Ping? On Mon, Jan 26, Olaf Hering wrote: ALL_LIBFILES uses wildcard, which provides the result in directory order. This order depends on the underlying filesystem on th

Re: [PATCH v2 2/2] connect: improve check for plink to reduce false positives

2015-04-25 Thread Torsten Bögershausen
On 2015-04-25 00.28, brian m. carlson wrote: > The git_connect function has code to handle plink and tortoiseplink > specially, as they require different command line arguments from > OpenSSH. However, the match was done by checking for "plink" > case-insensitively in the string, which led to fals
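
To illustrate why a case-insensitive substring match for "plink" can misfire, the sketch below compares only the last path component of the command against a fixed list of names. This is one possible stricter check, written as an assumption for illustration; the excerpt is truncated before it describes the approach the patch actually takes, and all function names here are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of two whole strings. */
static int equals_ignore_case(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return *a == *b;
}

/* Return the part of `path` after the last '/' or '\\' separator. */
static const char *last_component(const char *path)
{
	const char *base = path;

	for (; *path; path++)
		if (*path == '/' || *path == '\\')
			base = path + 1;
	return base;
}

/*
 * Hypothetical stricter check: only the program name itself is compared,
 * so a directory such as "uplink" elsewhere in the path does not match.
 */
static int looks_like_plink(const char *ssh_command)
{
	const char *base;

	assert(ssh_command != NULL);
	base = last_component(ssh_command);
	return equals_ignore_case(base, "plink") ||
	       equals_ignore_case(base, "plink.exe") ||
	       equals_ignore_case(base, "tortoiseplink") ||
	       equals_ignore_case(base, "tortoiseplink.exe");
}
```

With this check, a command such as /home/uplink/bin/ssh is not treated as plink, whereas a plain substring search for "plink" would match it.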

Re: [PATCH v2] t0027: Add repoMIX and LF_nul

2015-04-25 Thread Johannes Schindelin
Hi Torsten, On 2015-04-25 08:47, Torsten Bögershausen wrote: > "new safer autocrlf handling": > Check if eols in a file are converted at commit, when the file has > CR (or CLLF) in the repo (technically speaking in the index). > Add a test-file repoMIX with mixed line-endings. > When conve

Re: [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option

2015-04-25 Thread karthik nayak
On 04/22/2015 02:06 AM, Junio C Hamano wrote: Eric Sunshine writes: > It's easy to be blinded into thinking that cat-file's new option > should be named --literally since it was inspired by the --literally > option of hash-object, but indeed it may not be the best choice. Yeah, I wouldn't eve

[PATCH v4 3/5] t7300: add tests to document behavior of clean and nested git

2015-04-25 Thread Erik Elfström
Signed-off-by: Erik Elfström --- t/t7300-clean.sh | 128 +++ 1 file changed, 128 insertions(+) diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 99be5d9..11f3a6d 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -455,6 +455,134 @@ te

[PATCH v4 1/5] setup: add gentle version of read_gitfile

2015-04-25 Thread Erik Elfström
read_gitfile will die on most error cases. This makes it unsuitable for speculative calls. Extract the core logic and provide a gentle version that returns NULL on failure. The first usecase of the new gentle version will be to probe for submodules during git clean. Helped-by: Junio C Hamano Hel

[PATCH v4 5/5] clean: improve performance when removing lots of directories

2015-04-25 Thread Erik Elfström
"git clean" uses resolve_gitlink_ref() to check for the presence of nested git repositories, but it has the drawback of creating a ref_cache entry for every directory that should potentially be cleaned. The linear search through the ref_cache list causes a massive performance hit for large number o

[PATCH v4 4/5] p7300: add performance tests for clean

2015-04-25 Thread Erik Elfström
The tests are run in dry-run mode to avoid having to restore the test directories for each timed iteration. Using dry-run is an acceptable compromise since we are mostly interested in the initial computation of what to clean and not so much in the cleaning itself. Signed-off-by: Erik Elfström --

[PATCH v4 2/5] setup: sanity check file size in read_gitfile_gently

2015-04-25 Thread Erik Elfström
read_gitfile_gently will allocate a buffer to fit the entire file that should be read. Add a sanity check of the file size before opening to avoid allocating a potentially huge amount of memory if we come across a large file that someone happened to name ".git". The limit is set to a sufficiently u

[PATCH v4 0/5] Improving performance of git clean

2015-04-25 Thread Erik Elfström
v3 of the patch can be found here: http://thread.gmane.org/gmane.comp.version-control.git/267422 Changes in v4: * changed some tests to use more meaningful dir names. * fixed performance test by doing "git clean -n" to avoid timing setup code. Increased test size to 10 directories (~0.5s r