Re: [BUG] add_again() off-by-one error in custom format

2017-06-22 Thread Jeff King
On Thu, Jun 22, 2017 at 08:19:40PM +0200, René Scharfe wrote: > > I'd be OK with keeping it if we could reduce the number of magic > > numbers. E.g,. rather than 32 elsewhere use: > > > >(sizeof(*loose_objects_subdir_bitmap) * CHAR_BIT) > > We have a bitsizeof macro for that. > > > and simi

Re: [BUG] add_again() off-by-one error in custom format

2017-06-22 Thread René Scharfe
Am 18.06.2017 um 15:56 schrieb Jeff King: On Sun, Jun 18, 2017 at 02:59:04PM +0200, René Scharfe wrote: @@ -1586,6 +1587,9 @@ extern struct alternate_object_database { struct strbuf scratch; size_t base_len; + uint32_t loose_objects_subdir_bitmap[8]; Is it worth the comp

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread Junio C Hamano
Jeff King writes: > On Sun, Jun 18, 2017 at 12:58:49PM +0200, René Scharfe wrote: > >> Anyway, here's a patch for stat-based invalidation, on top of the other >> one. Array removal is really slow (hope I didn't sneak a bug in there, >> but my confidence in this code isn't very high). No locking

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread Jeff King
On Sun, Jun 18, 2017 at 02:59:04PM +0200, René Scharfe wrote: > > > @@ -1586,6 +1587,9 @@ extern struct alternate_object_database { > > > struct strbuf scratch; > > > size_t base_len; > > > + uint32_t loose_objects_subdir_bitmap[8]; > > > > Is it worth the complexity of having

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread René Scharfe
Am 18.06.2017 um 13:49 schrieb Jeff King: On Sun, Jun 18, 2017 at 12:58:37PM +0200, René Scharfe wrote: An oid_array spends ca. 30 bytes per entry (20 bytes for a hash times a factor of 1.5 from alloc_nr). How many loose objects do we have to expect? 30 MB for a million of them sounds not too

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread Jeff King
On Sun, Jun 18, 2017 at 12:58:49PM +0200, René Scharfe wrote: > Anyway, here's a patch for stat-based invalidation, on top of the other > one. Array removal is really slow (hope I didn't sneak a bug in there, > but my confidence in this code isn't very high). No locking is done; > parallel threa

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread Jeff King
On Sun, Jun 18, 2017 at 12:58:37PM +0200, René Scharfe wrote: > An oid_array spends ca. 30 bytes per entry (20 bytes for a hash times > a factor of 1.5 from alloc_nr). How many loose objects do we have to > expect? 30 MB for a million of them sounds not too bad, but can there > realistically be

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread René Scharfe
Am 15.06.2017 um 15:25 schrieb Jeff King: > On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote: >> Can we trust object directory time stamps for cache invalidation? > > I think it would work on POSIX-ish systems, since loose object changes > always involve new files, not modifications of

Re: [BUG] add_again() off-by-one error in custom format

2017-06-18 Thread René Scharfe
Am 15.06.2017 um 15:25 schrieb Jeff King: > On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote: >>> I'm not really sure how, though, short of caching the directory >>> contents. That opens up questions of whether and when to invalidate the >>> cache. If the cache were _just_ about short h

Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread Junio C Hamano
René Scharfe writes: > Am 13.06.2017 um 23:20 schrieb Junio C Hamano: > >> I think the real question is how likely people use more than one >> occurrence of the same thing in their custom format, and how deeply >> they care that --format='%h %h' costs more than --format='%h'. The >> cost won't o

Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread Jeff King
On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote: > > The difference is that in the "before" case, we actually opened each > > directory and ran getdents(). But after gc, the directories are gone > > totally and open() fails. We also have to do a linear walk through the > > objects in

Re: [BUG] add_again() off-by-one error in custom format

2017-06-15 Thread René Scharfe
Am 15.06.2017 um 07:56 schrieb Jeff King: One interesting thing is that the cost of finding short hashes very much depends on your loose object setup. I timed: git log --format=%H >/dev/null versus git log --format=%h >/dev/null on git.git. It went from about 400ms to about 800ms. But t

Re: [BUG] add_again() off-by-one error in custom format

2017-06-14 Thread Jeff King
On Wed, Jun 14, 2017 at 08:24:25PM +0200, René Scharfe wrote: > > I think the real question is how likely people use more than one > > occurrence of the same thing in their custom format, and how deeply > > they care that --format='%h %h' costs more than --format='%h'. The > > cost won't of cours

Re: [BUG] add_again() off-by-one error in custom format

2017-06-14 Thread René Scharfe
Am 13.06.2017 um 23:20 schrieb Junio C Hamano: > René Scharfe writes: > >> The difference is about the same as the one between: >> >> $ time git log --format="" >/dev/null >> >> real0m0.463s >> user0m0.448s >> sys 0m0.012s >> >> and: >> >> $ time git log --for

Re: [BUG] add_again() off-by-one error in custom format

2017-06-14 Thread René Scharfe
Am 14.06.2017 um 00:24 schrieb SZEDER Gábor: [sorry for double post, forgot the mailing list...] To throw in a fourth option, this one adjusts the expansions' cached offsets when the magic makes it necessary. It's not necessary for '%-', because it only makes a difference when the expansion is

Re: [BUG] add_again() off-by-one error in custom format

2017-06-13 Thread SZEDER Gábor
[sorry for double post, forgot the mailing list...] To throw in a fourth option, this one adjusts the expansions' cached offsets when the magic makes it necessary. It's not necessary for '%-', because it only makes a difference when the expansion is empty, and in that case - add_again() doesn'

Re: [BUG] add_again() off-by-one error in custom format

2017-06-13 Thread Junio C Hamano
René Scharfe writes: > The difference is about the same as the one between: > > $ time git log --format="" >/dev/null > > real0m0.463s > user0m0.448s > sys 0m0.012s > > and: > > $ time git log --format="%h" >/dev/null > > real0m1.062s > us

Re: [BUG] add_again() off-by-one error in custom format

2017-06-13 Thread René Scharfe
Am 13.06.2017 um 20:29 schrieb Junio C Hamano: René Scharfe writes: Indeed, a very nice bug report! I think the call to format_commit_one() needs to be taught to pass 'magic' through, and then add_again() mechanism needs to be told not to cache when magic is in effect, or something like that

Re: [BUG] add_again() off-by-one error in custom format

2017-06-13 Thread Junio C Hamano
René Scharfe writes: > Indeed, a very nice bug report! > >> I think the call to format_commit_one() needs to be taught to pass >> 'magic' through, and then add_again() mechanism needs to be told not >> to cache when magic is in effect, or something like that. >> >> Perhaps something along this li

Re: [BUG] add_again() off-by-one error in custom format

2017-06-13 Thread René Scharfe
Am 13.06.2017 um 00:49 schrieb Junio C Hamano: Michael Giuffrida writes: For the abbreviated commit hash placeholder ('h'), pretty.c uses add_again() to cache the result of the expansion, and then uses that result in future expansions. This causes problems when the expansion includes whitespac

Re: [BUG] add_again() off-by-one error in custom format

2017-06-12 Thread Junio C Hamano
Michael Giuffrida writes: > For the abbreviated commit hash placeholder ('h'), pretty.c uses > add_again() to cache the result of the expansion, and then uses that > result in future expansions. This causes problems when the expansion > includes whitespace: > > $ git log -n 1 --pretty='format

[BUG] add_again() off-by-one error in custom format

2017-06-11 Thread Michael Giuffrida
In a custom pretty format, using the '+' or ' ' combinators to prefix a non-empty expansion with whitespace will erroneously truncate subsequent expansions of the same type. Normally '%+X' inserts a newline before , IFF the expansion of X is non-empty: $ git log -n 1 --pretty='format:newline: