Here's another iteration of the pack bitmaps series. Compared to v2, it
changes:
- misc style/typo fixes
- portability fixes from Ramsay and Torsten
- count-objects garbage-reporting patch from Duy
- disable bitmaps when is_repository_shallow(); this also covers the
case where the client is shallow, since we feed pack-objects a
--shallow-file in that case. This used to be done by checking
!internal_rev_list, but that doesn't apply after cdab485.
- ewah sources now properly use git-compat-util.h and do not include
system headers
- the ewah code uses ewah_malloc, ewah_realloc, and so forth to let the
project use a particular allocator (and we want to use xmalloc and
friends). And we defined those in pack-bitmap.h, but of course that
had no effect on the ewah/*.c files that did not include
pack-bitmap.h. Since we are hacking up and git-ifying libewok
anyway, we can just set the hardcoded fallback to xmalloc instead of
malloc.
- the ewah code used gcc's __builtin_ctzll, but did not provide a
suitable fallback. We now provide a fallback in C.
- The bitmap reading code only handles a single bitmapped pack (since
they must be fully closed, there is not much point in having
multiple). It used to silently ignore extra bitmap indices it found,
but will now warn that they are being ignored.
- The name-hash cache is now optional, controlled by
pack.writeBitmapHashCache.
- The test script will now do basic interoperability testing with jgit
(if you have jgit in your $PATH).
- There are now perf tests. Spoiler alert: bitmaps make clones faster.
See patch 20 for details. We can also measure the speedup from the
hash cache (see patch 21).
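
As a rough illustration of the __builtin_ctzll item above, a portable
fallback could look something like the sketch below. The function names
are made up for this example and are not the ones used in the series.
Returning 64 for a zero input is also just a choice made here, since the
builtin leaves that case undefined.

```c
#include <stdint.h>

/*
 * Hypothetical name, for illustration only.
 * Portable count-trailing-zeros for a 64-bit word: halve the
 * search window at each step instead of looping bit by bit.
 */
static int sketch_ctz64_fallback(uint64_t x)
{
	int n = 0;

	if (!x)
		return 64; /* assumption: the builtin is undefined for 0 */
	if (!(x & 0xffffffffULL)) { n += 32; x >>= 32; }
	if (!(x & 0xffffULL)) { n += 16; x >>= 16; }
	if (!(x & 0xffULL)) { n += 8; x >>= 8; }
	if (!(x & 0xfULL)) { n += 4; x >>= 4; }
	if (!(x & 0x3ULL)) { n += 2; x >>= 2; }
	if (!(x & 0x1ULL)) { n += 1; }
	return n;
}

/* Hypothetical wrapper: use the builtin where the compiler has it. */
static int sketch_ctz64(uint64_t x)
{
#if defined(__GNUC__)
	return x ? __builtin_ctzll(x) : 64;
#else
	return sketch_ctz64_fallback(x);
#endif
}
```

Both paths should agree on every non-zero input, so the fallback can be
checked against the builtin on compilers that provide it.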

Not addressed:
- I did not include the NEEDS_ALIGNED_ACCESS patch. I note that we do
not even have a Makefile knob for this, and the code in read-cache.c
has probably never actually been used. Are there real systems that
have a problem? The read-cache code was in support of the index v4
experiment, which did away with the 8-byte padding. So it could be
that we simply don't see it, because everything is currently
aligned.
- On a related note, we do some cast-buffer-to-struct magic on the
mmap'd file. I note that the regular packfile reader also does this.
How careful do we want to be?
- We still assume that reusing a slice from the front of the pack will
never miss delta bases. This is the case currently for packs
generated by both git and JGit, but it would be nice to mark the
property in the bitmap index. Adding a new flag would break JGit
compatibility, though. We can either make it an option, or assume
it's good enough for now and worry about it in v2.
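
To make the alignment question above concrete: one way to sidestep both
the unaligned-access and the cast-buffer-to-struct concerns is to read
fields byte by byte out of the mapped buffer. The sketch below assumes a
made-up two-field big-endian header; the struct, its fields, and the
helper names are hypothetical and do not describe the actual bitmap
index layout or the code in this series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a big-endian 32-bit value from any
 * byte offset, without requiring the pointer to be aligned.
 */
static uint32_t sketch_get_be32(const void *p)
{
	const unsigned char *b = p;

	return ((uint32_t)b[0] << 24) |
	       ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) |
	       (uint32_t)b[3];
}

/* Hypothetical in-memory form of an on-disk header. */
struct sketch_header {
	uint32_t version;
	uint32_t entry_count;
};

/*
 * Fill the struct field by field rather than casting the mapped
 * bytes to a struct pointer, so alignment of "map" does not matter.
 */
static void sketch_parse_header(const unsigned char *map,
				struct sketch_header *out)
{
	out->version = sketch_get_be32(map);
	out->entry_count = sketch_get_be32(map + 4);
}
```

The cost is an explicit parse step per field. Whether that is worth it
depends on the answer to "are there real systems that have a problem?"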

[01/21]: sha1write: make buffer const-correct
[02/21]: revindex: Export new APIs
[03/21]: pack-objects: Refactor the packing list
[04/21]: pack-objects: factor out name_hash
[05/21]: revision: allow setting custom limiter function
[06/21]: sha1_file: export `git_open_noatime`
[07/21]: compat: add endianness helpers
[08/21]: ewah: compressed bitmap implementation
[09/21]: documentation: add documentation for the bitmap format
[10/21]: pack-bitmap: add support for bitmap indexes
[11/21]: pack-objects: use bitmaps when packing objects
[12/21]: rev-list: add bitmap mode to speed up object lists
[13/21]: pack-objects: implement bitmap writing
[14/21]: repack: stop using magic number for ARRAY_SIZE(exts)
[15/21]: repack: turn exts array into array-of-struct
[16/21]: repack: handle optional files created by pack-objects
[17/21]: repack: consider bitmaps when performing repacks
[18/21]: count-objects: recognize .bitmap in garbage-checking
[19/21]: t: add basic bitmap functionality tests
[20/21]: t/perf: add tests for pack bitmaps
[21/21]: pack-bitmap: implement optional name_hash cache
-Peff