On Sun, Jun 18, 2017 at 02:59:04PM +0200, René Scharfe wrote:
> > > @@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
> > > struct strbuf scratch;
> > > size_t base_len;
> > > + uint32_t loose_objects_subdir_bitmap[8];
> >
> > Is it worth the complexity of having an actual bitmap and not just an
> > array of char? I guess it's not _that_ complex to access the bits, but
> > there are a lot of magic numbers involved.
>
> That would be 224 bytes more per alternate_object_database, and we'd
> gain simpler code. Hmm. We could add some bitmap helper macros, but
> you're probably right that the first version should use the simplest
> form for representing small bitmaps that we currently have.
I'd be OK with keeping it if we could reduce the number of magic
numbers. E.g,. rather than 32 elsewhere use:
(sizeof(*loose_objects_subdir_bitmap) * CHAR_BIT)
and similarly rather than 8 here use
256 / sizeof(*loose_objects_subdir_bitmap) / CHAR_BIT
There's also already a bitmap data structure in ewah/bitmap.c. It's a
little bit overkill, though, because it mallocs and will grow the bitmap
as needed.
> > We should probably insert a comment here warning that the cache may go
> > out of date during the process run, and should only be used when that's
> > an acceptable tradeoff.
>
> Then we need to offer a way for opting out. Through a global variable?
> (I'd rather reduce their number, but don't see how allow programs to
> nicely toggle this cache otherwise.)
Sorry, I didn't mean disabling it for a particular run of a program. I
just meant that we know this is an OK tradeoff for short-sha1 lookups,
but has_sha1_file(), for example, would never want to use it. So we are
warning programmers from using it in more code, not giving users a way
to turn it off run-to-run.
> > > +static void read_loose_object_subdir(struct alternate_object_database
> > > *alt,
> > > + int subdir_nr)
> >
> > I think it's nice to pull this out into a helper function. I do wonder
> > if it should just be reusing for_each_loose_file_in_objdir(). You'd just
> > need to expose the in_obj_subdir() variant publicly.
> >
> > It does do slightly more than we need (for the callbacks it actually
> > builds the filename), but I doubt memcpy()ing a few bytes would be
> > measurable.
>
> Good point. The function also copies the common first two hex digits
> for each entry. But all that extra work is certainly dwarfed by the
> readdir calls.
Yes. You're welcome to micro-optimize that implementation if you want. ;)
-Peff