On Wed, Sep 14, 2016 at 11:47:01PM -0700, Jeff King wrote:
> > first = i = hash_obj(sha1, obj_hash_size);
> > + clock_gettime(CLOCK_MONOTONIC, &time1);
> > while ((obj = obj_hash[i]) != NULL) {
> > if (!hashcmp(sha1, obj->oid.hash))
> > break;
> > @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1)
> > if (i == obj_hash_size)
> > i = 0;
> > }
> > + clock_gettime(CLOCK_MONOTONIC, &time2);
> > + diff(&time1, &time2, &t_diff);
> > + add_time_to(&caching, &t_diff);
> > if (obj && i != first) {
>
> I don't think this is actually measuring the time spent on collisions.
> It's measuring the time we spend in hashcmp(), but that includes the
> non-collision case where we find it in the first hashcmp.
>
> Measuring _just_ the collisions is more like the patch below. In my
> measurements it's more like 30ms, compared to 10s for all of the
> hashcmps.
I forgot to send the patch. Which is just as well, because I realized it
was totally buggy.
Here's a patch that I believe is correct (it counts only times when we
move past the first hash slot). It spends about 280ms. Which is still a
lot less than 10s, so I think my other comments stand.
---
diff --git a/object.c b/object.c
index e9e73e0..7a74a1d 100644
--- a/object.c
+++ b/object.c
@@ -123,17 +123,20 @@ struct object *lookup_object(const unsigned char *sha1)
return NULL;
first = i = hash_obj(sha1, obj_hash_size);
- clock_gettime(CLOCK_MONOTONIC, &time1);
while ((obj = obj_hash[i]) != NULL) {
if (!hashcmp(sha1, obj->oid.hash))
break;
+ if (first == i)
+ clock_gettime(CLOCK_MONOTONIC, &time1);
i++;
if (i == obj_hash_size)
i = 0;
}
- clock_gettime(CLOCK_MONOTONIC, &time2);
- diff(&time1, &time2, &t_diff);
- add_time_to(&caching, &t_diff);
+ if (i != first) {
+ clock_gettime(CLOCK_MONOTONIC, &time2);
+ diff(&time1, &time2, &t_diff);
+ add_time_to(&caching, &t_diff);
+ }
if (obj && i != first) {
/*
* Move object to where we started to look for it so