Re: [PATCH] object: measure time needed for resolving hash collisions

Jeff King Thu, 15 Sep 2016 00:09:07 -0700

On Wed, Sep 14, 2016 at 11:47:01PM -0700, Jeff King wrote:

> >     first = i = hash_obj(sha1, obj_hash_size);
> > +   clock_gettime(CLOCK_MONOTONIC, &time1);
> >     while ((obj = obj_hash[i]) != NULL) {
> >             if (!hashcmp(sha1, obj->oid.hash))
> >                     break;
> > @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1)
> >             if (i == obj_hash_size)
> >                     i = 0;
> >     }
> > +   clock_gettime(CLOCK_MONOTONIC, &time2);
> > +   diff(&time1, &time2, &t_diff);
> > +   add_time_to(&caching, &t_diff);
> >     if (obj && i != first) {
> 
> I don't think this is actually measuring the time spent on collisions.
> It's measuring the time we spend in hashcmp(), but that includes the
> non-collision case where we find it in the first hashcmp.
> 
> Measuring _just_ the collisions is more like the patch below. In my
> measurements it's more like 30ms, compared to 10s for all of the
> hashcmps.


I forgot to send the patch. Which is just as well, because I realized it
was totally buggy.

Here's a patch that I believe is correct (it counts only times when we
move past the first hash slot). It spends about 280ms. Which is still a
lot less than 10s, so I think my other comments stand.

---
diff --git a/object.c b/object.c
index e9e73e0..7a74a1d 100644
--- a/object.c
+++ b/object.c
@@ -123,17 +123,20 @@ struct object *lookup_object(const unsigned char *sha1)
                return NULL;
 
        first = i = hash_obj(sha1, obj_hash_size);
-       clock_gettime(CLOCK_MONOTONIC, &time1);
        while ((obj = obj_hash[i]) != NULL) {
                if (!hashcmp(sha1, obj->oid.hash))
                        break;
+               if (first == i)
+                       clock_gettime(CLOCK_MONOTONIC, &time1);
                i++;
                if (i == obj_hash_size)
                        i = 0;
        }
-       clock_gettime(CLOCK_MONOTONIC, &time2);
-       diff(&time1, &time2, &t_diff);
-       add_time_to(&caching, &t_diff);
+       if (i != first) {
+               clock_gettime(CLOCK_MONOTONIC, &time2);
+               diff(&time1, &time2, &t_diff);
+               add_time_to(&caching, &t_diff);
+       }
        if (obj && i != first) {
                /*
                 * Move object to where we started to look for it so

Re: [PATCH] object: measure time needed for resolving hash collisions

Reply via email to