Hello,

This really looks like what we want :)
Agustina Arzille, on Fri 04 Mar 2016 11:32:10 -0300, wrote:

> +#define GSYNC_NBUCKETS   512
> +static struct gsync_hbucket gsync_buckets[GSYNC_NBUCKETS];

I'm wondering whether we could want to have buckets per task too? What
does Linux do for private futexes? Is 512 enough? How much does Linux
use? (Not saying that we need this for committing; rather for future
work.)

> +static int
> +valid_access_p (vm_map_t map, vm_offset_t addr,
> +  vm_offset_t size, vm_prot_t prot)
> +{
> +  vm_map_entry_t entry;
> +  return (vm_map_lookup_entry (map, addr, &entry) &&
> +    entry->vme_end >= addr + size &&
> +    (prot & entry->protection) == prot);
> +}

Mmm, userland could be running gsync_wait in one thread and munmap in
another thread, and make the kernel crash because munmap() took effect
between the valid_access_p check and the actual dereference in
gsync_wait. I believe you need a vm_map_lock_read(current_map()) /
vm_map_unlock_read(current_map()) pair around the valid_access_p check
plus the actual access (and the same for wake and requeue).

> +  /* XXX: These codes aren't really descriptive, but it's
> +   * the best I can think of right now. */
> +  ret = ret == THREAD_INTERRUPTED ?
> +    KERN_ABORTED : KERN_TERMINATED;

I'd say invent new KERN_* values: neither ABORTED nor TERMINATED really
matches the two cases happening here.

> +  simple_lock (&hbp->lock);
> +
> +  if (flags & GSYNC_MUTATE)
> +    __atomic_store_n ((unsigned int *)addr,
> +      val, __ATOMIC_RELEASE);

Why use an atomic operation? AIUI, simple_lock already provides
atomicity. To really get proper semantics, you'd actually need atomic
accesses on the reader side too.

> +   * Setting the amount to UINT_MAX
> +   * should do the trick until we can manage a ridiculously
> +   * large amount of waiters. */
> +  unsigned int nw = (flags & GSYNC_BROADCAST) ? ~0U : 1;
[...]
> +  while (--nw != 0 && !list_end (&hbp->entries, runp) &&
> +    gsync_key_eq (&node_to_waiter(runp)->key, &w.key));

It doesn't seem too costly to me to do instead

> +  while ((flags & GSYNC_BROADCAST || --nw != 0) &&
> +    !list_end (&hbp->entries, runp) &&

which avoids introducing the not-so-happy trick altogether (and the
same for requeue).

> +  else if ((unsigned long)bp1 < (unsigned long)bp2)
> +    {
> +      simple_unlock (&bp2->lock);
> +      simple_unlock (&bp1->lock);
> +    }
> +  else
> +    {
> +      simple_unlock (&bp1->lock);
> +      simple_unlock (&bp2->lock);
> +    }

There's no real need to order the unlocking, but why not :)

Thanks for your work!

BTW, how is your copyright assignment going?

Samuel