On Fri, 25 Sep 2020 00:03:14 +0200 Daniel Borkmann wrote:
> static inline u64 gen_cookie_next(struct gen_cookie *gc)
> {
> 	u64 val;
>
> 	if (likely(this_cpu_inc_return(*gc->level_nesting) == 1)) {

Is this_cpu_inc() in itself atomic?
Is there a comparison of performance of various atomic ops and locking
somewhere? I wonder how this scheme would compare to using a cmpxchg.

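To make the cmpxchg question concrete, here is a rough userspace sketch
of the kind of scheme I have in mind, written with C11 atomics rather
than kernel primitives. The names cookie_slot and cookie_next_cmpxchg
are made up, a plain struct stands in for the per-CPU variable,
COOKIE_LOCAL_BATCH is assumed to be 4096, and the skip-zero handling
from the patch is left out. It is only meant as a basis for comparison,
not as a proposal for the actual implementation:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed batch size; must be a power of two for the mask below. */
#define COOKIE_LOCAL_BATCH 4096ULL

/* Hypothetical stand-in for the per-CPU local_last variable. */
struct cookie_slot {
	_Atomic uint64_t local_last;
};

/* Shared counter from which whole batches are reserved. */
static _Atomic uint64_t shared_last;

static uint64_t cookie_next_cmpxchg(struct cookie_slot *slot)
{
	uint64_t old, val;

	old = atomic_load(&slot->local_last);
	do {
		val = old;
		/* Local batch used up (or first call): reserve a new one. */
		if ((val & (COOKIE_LOCAL_BATCH - 1)) == 0)
			val = atomic_fetch_add(&shared_last,
					       COOKIE_LOCAL_BATCH);
		val++;
		/*
		 * If a nested caller changed local_last in the meantime,
		 * the exchange fails, old is reloaded and we retry.
		 */
	} while (!atomic_compare_exchange_strong(&slot->local_last,
						 &old, val));

	return val;
}
```

In this sketch the nesting counter goes away, because a nested caller
that updates local_last just makes the outer compare-and-exchange fail
and retry. The cost is that a retry right after a refill throws away a
whole batch, and I have no numbers on how the two approaches compare.
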
> 		u64 *local_last = this_cpu_ptr(gc->local_last);
>
> 		val = *local_last;
> 		if (__is_defined(CONFIG_SMP) &&
> 		    unlikely((val & (COOKIE_LOCAL_BATCH - 1)) == 0)) {

Can we reasonably assume we won't have more than 4k CPUs and just
statically divide this space by encoding the CPU id in the top bits?

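Roughly what I mean, as a userspace sketch rather than kernel code. All
names here (COOKIE_CPU_BITS, cookie_seq, cookie_next_static) are made
up for illustration, the 12-bit CPU field just encodes the 4k
assumption, and a plain array stands in for per-CPU data:

```c
#include <assert.h>
#include <stdint.h>

#define COOKIE_CPU_BITS	12	/* assumed cap of 4096 CPUs */
#define COOKIE_MAX_CPUS	(1u << COOKIE_CPU_BITS)
#define COOKIE_SEQ_BITS	(64 - COOKIE_CPU_BITS)
#define COOKIE_SEQ_MASK	((UINT64_C(1) << COOKIE_SEQ_BITS) - 1)

/* One sequence counter per CPU; per-CPU data in a real kernel. */
static uint64_t cookie_seq[COOKIE_MAX_CPUS];

static uint64_t cookie_next_static(unsigned int cpu)
{
	uint64_t seq;

	assert(cpu < COOKIE_MAX_CPUS);

	/* Pre-increment so CPU 0 never hands out cookie 0 at start. */
	seq = ++cookie_seq[cpu] & COOKIE_SEQ_MASK;

	/* CPU id in the top 12 bits, sequence in the low 52 bits. */
	return ((uint64_t)cpu << COOKIE_SEQ_BITS) | seq;
}
```

There is no shared counter at all in this variant, so nothing is
contended across CPUs. The sketch ignores nesting on the same CPU and
the wrap of the 52-bit sequence, both of which a real version would
still have to handle.
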
> 			s64 next = atomic64_add_return(COOKIE_LOCAL_BATCH,
> 						       &gc->shared_last);
> 			val = next - COOKIE_LOCAL_BATCH;
> 		}
> 		val++;
> 		if (unlikely(!val))
> 			val++;
> 		*local_last = val;
> 	} else {
> 		val = atomic64_add_return(COOKIE_LOCAL_BATCH,
> 					  &gc->shared_last);
> 	}
> 	this_cpu_dec(*gc->level_nesting);
> 	return val;
> }