On Wed, Jan 30, 2019 at 01:34:19PM -0800, Alexei Starovoitov wrote:
> On Wed, Jan 30, 2019 at 10:05:29PM +0100, Peter Zijlstra wrote:
> >
> > Would something like the below work for you instead?
> >
> > I find it easier to read, and the additional CONFIG symbol would give
> > architectures (say ARM) an easy way to force the issue.
> >
> >
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -221,6 +221,72 @@ const struct bpf_func_proto bpf_get_curr
> > .arg2_type = ARG_CONST_SIZE,
> > };
> >
> > +#if defined(CONFIG_QUEUED_SPINLOCKS) || defined(CONFIG_BPF_ARCH_SPINLOCK)
> > +
> > +static inline void __bpf_spin_lock(struct bpf_spin_lock *lock)
> > +{
> > + arch_spinlock_t *l = (void *)lock;
> > + BUILD_BUG_ON(sizeof(*l) != sizeof(__u32));
> > + if (1) {
> > + union {
> > + __u32 val;
> > + arch_spinlock_t lock;
> > + } u = { .lock = __ARCH_SPIN_LOCK_UNLOCKED };
> > + compiletime_assert(u.val == 0, "__ARCH_SPIN_LOCK_UNLOCKED not 0");
> > + }
> > + arch_spin_lock(l);
>
> And archs can select CONFIG_BPF_ARCH_SPINLOCK when they don't
> use qspinlock and their arch_spinlock_t is compatible ?
> Nice. I like the idea!
Exactly, took me a little while to come up with that test for
__ARCH_SPIN_LOCK_UNLOCKED, but it now checks for both assumptions, so no
surprises when people get it wrong by accident.
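
For readers following along outside the kernel tree, here is a minimal
userspace sketch of the same kind of check. It is not the patch itself:
demo_spinlock_t and DEMO_SPIN_LOCK_UNLOCKED are hypothetical stand-ins for
arch_spinlock_t and __ARCH_SPIN_LOCK_UNLOCKED, and I am assuming the two
assumptions being tested are "the lock is exactly 32 bits" and "the unlocked
value is all zeroes". Plain C cannot evaluate the union read in a static
assertion, so the second check is an ordinary function here, whereas the
kernel's compiletime_assert() relies on the optimizer folding it away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an architecture's arch_spinlock_t. */
typedef struct {
	uint16_t owner;
	uint16_t next;
} demo_spinlock_t;

/* Hypothetical stand-in for __ARCH_SPIN_LOCK_UNLOCKED. */
#define DEMO_SPIN_LOCK_UNLOCKED { 0, 0 }

/* Assumption 1: the lock occupies exactly one 32-bit word. */
_Static_assert(sizeof(demo_spinlock_t) == sizeof(uint32_t),
	       "lock type is not 32 bits wide");

/*
 * Assumption 2: the unlocked initializer is the all-zero bit pattern.
 * Initialize the union through the lock member, then read the same
 * storage back as a 32-bit integer.
 */
static inline bool demo_unlocked_is_zero(void)
{
	union {
		uint32_t val;
		demo_spinlock_t lock;
	} u = { .lock = DEMO_SPIN_LOCK_UNLOCKED };

	return u.val == 0;
}
```

An initializer such as { 1, 0 } would make demo_unlocked_is_zero() return
false, which is the "got it wrong by accident" case the check is meant to
catch.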
> > +}
> > +
> > +static inline void __bpf_spin_unlock(struct bpf_spin_lock *lock)
> > +{
> > + arch_spinlock_t *l = (void *)lock;
> > + arch_spin_unlock(l);
> > +}
> > +
> > +#else
> > +
> > +static inline void __bpf_spin_lock(struct bpf_spin_lock *lock)
> > +{
> > + atomic_t *l = (void *)lock;
> > + do {
> > + atomic_cond_read_relaxed(l, !VAL);
>
> wow. that's quite a macro magic.
Yeah, C sucks for not having lambdas; this was the best we could come up
with.
This basically allows architectures to optimize the
wait-for-variable-to-change thing. Currently only ARM64 does that. I
have a horrible horrible patch that makes x86 use MONITOR/MWAIT for
this, and I suppose POWER should use it but doesn't.
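
To make the VAL trick concrete, here is a minimal userspace sketch of the
idea, not the kernel implementation. The names demo_cond_load_relaxed,
demo_spin_lock and demo_spin_unlock are hypothetical. It assumes GCC or
Clang (statement expressions and __typeof__) plus C11 atomics, and it simply
busy-polls at the point where an architecture could wait in hardware
instead. Taking the lock with an exchange after the wait is my assumption
for the sketch as well.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Spin until cond_expr becomes true. The expression is pasted into the
 * loop body, so it can refer to VAL, the most recently loaded value.
 * The macro evaluates to the value that satisfied the condition.
 */
#define demo_cond_load_relaxed(ptr, cond_expr) ({			\
	__typeof__(ptr) p_ = (ptr);					\
	int VAL;							\
	for (;;) {							\
		VAL = atomic_load_explicit(p_, memory_order_relaxed);	\
		if (cond_expr)						\
			break;						\
	}								\
	VAL;								\
})

/* Test-and-test-and-set lock: wait until it looks free, then try to grab it. */
static inline void demo_spin_lock(atomic_int *l)
{
	do {
		(void)demo_cond_load_relaxed(l, !VAL);
	} while (atomic_exchange_explicit(l, 1, memory_order_acquire));
}

static inline void demo_spin_unlock(atomic_int *l)
{
	atomic_store_explicit(l, 0, memory_order_release);
}
```

The caller writes the condition as an ordinary expression in terms of VAL,
which is about as close to passing a lambda as the preprocessor gets.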
> Should it be
> atomic_cond_read_relaxed(l, (!VAL));
> like qspinlock.c does ?
Extra parens don't hurt of course, but I don't think they're strictly
needed; the atomic_cond_read_*() wrappers already add extra parens
before passing it on to smp_cond_load_*().