Introduce smp_cond_acquire(), which combines a control dependency and a
read barrier to form acquire semantics.

This primitive has two benefits:
 - it documents control dependencies,
 - it's typically cheaper than using smp_load_acquire() in a loop (a
   sketch follows below).
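
To illustrate the second point, here is a sketch (not part of the patch,
reusing the try_to_wake_up() site converted below) of the open-coded
acquire loop this replaces:

  while (smp_load_acquire(&p->on_cpu))
          cpu_relax();            /* full ACQUIRE on every iteration */

whereas the new primitive relies on the control dependency of the loop
exit and issues the read barrier only once, after the loop terminates:

  smp_cond_acquire(!p->on_cpu);   /* ctrl dependency + a single smp_rmb() */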

Note that while smp_cond_acquire() has an explicit
smp_read_barrier_depends() for Alpha, neither of the sites it gets used
in was actually buggy on Alpha for lack of it. The first uses
smp_rmb(), which on Alpha is a full barrier too and therefore serves
its purpose. The second had an explicit full barrier.
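
For reference, a minimal sketch of the intended RELEASE/ACQUIRE pairing,
in the spirit of the finish_lock_switch()/try_to_wake_up() pair touched
below; 'flag' and 'data' are made-up variables, not from the patch:

  /* writer: publish data, then set the flag with RELEASE semantics */
  WRITE_ONCE(data, 1);
  smp_store_release(&flag, 1);

  /* reader: spin until the flag is set, with ACQUIRE semantics */
  smp_cond_acquire(READ_ONCE(flag));
  BUG_ON(READ_ONCE(data) != 1);   /* the store to data is guaranteed visible */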

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
 include/linux/compiler.h |   18 ++++++++++++++++++
 kernel/sched/core.c      |    8 +-------
 kernel/task_work.c       |    4 ++--
 3 files changed, 21 insertions(+), 9 deletions(-)

--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -275,6 +275,24 @@ static __always_inline void __write_once
        __val; \
 })
 
+/**
+ * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering
+ * @cond: boolean expression to wait for
+ *
+ * Equivalent to using smp_load_acquire() on the condition variable but employs
+ * the control dependency of the wait to reduce the barrier on many platforms.
+ *
+ * The control dependency provides a LOAD->STORE order, the additional RMB
+ * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order,
+ * aka. ACQUIRE.
+ */
+#define smp_cond_acquire(cond) do {            \
+       while (!(cond))                         \
+               cpu_relax();                    \
+       smp_read_barrier_depends(); /* ctrl */  \
+       smp_rmb(); /* ctrl + rmb := acquire */  \
+} while (0)
+
 #endif /* __KERNEL__ */
 
 #endif /* __ASSEMBLY__ */
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2111,19 +2111,13 @@ try_to_wake_up(struct task_struct *p, un
        /*
         * If the owning (remote) cpu is still in the middle of schedule() with
         * this task as prev, wait until it's done referencing the task.
-        */
-       while (p->on_cpu)
-               cpu_relax();
-       /*
-        * Combined with the control dependency above, we have an effective
-        * smp_load_acquire() without the need for full barriers.
         *
         * Pairs with the smp_store_release() in finish_lock_switch().
         *
         * This ensures that tasks getting woken will be fully ordered against
         * their previous state and preserve Program Order.
         */
-       smp_rmb();
+       smp_cond_acquire(!p->on_cpu);
 
        p->sched_contributes_to_load = !!task_contributes_to_load(p);
        p->state = TASK_WAKING;
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -102,13 +102,13 @@ void task_work_run(void)
 
                if (!work)
                        break;
+
                /*
                 * Synchronize with task_work_cancel(). It can't remove
                 * the first entry == work, cmpxchg(task_works) should
                 * fail, but it can play with *work and other entries.
                 */
-               raw_spin_unlock_wait(&task->pi_lock);
-               smp_mb();
+               smp_cond_acquire(!raw_spin_is_locked(&task->pi_lock));
 
                do {
                        next = work->next;

