http://lwn.net/Articles/147782/

A realtime preemption overview

Posted Aug 16, 2005 4:29 UTC (Tue) by mingo (subscriber, #31122)
In reply to: A realtime preemption overview by balbir
Parent article: A realtime preemption overview

Re #1, isn't SPL() disabling process/process preemption? That makes the
concept unsuitable for the purposes of PREEMPT_RT. SPL() is also a
pretty 'opaque' serialization method, only a little better than the 'Big
Kernel Lock' that Linux has now finally gotten rid of. Thirdly, it only
has a limited number of (32? 64?) "priority levels" available, while
Linux has thousands of separate types of critical sections. Fourthly,
isn't SPL() nested? E.g. blocking up to level 5 means all execution
covered by levels 0,1,2,3,4 is blocked - while with a spinlock you
will only block access to the data structure affected. Such artificial
nesting is pretty bad if you want to avoid deadlocks and want to have
good SMP scalability. The natural expression of locking hierarchies is
not a flat "priority space" as with SPL(), but more like a forest
of trees of independent entities, where we want to maintain as much
independence as possible.

Re #2, there are over 5000 uses of the spin-lock APIs in the Linux
kernel; renaming them just to show that they might not spin anymore is
not really worth the trouble (and the huge intrusion!) at this point.

Re #3, yes, priority inheritance is pretty important when an RT task
wants to make use of kernel services.

Re #4, what precisely do you mean by "interrupt context" and "process
context" in this particular case? The current situation is the
following:

In the stock kernel there are 3 basic types of contexts: there is
"interrupt context" (non-preemptible), "soft interrupt context"
(non-preemptible) and "process context" (preemptible, unless executing
in one of the many types of critical sections such as spinlocks).

In the PREEMPT_RT kernel there are 4 essential types of contexts: "hard
interrupt context", "interrupt context", "soft interrupt context" and
"process context". The hard interrupt context is an extremely small
shim in essence - a few tens of lines total, per arch - it just deals
with the interrupt controller, masks the IRQ line, acks the controller
and returns. The "interrupt context" is a separate per-IRQ interrupt
thread, which behaves like a process and is fully preemptible. "Soft
interrupt context" is a separate per-softirq system-thread too, fully
preemptible. "Process context" is what it used to be, and fully
preemptible too. ['fully preemptible' means it is preemptible in
essence everywhere except the scheduler code and the basic RT-mutex/PI
code.]

Considering the above description, your comment about "the lesser we
run in interrupt context, the better" is indeed correct: in PREEMPT_RT
the hardirq context execution time and complexity have been reduced to
an absolute physical minimum, which is a fundamentally good and
important thing for achieving determinism. Everything else is a
"thread", as far as
the scheduler is concerned, and is as preemptible as possible. You can
then use individual thread priorities to make some interrupts more
important than others.

There is (inevitably) some scheduling overhead due to having more
contexts; I've measured it to be 3-5% in the worst case [80 thousand
irqs/sec], and near zero for the common case [a couple of thousand
irqs/sec], which is pretty good.

Note that Linux has specific scheduler optimizations that make the
introduction and use of system threads cheaper: e.g. the 'lazy-TLB'
optimization will skip TLB flushes when switching between system
threads, by letting system threads 'inherit' the TLB context of the
previous user-process. Thus we might not need to do any TLB flushing if
we switch back to the same user-process - and we don't have to do any
TLB flushing if we switch between system threads. So in the TLB
flushing sense, system threads are completely transparent and do not
increase the number of TLB flushes.
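
The lazy-TLB accounting described above can be illustrated with a toy
model. This is a minimal Python sketch, not kernel code: the function name
count_tlb_flushes and the idea of writing a schedule as a list of
address-space ids are assumptions made for illustration, and the model
simply counts one "flush" for every switch to a different user address
space (real hardware with tagged TLBs behaves differently).

```python
from typing import Optional, Sequence


def count_tlb_flushes(schedule: Sequence[Optional[str]]) -> int:
    """Count address-space switches (modelled as TLB flushes).

    Each entry names the user address space of the task that runs next.
    None stands for a system thread, which has no user address space of
    its own (hypothetical encoding, chosen for this sketch).
    """
    loaded: Optional[str] = None  # address space currently loaded
    flushes = 0
    for mm in schedule:
        if mm is None:
            # Lazy TLB: the system thread borrows whatever is loaded,
            # so nothing is switched and nothing is flushed.
            continue
        if mm != loaded:
            # A different user address space has to be loaded.
            flushes += 1
            loaded = mm
    return flushes


if __name__ == "__main__":
    user_only = ["A", "A", "B", "A"]
    # Same user tasks, with system threads (None) interleaved.
    with_threads = ["A", None, "A", None, None, "B", None, "A"]
    print(count_tlb_flushes(user_only))
    print(count_tlb_flushes(with_threads))
```

In this model both schedules give the same count: interleaving system
threads between user processes does not add any flushes, which is the
sense in which they are "transparent".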

A realtime preemption overview

Posted Aug 16, 2005 4:49 UTC (Tue) by balbir (guest, #19399)

Thanks for answering all the questions.
#1. Pre-empting critical sections sounds like an oxymoron, if the
sections are critical, why pre-empt them? Just kidding, I like the idea
of a true priority-based pre-emptive scheduler. I agree that SPL() will
disable pre-emption and is opaque. If you want to disable IRQ
pre-emption by other tasks (if there is any) then it works well, but it
has all the limitations you mentioned.

#2. I meant, let's create an alias and encourage people to use the new
name instead of spinlock_t, just as raw_spinlock_t is an alias for the
older spinlock_t. Let's still have spinlock_t and a newer name for it.

#4. Your numbers look very good, and the optimizations seem good as
well. What scares me now is that correct assignment of task priority
will be critical to programming Linux drivers/kernel components in the
future. Is this understanding correct? Are there any guidelines that
you follow?

I will search for your patch and read the code to understand it better.

A realtime preemption overview

Posted Aug 16, 2005 5:08 UTC (Tue) by mingo (subscriber, #31122)

#1 there's no requirement that critical sections must never be
preempted - in fact we do "preempt" most spinlock sections with
interrupt contexts in the stock kernel.
A critical section is "critical" only because the data structure
affected must be updated transactionally (fully or not at all). How
that is achieved, and whether certain types of contexts may or may not
execute during such critical sections is not specified. But I think we
mostly agree here.

#2 an alias will only cause confusion and most of the code won't be
changed. A wholesale namespace cleanup might eventually be done, but it
is not practical right now.
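
On point #1, the "transactional update" meaning of a critical section can
be sketched in user space. This is only an illustration in Python, not
kernel code: the Pair class and the run() helper are hypothetical names,
and a threading.Lock stands in for whatever lock protects the data
structure. The thread holding the lock may well be descheduled in the
middle of the update; that is harmless, because any other thread touching
the same data must take the same lock first.

```python
import threading


class Pair:
    """Two counters whose sum must stay constant (the invariant)."""

    def __init__(self, a: int, b: int) -> None:
        self.a = a
        self.b = b
        self.lock = threading.Lock()  # protects a and b, nothing else

    def move(self, n: int) -> None:
        # Critical section: observers that take the lock see either
        # both fields updated or neither.
        with self.lock:
            self.a -= n
            self.b += n

    def total(self) -> int:
        with self.lock:
            return self.a + self.b


def run(workers: int = 4, moves: int = 1000) -> int:
    """Move units from a to b from several threads; return the sum."""
    pair = Pair(10_000, 0)

    def worker() -> None:
        for _ in range(moves):
            pair.move(1)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pair.total()


if __name__ == "__main__":
    print(run())
```

The sum is unchanged however the threads interleave, even though nothing
prevents a thread from being switched out while it holds the lock.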
