A realtime preemption overview

Posted Aug 16, 2005 4:29 UTC (Tue) by mingo (subscriber, #31122)
In reply to: A realtime preemption overview by balbir
Parent article: A realtime preemption overview

Re #1, isn't SPL() disabling process/process preemption? That makes the concept unsuitable for the purposes of PREEMPT_RT. SPL() is also a pretty 'opaque' serialization method, only a little better than the 'Big Kernel Lock' that Linux has now finally gotten rid of. Thirdly, it only has a limited number of (32? 64?) "priority levels" available, while Linux has thousands of separate types of critical sections. Fourthly, isn't SPL() nested? E.g. blocking up to level 5 means all execution covered by levels 0,1,2,3,4 is blocked - while with a spinlock you will only block access to the data structure affected. Such artificial nesting is pretty bad if you want to avoid deadlocks and want to have good SMP scalability. The natural expression of locking hierarchies is not a flat "priority space" as SPL() has, but more like a forest of trees of independent entities, where we want to maintain as much independence as possible.
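
[Illustration: a toy Python model of the nesting point above, not kernel code. The subsystem names, level numbers and lock names are made up; the only thing taken from the comment is that raising the level to N holds off everything covered by the levels below N, while a lock holds off only users of the same data structure.]

```python
# Toy model, not kernel code: compare what is held off by a nested
# priority level (SPL-style) with what is held off by a per-structure lock.

def blocked_by_spl(level, work_levels):
    """Work items held off while the level is raised to `level`.

    Follows the description in the comment: raising to level N blocks
    everything covered by the levels below N.
    """
    return sorted(name for name, lvl in work_levels.items() if lvl < level)


def blocked_by_lock(held_lock, work_locks):
    """Work items held off while `held_lock` is taken."""
    return sorted(name for name, lock in work_locks.items() if lock == held_lock)


# Hypothetical subsystems with made-up levels and lock names.
work_levels = {"tty": 1, "net": 3, "disk": 4, "clock": 6}
work_locks = {
    "tty": "tty_lock",
    "net": "net_lock",
    "disk": "disk_lock",
    "clock": "clock_lock",
}

spl_blocked = blocked_by_spl(5, work_levels)
lock_blocked = blocked_by_lock("disk_lock", work_locks)

print("raised to level 5 blocks:", spl_blocked)
print("holding disk_lock blocks:", lock_blocked)
```

In this model the raised level also holds off "tty" and "net", which have nothing to do with the data being protected, while the lock only holds off other users of "disk".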

Re #2, there are over 5000 uses of the spin-lock APIs in the Linux kernel. Renaming them just to show that they might not spin anymore is not really worth the trouble (and the huge intrusion!) at this point.

Re #3, yes, priority inheritance is pretty important when an RT task wants to make use of kernel services.
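
[Illustration: a toy Python model of why priority inheritance matters, not the RT-mutex/PI code. The task names, priority numbers and the `waits_on` field are hypothetical; a higher number means a more important task, and a waiter simply names the task that holds the lock it needs.]

```python
# Toy model, not kernel code: pick the next task to run with and
# without priority inheritance.

def effective_priority(task, tasks, inherit):
    """Priority a task runs at; higher number = more important.

    With inheritance, a lock holder is boosted to the highest priority
    among the tasks waiting on it.
    """
    prio = task["prio"]
    if inherit:
        for other in tasks:
            if other["waits_on"] == task["name"]:
                prio = max(prio, effective_priority(other, tasks, inherit))
    return prio


def pick_next(tasks, inherit):
    """Name of the runnable task with the highest effective priority."""
    runnable = [t for t in tasks if t["waits_on"] is None]
    best = max(runnable, key=lambda t: effective_priority(t, tasks, inherit))
    return best["name"]


tasks = [
    {"name": "low", "prio": 1, "waits_on": None},   # holds the lock
    {"name": "mid", "prio": 5, "waits_on": None},   # unrelated work
    {"name": "rt", "prio": 90, "waits_on": "low"},  # blocked on the lock
]

without_pi = pick_next(tasks, inherit=False)
with_pi = pick_next(tasks, inherit=True)

print("without inheritance, next to run:", without_pi)
print("with inheritance, next to run:", with_pi)
```

Without inheritance the unrelated "mid" task runs ahead of the lock holder, so the RT task waits on work that is less important than itself. With inheritance the holder is boosted and gets to release the lock first.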

Re #4, what precisely do you mean by "interrupt context" and "process context" in this particular case? The current situation is the following:

In the stock kernel there are 3 basic types of contexts: there is "interrupt context" (non-preemptible), "soft interrupt context" (non-preemptible) and "process context" (preemptible, unless executing in one of the many types of critical sections such as spinlocks).

In the PREEMPT_RT kernel there are 4 essential types of contexts: "hard interrupt context", "interrupt context", "soft interrupt context" and "process context". The hard interrupt context is an extremely small shim in essence - a few tens of lines total, per arch - it just deals with the interrupt controller, masks the IRQ line, acks the controller and returns. The "interrupt context" is a separate per-IRQ interrupt thread, which behaves like a process and is fully preemptible. "Soft interrupt context" is a separate per-softirq system-thread too, fully preemptible. "Process context" is what it used to be, and fully preemptible too. ['fully preemptible' means it's preemptible for in essence everything but the scheduler code and the basic RT-mutex/PI code]

Considering the above description, your comment about "the lesser we run in interrupt context, the better" is indeed correct: in PREEMPT_RT the hardirq context execution time and complexity have been reduced to an absolute physical minimum. That is a fundamentally good and important thing for achieving determinism. Everything else is a "thread", as far as the scheduler is concerned, and is as preemptible as possible. You can then use individual thread priorities to make some interrupts more important than others.
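
[Illustration: a toy Python model of the split described above, not kernel code. The class and function names are hypothetical, and so are the IRQ names and priority values. Unmasking the line once the threaded handler has finished is an assumption of this sketch; the comment only says that the shim masks the line, acks the controller and returns.]

```python
# Toy model, not kernel code: a minimal hard-IRQ shim plus per-IRQ
# threads that are run in order of thread priority.

class IrqLine:
    def __init__(self, name, thread_prio):
        self.name = name
        self.thread_prio = thread_prio  # priority of the per-IRQ thread
        self.masked = False
        self.pending = False


def hardirq_shim(line, log):
    """Hard interrupt context: mask the line, ack, wake the thread, return."""
    line.masked = True
    log.append(f"hardirq {line.name}: mask+ack")
    line.pending = True  # stands in for waking the per-IRQ thread


def run_irq_threads(lines, log):
    """Stand-in for the scheduler: run pending IRQ threads, highest priority first."""
    for line in sorted(lines, key=lambda l: l.thread_prio, reverse=True):
        if line.pending:
            log.append(f"thread {line.name}: handler (prio {line.thread_prio})")
            line.pending = False
            line.masked = False  # assumption: unmask after the handler is done


log = []
audio = IrqLine("audio", thread_prio=80)
disk = IrqLine("disk", thread_prio=50)

# The disk interrupt arrives first, but the audio thread has the higher priority.
hardirq_shim(disk, log)
hardirq_shim(audio, log)
run_irq_threads([disk, audio], log)

for entry in log:
    print(entry)
```

Both shims finish immediately, and the real handler work then runs in priority order, so the "audio" handler runs before the "disk" handler even though the disk interrupt arrived first.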

There is (inevitably) some scheduling overhead due to having more contexts. I've measured it to be 3-5% in the worst case [80 thousand irqs/sec], and near zero for the common case [a couple of thousand irqs/sec], which is pretty good.

Note that Linux has specific scheduler optimizations that make the introduction and use of system threads cheaper: e.g. the 'lazy-TLB' optimization will skip TLB flushes when switching between system threads, by letting system threads 'inherit' the TLB context of the previous user-process. Thus we might not need to do any TLB flushing if we switch back to the same user-process - and we don't have to do any TLB flushing if we switch between system threads. So in the TLB flushing sense, system threads are completely transparent and do not increase the number of TLB flushes.
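
[Illustration: a toy Python model that counts TLB flushes over a schedule, not the scheduler's actual code. The function name, the schedule and the task names are hypothetical. The non-lazy branch is a made-up baseline for comparison, in which switching to a system thread drops the user address space; the very first load of an address space is counted as a flush in both modes.]

```python
# Toy model, not kernel code: count TLB flushes with and without a
# lazy-TLB style optimization.

def count_tlb_flushes(schedule, lazy_tlb):
    """Count flushes over a schedule of (task_name, mm) entries.

    `mm` names a user address space, or is None for a system thread.
    """
    flushes = 0
    active_mm = None
    for _name, mm in schedule:
        if mm is None:
            if lazy_tlb:
                # System thread borrows the previous user context: no flush.
                continue
            # Hypothetical non-lazy baseline: drop the user context.
            if active_mm is not None:
                flushes += 1
                active_mm = None
            continue
        if mm != active_mm:
            flushes += 1
            active_mm = mm
    return flushes


schedule = [
    ("app", "A"),
    ("irq-thread", None),
    ("softirq-thread", None),
    ("app", "A"),    # back to the same user-process
    ("other", "B"),  # a different user-process
]

lazy = count_tlb_flushes(schedule, lazy_tlb=True)
eager = count_tlb_flushes(schedule, lazy_tlb=False)

print("flushes with lazy TLB:", lazy)
print("flushes in the non-lazy baseline:", eager)
```

With the lazy model the two system threads add no flushes at all: the count is the same as if only "app" and "other" had run.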

A realtime preemption overview

Posted Aug 16, 2005 4:49 UTC (Tue) by balbir (guest, #19399)

Thanks for answering all the questions.

#1. Pre-empting critical sections sounds like an oxymoron: if the sections are critical, why pre-empt them? Just kidding, I like the idea of a true priority-based pre-emptive scheduler. I agree that SPL() will disable pre-emption and is opaque. If you want to disable IRQ pre-emption by other tasks (if there is any) then it works well, but it has all the limitations you mentioned.

#2. I meant, let's create an alias and encourage people to use the new name instead of spinlock_t, like raw_spinlock_t is an alias for the older spinlock_t. Let's still have spinlock_t and a newer name for it.

#4. Your numbers look very good, and the optimizations seem good as well. What scares me now is that correct assignment of task priority will be critical to programming Linux drivers/kernel components in the future. Is this understanding correct? Are there any guidelines that you follow?

I will search for your patch and read the code to understand it better.

A realtime preemption overview

Posted Aug 16, 2005 5:08 UTC (Tue) by mingo (subscriber, #31122)

#1 There's no requirement that critical sections must never be preempted - in fact we do "preempt" most spinlock sections with interrupt contexts in the stock kernel.

A critical section is "critical" only because the data structure affected must be updated transactionally (fully or not at all). How that is achieved, and whether certain types of contexts may or may not execute during such critical sections, is not specified. But I think we mostly agree here.

#2 An alias will only cause confusion, and most of the code won't be changed. A wholesale namespace cleanup might eventually be done, but it is not practical right now.

#4 More configurability also means more ways to misconfigure, but that's a natural consequence, not a bad thing. There are already some userspace tools emerging that simplify things and boost certain types of processing (such as "audio"). I'm not putting too much effort into formalizing this, though, until the infrastructure has been finalized.
