http://wiki.debian.org.tw/index.php/BottomHalves
BottomHalves
From DebianWiki
Interrupt Management
 |__ Top Half --> Interrupt Handlers
 |__ Bottom Halves
      |__ Softirqs
      |__ Tasklets
      |__ Work Queues
      |__ BH (obsolete)
      |__ Task Queues (obsolete)
 |__ Kernel Timers (see Chap 9)
- Why Use Bottom Halves?
  - Interrupt handlers run asynchronously and thus interrupt other, potentially important, code.
  - Interrupt handlers run with some or all interrupts disabled.
  - Interrupt handlers are often very timing-critical because they deal with hardware.
  - Interrupt handlers do not run in process context; therefore, they cannot block.
- How to Divide the Work?
  - If the work is time-sensitive, perform it in the interrupt handler.
  - If the work is related to the hardware itself, perform it in the interrupt handler.
  - If the work needs to ensure that another interrupt (particularly the same interrupt) does not interrupt it, perform it in the interrupt handler.
  - For everything else, consider performing the work in the bottom half.
- Some Advice
  - When attempting to write your own device driver, look at other interrupt handlers and their corresponding bottom halves.
  - Ask yourself what has to be in the top half and what can be in the bottom half.
  - The quicker the interrupt handler executes, the better.
BH (obsolete)
- Features
  - Provided a statically created list of 32 bottom halves.
  - The top half marked whether a bottom half should run by setting a bit in a 32-bit integer.
  - Each BH was globally synchronized: no two could run at the same time, even on different processors.
  - Easy to use and simple, but inflexible and a performance bottleneck.
  - The BH mechanism is similar to tasklets.
  - In 2.4, the BH interface was implemented on top of tasklets.
Task Queues (obsolete)
http://www.science.unitn.it/~fiorella/guidelinux/tlk/img114.gif
- Definition
  - Task queues are the kernel's way of deferring work until later.
  - A task queue is a simple data structure: a singly linked list of tq_struct data structures, each of which contains the address of a routine and a pointer to some data.
  - The routine is called when the element on the task queue is processed, and it is passed a pointer to the data.
  - Anything in the kernel, for example a device driver, can create and use task queues, but three task queues are created and managed by the kernel:
    - timer
    - immediate
    - scheduler <-- evolved into the work queue interface after v2.5
  - One can create a new queue for one's own purposes.
Softirqs
- Features
  - Defined in <kernel/softirq.c> and <linux/interrupt.h>.
  - Maximum of 32; statically allocated at compile time.
  - Raised from within interrupt handlers.
  - Run in interrupt context with all interrupts enabled.
  - Cannot block or sleep.
  - Provide the least serialization: two or more softirqs of the same type may run concurrently on different processors.
  - Only 6 of the 32 are used.
- Table 6.2 Listing of Bottom Half Control Methods

| Softirq          | Priority | Description             |
| HI_SOFTIRQ       | 0        | High-priority tasklets  |
| TIMER_SOFTIRQ    | 1        | Timer bottom half       |
| NET_TX_SOFTIRQ   | 2        | Send network packets    |
| NET_RX_SOFTIRQ   | 3        | Receive network packets |
| SCSI_SOFTIRQ     | 4        | SCSI bottom half        |
| TASKLET_SOFTIRQ  | 5        | Tasklets                |
- Definition
  - softirq_action & softirq_vec
/*
* structure representing a single softirq entry
*/
struct softirq_action
{
void (*action)(struct softirq_action *); /* function to run */
void *data; /* data to pass to function */
};
static struct softirq_action softirq_vec[32];
  - The prototype of a softirq handler:
void softirq_handler(struct softirq_action *)
  - do_softirq() -- runs all pending softirqs:
/* Simplified version here!! */
u32 pending = softirq_pending(cpu);
if (pending) {
struct softirq_action *h = softirq_vec;
softirq_pending(cpu) = 0;
do {
if (pending & 1)
h->action(h);
h++;
pending >>= 1;
} while (pending);
}
- Usage
  - Define an index via an enum in <linux/interrupt.h>.
  - Register your handler:
open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);
  - Raise your softirq:
raise_softirq(NET_TX_SOFTIRQ);
or
/*
 * interrupts must already be off!
 */
raise_softirq_irqoff(NET_TX_SOFTIRQ);
- ksoftirqd
  - When the system is overwhelmed with softirqs, these kernel threads help in the processing of softirqs.
  - Reactivated softirqs are not processed immediately; they are deferred to the threads.
  - The kernel threads run with the lowest possible priority (nice value of 19), which ensures they do not run in lieu of anything important.
  - The threads are each named ksoftirqd/n, where n is the processor number.
  - Awakened whenever do_softirq() detects an executing softirq reactivating itself.
for (;;) {
if (!softirq_pending(cpu))
schedule();
set_current_state(TASK_RUNNING);
while (softirq_pending(cpu)) {
do_softirq();
if (need_resched())
schedule();
}
set_current_state(TASK_INTERRUPTIBLE);
}
Tasklets
- Features
  - Have nothing to do with tasks.
  - Built on top of softirqs.
  - Similar in nature, and work in a similar manner, to softirqs.
  - Two of the same tasklet never run concurrently, although two different tasklets can run at the same time on two different processors.
  - Have a simpler interface and relaxed locking rules.
  - As with softirqs, tasklets cannot sleep.
  - Tasklets also run with all interrupts enabled, so precautions are required if your tasklet shares data with an interrupt handler.
- Definition
  - Tasklets are represented by two softirqs: HI_SOFTIRQ and TASKLET_SOFTIRQ.
  - tasklet_struct
struct tasklet_struct {
    struct tasklet_struct *next;  /* pointer to the next tasklet in the list */
    unsigned long state;          /* state of the tasklet */
    atomic_t count;               /* reference counter */
    void (*func)(unsigned long);  /* tasklet handler function */
    unsigned long data;           /* argument to the tasklet function */
};
  - The func member is the tasklet handler (the equivalent of action to a softirq) and receives data as its sole argument.
  - The state member is exactly one of zero, TASKLET_STATE_SCHED, or TASKLET_STATE_RUN.
  - The count field is used as a reference count for the tasklet. If it is nonzero, the tasklet is disabled and cannot run; if it is zero, the tasklet is enabled and can run if marked pending.
- Declaring Your Tasklet
  - Statically create a tasklet using one of two macros defined in <linux/interrupt.h>:
DECLARE_TASKLET(name, func, data)
DECLARE_TASKLET_DISABLED(name, func, data);  /* sets the count field to 1, so the tasklet starts disabled */

DECLARE_TASKLET(my_tasklet, my_tasklet_handler, dev);
This line is equivalent to
struct tasklet_struct my_tasklet = { NULL, 0, ATOMIC_INIT(0),
                                     my_tasklet_handler, dev };
  - Dynamically create a tasklet:
struct tasklet_struct *t;
tasklet_init(t, tasklet_handler, dev);  /* dynamically, not statically */
- Writing Your Tasklet Handler
void tasklet_handler(unsigned long data);
- Scheduling Tasklets
  - Scheduled tasklets are stored in two per-processor linked lists: tasklet_vec (for regular tasklets) and tasklet_hi_vec (for high-priority tasklets).
  - Scheduling is done via the tasklet_schedule() and tasklet_hi_schedule() functions, which receive a pointer to the tasklet's tasklet_struct as their argument. Each function:
    - Checks whether the tasklet's state is TASKLET_STATE_SCHED. If it is, the tasklet is already scheduled to run and the function can return.
    - Saves the state of the interrupt system, and then disables local interrupts. This ensures nothing on this processor will mess with the tasklet scheduling code.
    - Adds the tasklet to be scheduled to the head of the tasklet_vec or tasklet_hi_vec linked list, which is unique to each processor in the system.
    - Raises the TASKLET_SOFTIRQ or HI_SOFTIRQ softirq, so this tasklet will be executed in the near future by do_softirq().
    - Restores interrupts to their previous state and returns.
- do_softirq() executes the associated tasklet_action() and tasklet_hi_action() handlers, which:
  - Disable interrupts and retrieve the tasklet_vec or tasklet_hi_vec list for this processor.
  - Clear the list for this processor by setting it equal to NULL.
  - Enable interrupts (there is no need to restore them to their previous state because the code here is always called as a softirq handler, and thus, interrupts are always enabled).
  - Loop over each pending tasklet in the retrieved list.
  - On a multiprocessor machine, check whether the tasklet is running on another processor by checking the TASKLET_STATE_RUN flag. If it is currently running, do not execute it now and skip to the next pending tasklet (recall, only one tasklet of a given type may run concurrently).
  - If the tasklet is not currently running, set the TASKLET_STATE_RUN flag, so another processor will not run it.
  - Check for a zero count value, to ensure that the tasklet is not disabled. If the tasklet is disabled, skip it and go to the next pending tasklet.
  - At this point the tasklet is not running elsewhere, is marked as running by us so it will not start running elsewhere, and has a zero count value. Run the tasklet handler.
  - After the tasklet runs, clear the TASKLET_STATE_RUN flag in the tasklet's state field.
  - Repeat for the next pending tasklet, until there are no more scheduled tasklets waiting to run.
tasklet_disable(&my_tasklet);
/* tasklet is now disabled */
/* we can now do stuff
knowing the tasklet cannot run .. */
tasklet_enable(&my_tasklet);
/* tasklet is now enabled */
Work Queues
- Features
  - Defer work into a kernel thread; the work always runs in process context.
  - If the deferred work needs to sleep, work queues are used.
  - The work queue subsystem is an interface for creating kernel worker threads to handle work that is queued from elsewhere.
  - The default worker threads are called events/n, where n is the processor number; there is one per processor.
  - Nothing stops code from creating its own worker threads.
/*
 * The externally visible workqueue abstraction is an array of
 * per-CPU workqueues:
 */
struct workqueue_struct {
    struct cpu_workqueue_struct cpu_wq[NR_CPUS];
};
  - Each type of worker thread has one workqueue_struct associated with it.
  - Inside, there is one cpu_workqueue_struct for every thread, and thus every processor, because there is one worker thread on each processor.
  - struct cpu_workqueue_struct -- one per possible processor on the system:
/*
 * The per-CPU workqueue:
 */
struct cpu_workqueue_struct {
    spinlock_t lock;
    atomic_t nr_queued;
    struct list_head worklist;
    wait_queue_head_t more_work;
    wait_queue_head_t work_done;
    struct workqueue_struct *wq;
    task_t *thread;
    struct completion exit;
};
- Data Structure Representing the Work: struct work_struct
struct work_struct {
    unsigned long pending;    /* is this work pending? */
    struct list_head entry;   /* linked list of all work */
    void (*func)(void *);     /* handler function */
    void *data;               /* argument to handler */
    void *wq_data;            /* used internally */
    struct timer_list timer;  /* timer used by delayed work queues */
};
  - These structures are strung into a linked list, one for each type of queue on each processor.
- Core logic of worker_thread(), simplified:
for (;;) {
    set_task_state(current, TASK_INTERRUPTIBLE);
    add_wait_queue(&cwq->more_work, &wait);
    if (list_empty(&cwq->worklist))
        schedule();
    else
        set_task_state(current, TASK_RUNNING);
    remove_wait_queue(&cwq->more_work, &wait);
    if (!list_empty(&cwq->worklist))
        run_workqueue(cwq);
}
  - All worker threads are implemented as normal kernel threads running the worker_thread() function.
  - After initial setup, this function enters an infinite loop and goes to sleep.
- Core logic of run_workqueue(), which in turn actually performs the deferred work:
while (!list_empty(&cwq->worklist)) {
    struct work_struct *work = list_entry(cwq->worklist.next,
                                          struct work_struct, entry);
    void (*f)(void *) = work->func;
    void *data = work->data;
    list_del_init(cwq->worklist.next);
    clear_bit(0, &work->pending);
    f(data);
}
// Statically created:
DECLARE_WORK(name, void (*func)(void *), void *data);
// Dynamically initialized:
INIT_WORK(struct work_struct *work, void (*func)(void *), void *data);
// The work queue handler prototype:
void work_handler(void *data)
  - The function runs in process context.
  - By default, interrupts are enabled and no locks are held.
  - If needed, the function can sleep.
  - Despite running in process context, the work handlers cannot access user space because there is no associated user-space memory map for kernel threads.
  - Locking between work queues or other parts of the kernel is handled just as with any other process-context code.
- Scheduling Work
schedule_work(&work);
// runs after a delay of at least n_ticks timer ticks
schedule_delayed_work(&work, n_ticks);
- Flushing Work
void flush_scheduled_work(void);
  - Used while unloading modules from the kernel.
  - The function sleeps, so only call it from process context.
  - This function does not cancel any delayed work.
  - To cancel delayed work, call:
int cancel_delayed_work(struct work_struct *work);
- Creating New Work Queues
struct workqueue_struct *create_workqueue(const char *name);
ex:
struct workqueue_struct *keventd_wq = create_workqueue("events");
  - Scheduling Work on your queue
int queue_work(struct workqueue_struct *wq, struct work_struct *work);
int queue_delayed_work(struct workqueue_struct *wq,
                       struct work_struct *work, unsigned long delay);
  - Flushing Work on your queue
flush_workqueue(struct workqueue_struct *wq);
Which bottom half should we use?
- Table 6.3 Bottom Half Comparison

| Bottom Half | Context   | Serialization                       |
| Softirq     | Interrupt | None                                |
| Tasklet     | Interrupt | Against the same tasklet            |
| Work queues | Process   | None (scheduled as process context) |
Locking issues
- To protect shared data from concurrent access while using bottom halves:
  - Tasklets are serialized with respect to themselves, so you do not have to worry about intra-tasklet concurrency issues.
  - Inter-tasklet concurrency (that is, when two different tasklets share the same data) requires proper locking.
  - Softirqs provide no serialization, so all shared data needs an appropriate lock.
  - If process-context code and a bottom half share data, you need to disable bottom-half processing and obtain a lock before accessing the data.
  - If interrupt-context code and a bottom half share data, you need to disable interrupts and obtain a lock before accessing the data.
  - Any shared data in a work queue requires locking, as in normal kernel code.
- Table 6.4 Disabling Bottom Halves

| Method                  | Description                                                    |
| void local_bh_disable() | Disables softirq and tasklet processing on the local processor |
| void local_bh_enable()  | Enables softirq and tasklet processing on the local processor  |