http://wiki.debian.org.tw/index.php/BottomHalves
BottomHalves
From DebianWiki
Interrupt Management
 |__ Top Half --> Interrupt Handlers
 |__ Bottom Halves
      |__ Softirqs
      |__ Tasklets
      |__ Work Queues
      |__ BH (obsolete)
      |__ Task Queues (obsolete)
 |__ Kernel Timers (see Chap 9)
- Why Use Bottom Halves?
  - Interrupt handlers run asynchronously and thus interrupt other, potentially important, code.
  - Interrupt handlers run with some or all interrupts disabled.
  - Interrupt handlers are often very timing-critical because they deal with hardware.
  - Interrupt handlers do not run in process context; therefore, they cannot block.
- How to Divide the Work?
  - If the work is time-sensitive, perform it in the interrupt handler.
  - If the work is related to the hardware itself, perform it in the interrupt handler.
  - If the work needs to ensure that another interrupt (particularly the same interrupt) does not interrupt it, perform it in the interrupt handler.
  - For everything else, consider performing the work in the bottom half.
- Some Advice
  - When attempting to write your own device driver, look at other interrupt handlers and their corresponding bottom halves.
  - Ask yourself what has to be in the top half and what can be in the bottom half.
  - The quicker the interrupt handler executes, the better.
BH (obsolete)
- Features
  - Provided a statically created list of 32 bottom halves.
  - The top half marked whether a bottom half should run by setting a bit in a 32-bit integer.
  - Each BH was globally synchronized: no two could run at the same time, even on different processors.
  - Easy to use and simple, but inflexible and a performance bottleneck.
  - The BH mechanism is similar to tasklets.
  - In 2.4, the BH interface was implemented on top of tasklets.
Task Queues (obsolete)
http://www.science.unitn.it/~fiorella/guidelinux/tlk/img114.gif
- Definition
  - Task queues are the kernel's way of deferring work until later.
  - A task queue is a simple data structure: a singly linked list of tq_struct data structures, each of which contains the address of a routine and a pointer to some data.
  - The routine is called when the element on the task queue is processed, and it is passed a pointer to the data.
  - Anything in the kernel, for example a device driver, can create and use task queues, but three task queues are created and managed by the kernel:
    - timer
    - immediate
    - scheduler <-- evolved into the work queue interface after v2.5
  - One can create a new queue for one's own purposes.
Softirqs
- Features
  - Defined in <kernel/softirq.c> and <linux/interrupt.h>.
  - Maximum of 32; statically allocated at compile time.
  - Raised from within interrupt handlers.
  - Run in interrupt context with all interrupts enabled.
  - Cannot block or sleep.
  - Provide the least serialization: two or more softirqs of the same type may run concurrently on different processors.
  - Only 6 of the 32 are used.
- Table 6.2 Listing of Bottom Half Control Methods

| Softirq          | Priority | Description             |
| HI_SOFTIRQ       | 0        | High-priority tasklets  |
| TIMER_SOFTIRQ    | 1        | Timer bottom half       |
| NET_TX_SOFTIRQ   | 2        | Send network packets    |
| NET_RX_SOFTIRQ   | 3        | Receive network packets |
| SCSI_SOFTIRQ     | 4        | SCSI bottom half        |
| TASKLET_SOFTIRQ  | 5        | Tasklets                |
- Definition
  - softirq_action & softirq_vec
/*
* structure representing a single softirq entry
*/
struct softirq_action
{
void (*action)(struct softirq_action *); /* function to run */
void *data; /* data to pass to function */
};
static struct softirq_action softirq_vec[32];
  - The prototype of a softirq handler:
void softirq_handler(struct softirq_action *)
  - do_softirq() -- runs all pending softirqs:
/* Simplified version here!! */
u32 pending = softirq_pending(cpu);
if (pending) {
struct softirq_action *h = softirq_vec;
softirq_pending(cpu) = 0;
do {
if (pending & 1)
h->action(h);
h++;
pending >>= 1;
} while (pending);
}
- Usage
  - Define an index via an enum in <linux/interrupt.h>.
  - Register your handler:
open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);
  - Raise your softirq:
raise_softirq(NET_TX_SOFTIRQ);
or
/*
 * interrupts must already be off!
 */
raise_softirq_irqoff(NET_TX_SOFTIRQ);
- ksoftirqd
  - When the system is overwhelmed with softirqs, these kernel threads help in the processing of softirqs.
  - Reactivated softirqs are not processed immediately; they are deferred to the threads.
  - The kernel threads run with the lowest possible priority (nice value of 19), which ensures they do not run in lieu of anything important.
  - The threads are each named ksoftirqd/n, where n is the processor number.
  - Awakened whenever do_softirq() detects an executing softirq reactivating itself.
for (;;) {
if (!softirq_pending(cpu))
schedule();
set_current_state(TASK_RUNNING);
while (softirq_pending(cpu)) {
do_softirq();
if (need_resched())
schedule();
}
set_current_state(TASK_INTERRUPTIBLE);
}
Tasklets
- Features
  - Have nothing to do with tasks.
  - Built on top of softirqs.
  - Similar in nature, and work in a similar manner, to softirqs.
  - Two of the same tasklet never run concurrently, although two different tasklets can run at the same time on two different processors.
  - Have a simpler interface and relaxed locking rules.
  - As with softirqs, tasklets cannot sleep.
  - Tasklets also run with all interrupts enabled, so precautions are required if your tasklet shares data with an interrupt handler.
- Definition
  - Tasklets are represented by two softirqs: HI_SOFTIRQ and TASKLET_SOFTIRQ.
  - tasklet_struct
struct tasklet_struct {
    struct tasklet_struct *next;  /* pointer to the next tasklet in the list */
    unsigned long state;          /* state of the tasklet */
    atomic_t count;               /* reference counter */
    void (*func)(unsigned long);  /* tasklet handler function */
    unsigned long data;           /* argument to the tasklet function */
};
  - The func member is the tasklet handler (the equivalent of action to a softirq) and receives data as its sole argument.
  - The state member is exactly one of zero, TASKLET_STATE_SCHED, or TASKLET_STATE_RUN.
  - The count field is used as a reference count for the tasklet. If it is nonzero, the tasklet is disabled and cannot run; if it is zero, the tasklet is enabled and can run if marked pending.
- Declaring Your Tasklet
  - Statically create a tasklet using one of two macros defined in <linux/interrupt.h>:
DECLARE_TASKLET(name, func, data)
DECLARE_TASKLET_DISABLED(name, func, data);  /* sets the count field to 1, so the tasklet starts disabled */

DECLARE_TASKLET(my_tasklet, my_tasklet_handler, dev);
This line is equivalent to
struct tasklet_struct my_tasklet = { NULL, 0, ATOMIC_INIT(0),
                                     my_tasklet_handler, dev };
  - Dynamically create a tasklet:
struct tasklet_struct *t;
tasklet_init(t, tasklet_handler, dev);  /* dynamically, not statically */
- Writing Your Tasklet Handler
void tasklet_handler(unsigned long data);
- Scheduling Tasklets
  - Scheduled tasklets are stored in two per-processor linked lists: tasklet_vec (for regular tasklets) and tasklet_hi_vec (for high-priority tasklets).
  - Scheduling is done via the tasklet_schedule() and tasklet_hi_schedule() functions, which receive a pointer to the tasklet's tasklet_struct as their argument. Each function:
    - Checks whether the tasklet's state is TASKLET_STATE_SCHED. If it is, the tasklet is already scheduled to run and the function can return.
    - Saves the state of the interrupt system, and then disables local interrupts. This ensures nothing on this processor will mess with the tasklet scheduling code.
    - Adds the tasklet to be scheduled to the head of the tasklet_vec or tasklet_hi_vec linked list, which is unique to each processor in the system.
    - Raises the TASKLET_SOFTIRQ or HI_SOFTIRQ softirq, so this tasklet will be executed in the near future by do_softirq().
    - Restores interrupts to their previous state and returns.
- do_softirq() executes the associated tasklet_action() and tasklet_hi_action() handlers, which:
  - Disable interrupts and retrieve the tasklet_vec or tasklet_hi_vec list for this processor.
  - Clear the list for this processor by setting it equal to NULL.
  - Enable interrupts (there is no need to restore them to their previous state because the code here is always called as a softirq handler, and thus, interrupts are always enabled).
  - Loop over each pending tasklet in the retrieved list.
  - On a multiprocessor machine, check whether the tasklet is running on another processor by checking the TASKLET_STATE_RUN flag. If it is currently running, do not execute it now and skip to the next pending tasklet (recall, only one tasklet of a given type may run concurrently).
  - If the tasklet is not currently running, set the TASKLET_STATE_RUN flag, so another processor will not run it.
  - Check for a zero count value, to ensure that the tasklet is not disabled. If the tasklet is disabled, skip it and go to the next pending tasklet.
  - At this point the tasklet is not running elsewhere, is marked as running by us so it will not start running elsewhere, and has a zero count value. Run the tasklet handler.
  - After the tasklet runs, clear the TASKLET_STATE_RUN flag in the tasklet's state field.
  - Repeat for the next pending tasklet, until there are no more scheduled tasklets waiting to run.
tasklet_disable(&my_tasklet);
/* tasklet is now disabled */
/* we can now do stuff
knowing the tasklet cannot run .. */
tasklet_enable(&my_tasklet);
/* tasklet is now enabled */
Work Queues
- Features
  - Defer work into a kernel thread; the work always runs in process context.
  - If the deferred work needs to sleep, work queues are used.
  - The work queue subsystem is an interface for creating kernel worker threads to handle work that is queued from elsewhere.
  - The default worker threads are called events/n, where n is the processor number; there is one per processor.
  - Nothing stops code from creating its own worker threads.
/*
 * The externally visible workqueue abstraction is an array of
 * per-CPU workqueues:
 */
struct workqueue_struct {
    struct cpu_workqueue_struct cpu_wq[NR_CPUS];
};
  - Each type of worker thread has one workqueue_struct associated with it.
  - Inside, there is one cpu_workqueue_struct for every thread, and thus every processor, because there is one worker thread on each processor.
  - struct cpu_workqueue_struct -- one per possible processor on the system:
/*
 * The per-CPU workqueue:
 */
struct cpu_workqueue_struct {
    spinlock_t lock;
    atomic_t nr_queued;
    struct list_head worklist;
    wait_queue_head_t more_work;
    wait_queue_head_t work_done;
    struct workqueue_struct *wq;
    task_t *thread;
    struct completion exit;
};
- Data Structure Representing the Work: struct work_struct
struct work_struct {
    unsigned long pending;    /* is this work pending? */
    struct list_head entry;   /* linked list of all work */
    void (*func)(void *);     /* handler function */
    void *data;               /* argument to handler */
    void *wq_data;            /* used internally */
    struct timer_list timer;  /* timer used by delayed work queues */
};
  - These structures are strung into a linked list, one for each type of queue on each processor.
- Core logic of worker_thread(), simplified:
for (;;) {
    set_task_state(current, TASK_INTERRUPTIBLE);
    add_wait_queue(&cwq->more_work, &wait);
    if (list_empty(&cwq->worklist))
        schedule();
    else
        set_task_state(current, TASK_RUNNING);
    remove_wait_queue(&cwq->more_work, &wait);
    if (!list_empty(&cwq->worklist))
        run_workqueue(cwq);
}
  - All worker threads are implemented as normal kernel threads running the worker_thread() function.
  - After initial setup, this function enters an infinite loop and goes to sleep.
- Core logic of run_workqueue(), which in turn actually performs the deferred work:
while (!list_empty(&cwq->worklist)) {
    struct work_struct *work = list_entry(cwq->worklist.next,
                                          struct work_struct, entry);
    void (*f)(void *) = work->func;
    void *data = work->data;
    list_del_init(cwq->worklist.next);
    clear_bit(0, &work->pending);
    f(data);
}
// Statically created:
DECLARE_WORK(name, void (*func)(void *), void *data);
// Dynamically initialized:
INIT_WORK(struct work_struct *work, void (*func)(void *), void *data);
// The work queue handler prototype:
void work_handler(void *data)
  - The function runs in process context.
  - By default, interrupts are enabled and no locks are held.
  - If needed, the function can sleep.
  - Despite running in process context, the work handlers cannot access user space because there is no associated user-space memory map for kernel threads.
  - Locking between work queues or other parts of the kernel is handled just as with any other process-context code.
- Scheduling Work
schedule_work(&work);
// runs after a delay of at least n_ticks timer ticks
schedule_delayed_work(&work, n_ticks);
- Flushing Work
void flush_scheduled_work(void);
  - Used while unloading modules from the kernel.
  - The function sleeps, so only call it from process context.
  - This function does not cancel any delayed work.
  - To cancel delayed work, call:
int cancel_delayed_work(struct work_struct *work);
- Creating New Work Queues
struct workqueue_struct *create_workqueue(const char *name);
ex:
struct workqueue_struct *keventd_wq = create_workqueue("events");
  - Scheduling Work on your queue
int queue_work(struct workqueue_struct *wq, struct work_struct *work);
int queue_delayed_work(struct workqueue_struct *wq,
                       struct work_struct *work, unsigned long delay);
  - Flushing Work on your queue
flush_workqueue(struct workqueue_struct *wq);
Which bottom half should we use?
- Table 6.3 Bottom Half Comparison

| Bottom Half | Context   | Serialization                       |
| Softirq     | Interrupt | None                                |
| Tasklet     | Interrupt | Against the same tasklet            |
| Work queues | Process   | None (scheduled as process context) |
Locking issues
- To protect shared data from concurrent access while using bottom halves:
  - Tasklets are serialized with respect to themselves, so you do not have to worry about intra-tasklet concurrency issues.
  - Inter-tasklet concurrency (that is, when two different tasklets share the same data) requires proper locking.
  - Softirqs provide no serialization, so all shared data needs an appropriate lock.
  - If process-context code and a bottom half share data, you need to disable bottom-half processing and obtain a lock before accessing the data.
  - If interrupt-context code and a bottom half share data, you need to disable interrupts and obtain a lock before accessing the data.
  - Any shared data in a work queue requires locking, as in normal kernel code.
- Table 6.4 Disabling Bottom Halves

| Method                  | Description                                                    |
| void local_bh_disable() | Disables softirq and tasklet processing on the local processor |
| void local_bh_enable()  | Enables softirq and tasklet processing on the local processor  |