The simplest form of synchronization in the kernel is the spin-lock. A spin-lock causes the kernel to spin in a busy loop until some kind of resource (represented by the lock) becomes available. One of the functions that implements this behaviour is spin_lock(), defined in include/asm/spinlock.h, line 130:
static inline void spin_lock(spinlock_t *lock)
The function generates the following (simplified) assembly code:
1:  lock; decb my_lock
    jns 3f
2:  cmpb $0, my_lock
    jle 2b
    jmp 1b
3:
The execution of the current kernel control path depends on the my_lock variable. If my_lock is greater than zero at the time spin_lock() is called, the kernel continues its normal execution. However, if my_lock is less than or equal to zero, the kernel waits in a busy loop until the lock becomes positive again.
The labels 1, 2 and 3 represent different phases of the function. In part 1, my_lock is decremented atomically. The lock prefix before decb ensures that on a multi-processor system only one processor at a time can access the memory location referred to by my_lock. The jns (Jump if Not Sign) instruction immediately after it then transfers execution to part 3 if the new value of my_lock is greater than or equal to zero, allowing the kernel to continue its current path. On the other hand, if the new value of my_lock is negative, the kernel keeps spinning in part 2 until the lock becomes positive. At that point execution is transferred to part 1 once again.
The function to release the lock, spin_unlock(), is straightforward:
xchgb unlocked_value, my_lock
The code above sets the value of my_lock back to the unlocked state, 1. unlocked_value is just a char variable containing the value 1. xchg executes an atomic exchange of values between the unlocked_value and my_lock variables. The lock prefix is not needed because it is built into xchg.
Several variants of spin_lock() and spin_unlock() exist. For instance, a more sophisticated spinlock is the read/write spin lock, which allows several readers to enter a critical region of code simultaneously. However, only one writer at a time is allowed in the critical section. The functions that implement this kind of spinlock are read_lock(), write_lock(), and their unlock partners.
On machines with multiple processors, spin-locks are a cheap and convenient way to protect important data structures from simultaneous access by two or more processors. However, on computers with a single CPU, spin-locks are generally forbidden and other techniques are used instead, such as semaphores and wait queues (described later).
Q1. What is the great disadvantage of using spin-locks?
Q2. Suppose the assembly code for the spin_lock() function is changed as shown below. Will this new version still work? Explain.
1:  lock; decb my_lock
    cmpb $0, my_lock
    jge 3f
2:  cmpb $0, my_lock
    jle 2b
    jmp 1b
3:
Q3. What happens if a lock is initialized to 2 instead of 1?
Semaphores are a smarter form of synchronization than spin-locks. Instead of wasting CPU time in busy loops, semaphore functions can put processes to sleep until a given resource becomes available. While a process is sleeping, the CPU can execute other tasks, therefore making the best use of its time and increasing the system's global level of multiprogramming.
Semaphores are defined in include/asm/semaphore.h:
struct semaphore {
    atomic_t count;
    int sleepers;
    wait_queue_head_t wait;
};
The count field indicates the number of simultaneous processes allowed in a critical region. A value of 1 allows only one process in the critical section at any time. The functions that operate on semaphores are the following:
down(sem) and down_interruptible(sem): if the semaphore sem is locked, these functions put the current process to sleep until the semaphore is unlocked by the up() function. A process put to sleep using down_interruptible() can also be woken up by signals (see Processes and Process Scheduling: Sending signals to processes). If the semaphore is unlocked, the process continues its normal execution.
up(sem) releases (unlocks) a semaphore and wakes up one process in the semaphore's waiting queue.
down_interruptible() is almost always preferred to down(). The latter should be used with care and only in some special cases.
One important feature of semaphore functions is that only one process is woken up when a semaphore is released. This differs from wait queues (described later), which wake up all of the processes sleeping on a given queue.
Device drivers make use of semaphores and wait queues all the time. Suppose a certain process issues a read from a device, such as a DVD drive. The drive's motor takes a few seconds to reach the optimal spin velocity and seek to the right position on disc. Meanwhile, the process that issued the read has to wait. The device driver in charge of the DVD player could implement this waiting as a busy loop, testing the device for a ready status at every iteration. However, this kind of waiting would waste a terrible amount of CPU time. A better solution is to put the process to sleep using semaphores or wait queues, until the DVD drive notifies the processor (via an interrupt) that it is ready to accept commands.
The following example shows the design of a simple device driver that uses semaphores. The driver creates the device /dev/mutex. The first process reading from this device will acquire a lock that will cause other read attempts from any process to sleep. The lock is released when the process that owns it closes the device.
struct semaphore sem;

/* Module initialization */
static int __init dev_init(void)
{
    ...
    dev_handle = devfs_register(
        ...
        "mutex",                                /* create /dev/mutex */
        ...
        S_IFCHR | S_IRWXO | S_IRWXG | S_IRWXU,  /* permissions, everybody RWX */
        ...
    );
    ...
    /* initialize semaphore */
    sema_init(&sem, 1);         /* one process per critical section */
    return(0);
} /* dev_init() */

/* Read from device */
static ssize_t dev_read(struct file *filp, char *buf, size_t count, loff_t *offp)
{
    if (down_interruptible(&sem))
        return(-EINTR);         /* sleeping has been interrupted by a signal */
    return(count);
} /* dev_read() */

/* Close the device */
static int dev_release(struct inode *inode, struct file *filp)
{
    up(&sem);
    return(0);
}
Notice that the semaphore sem needs to be initialized before it is used. Initialization of a semaphore structure is done by the sema_init(sem, count) function. Also notice the if statement that wraps the down_interruptible() call. Recall that a process put to sleep with this function can be woken up by a signal. If this is the case, down_interruptible() returns -EINTR. We need to test for this condition every time we call down_interruptible() and return -EINTR if appropriate. When a process is woken up by the up() function, down_interruptible() returns 0.
Q4. According to the code fragment shown above, what happens if the first process to open the /dev/mutex device executes the following instructions:
fd = open("/dev/mutex", O_RDONLY);
read(fd, &buf, count);
/* do something... */
read(fd, &buf, count);
/* do something... */
close(fd);
Q5. Implement the /dev/mutex driver using the code fragments introduced above. Show that the semaphore actually works, by running several processes that attempt to take the semaphore, delay a few seconds, and release it. Use ps to verify that the waiting processes are blocked rather than spinning.
A wait queue is a linked list of sleeping processes waiting for a certain condition to become true. When the condition is satisfied, the processes on the queue are woken up all at once. This differs from the behaviour of semaphores, in which only one process is woken up when a lock is released. Semaphores are implemented using a special type of wait queue called exclusive wait queue.
A wait queue is identified by a wait_queue_head_t element, defined in /include/linux/wait.h, line 77:
struct __wait_queue_head {
    wq_lock_t lock;
    struct list_head task_list;
};
typedef struct __wait_queue_head wait_queue_head_t;
The task_list field points to the list of tasks waiting on the queue. lock is used to carry out atomic operations on the task list.
Each element in the wait queue has type wait_queue_t:
struct __wait_queue {
    unsigned int flags;
    struct task_struct *task;
    struct list_head task_list;
};
typedef struct __wait_queue wait_queue_t;
The functions that operate on wait queues are the following: sleep_on(queue) and interruptible_sleep_on(queue) put the current process to sleep on a wait queue. As with semaphores, a process put to sleep with the interruptible function can be woken up by signals. The implementation of sleep_on() is shown below:
void sleep_on(wait_queue_head_t *queue)
{
    unsigned long flags;
    wait_queue_t wait;      /* entry in the sleeping queue */

    init_waitqueue_entry(&wait, current);

    wq_write_lock_irqsave(&queue->lock, flags);
    __add_wait_queue(queue, &wait);
    wq_write_unlock(&queue->lock);

    current->state = TASK_UNINTERRUPTIBLE;
    schedule();

    wq_write_lock_irq(&queue->lock);
    __remove_wait_queue(queue, &wait);
    wq_write_unlock_irqrestore(&queue->lock, flags);
}
interruptible_sleep_on() is identical to sleep_on() except that it uses TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE.
The function works by inserting an entry (wait) in the wait queue. The entry is initialized with the init_waitqueue_entry() function and then added to the queue using __add_wait_queue(). Next, the state of the current task is changed to TASK_UNINTERRUPTIBLE (for sleep_on()) or TASK_INTERRUPTIBLE (for interruptible_sleep_on()). Finally, the scheduler function is called, which runs another process in the runqueue.
To wake up the processes in a specific wait queue, the kernel provides the following functions: wake_up(), to wake up all of the processes in a wait queue, and wake_up_interruptible(), to wake up only the interruptible processes.
A simplified version of wake_up() is shown below:
void wake_up(wait_queue_head_t *queue)
{
    struct list_head *tmp;
    struct task_struct *tsk;

    list_for_each(tmp, &queue->task_list) {
        wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);  /* current entry */

        tsk = curr->task;
        if (tsk->state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)) {
            tsk->state = TASK_RUNNING;      /* wake up task */
            if (task_on_runqueue(tsk))
                continue;                   /* task already on the runqueue */
            add_to_runqueue(tsk);
            reschedule_idle(tsk);
        }
    }
}
The function wakes up every process in the wait queue. Each process being woken up is added to the runqueue and its state is set to TASK_RUNNING. reschedule_idle() then tries to run the process immediately on a CPU that is currently idle. If no CPU is idle, the function attempts to preempt a currently running process that has lower priority.
Q6. What small change would you make to wake_up() in order to implement wake_up_interruptible()?
In the code below we show a fragment of a device driver that uses wait queues. The driver has an internal buffer with a maximum capacity of BUFMAX bytes. The current number of bytes stored in the buffer is given by buffer_size. A process reads from the driver's buffer by issuing a read() file operation. However, if the buffer is empty (buffer_size==0), the process goes to sleep until another process writes to the buffer.
static wait_queue_head_t wq;        /* the readers wait queue */

static int __init dev_init(void)
{
    ...
    init_waitqueue_head(&wq);       /* initialize wait queue */
    ...
}

static ssize_t dev_read(struct file *filp, char *buf, size_t count, loff_t *offp)
{
    if (buffer_size == 0) {
        interruptible_sleep_on(&wq);    /* go to sleep, wait for writers */
        if (signal_pending(current))    /* woken up by a signal? */
            return(-EINTR);
    }

    /* send message to reader */
    count = (count > buffer_size) ? buffer_size : count;
    copy_to_user(buf, dev_buffer, count);
    return(count);
} /* dev_read() */

static ssize_t dev_write(struct file *filp, const char *buf, size_t count, loff_t *offp)
{
    /* store message in device buffer */
    count = (count > BUFMAX) ? BUFMAX : count;
    copy_from_user(dev_buffer, buf, count);
    buffer_size = count;

    wake_up_interruptible(&wq);     /* wake up readers */
    return(count);
} /* dev_write() */
Two things should be noticed in the code above. First, the wait queue wq must be initialized with init_waitqueue_head() before it is used. Second, unlike down_interruptible(), the function interruptible_sleep_on() does not return a particular error code when the process is woken up by a signal. Therefore, we need to check manually, with signal_pending(), whether the process was woken up by a signal or by one of the wake-up functions.
Q7. Implement the driver described in this section using the code fragment shown above.
Q8. One problem with the code above is that a writer could change the buffer contents and size while one or more processes are reading from the buffer. Modify your code from the previous question so that it prevents writing to the buffer while a process is reading from it. Use the semaphore functions down_interruptible() and up() as explained in the Semaphores section.
Q9. Implement a device driver that operates as follows. The driver creates a device /dev/mbox. Only one process can read from this device, but many processes can write to it. The device allows messages to be transferred from the writers to the reader. The reader should sleep when no messages are available. The writers should sleep only if a message from a previous writer is still pending retrieval by the reader.