6.4 Deadlock and lock ordering
If a code path through the kernel must hold several locks at the same time, it is important that all code paths acquire those locks in the same order. If they don’t, there is a risk of deadlock. Let’s say two code paths in xv6 need locks A and B, but code path 1 acquires locks in the order A then B, and the other path acquires them in the order B then A. Suppose thread T1 executes code path 1 and acquires lock A, and thread T2 executes code path 2 and acquires lock B. Next T1 will try to acquire lock B, and T2 will try to acquire lock A. Both acquires will block indefinitely, because in both cases the other thread holds the needed lock, and won’t release it until its acquire returns. To avoid such deadlocks, all code paths must acquire locks in the same order. The need for a global lock acquisition order means that locks are effectively part of each function’s specification: callers must invoke functions in a way that causes locks to be acquired in the agreed-on order.
Xv6 has many lock-order chains of length two involving
per-process locks
(the lock in each
struct proc
)
due to the way that
sleep
works (see Chapter 7).
For example,
consoleintr
(kernel/console.c:136)
is the interrupt routine which handles typed characters.
When a newline arrives, any process that is waiting for
console input should be woken up.
To do this,
consoleintr
holds
cons.lock
while calling
wakeup
,
which acquires
the waiting process’s lock in order to wake it up.
In consequence, the global deadlock-avoiding
lock order includes the rule that
cons.lock
must be acquired before any process lock.
The file-system code contains xv6’s longest lock chains.
For example, creating a file requires simultaneously
holding a lock on the directory, a lock on the new file’s inode,
a lock on a disk block buffer,
the disk driver’s vdisk_lock
,
and
the calling process’s p->lock
.
To avoid deadlock, file-system code always acquires locks in the order
mentioned in the previous sentence.
Honoring a global deadlock-avoiding order can be surprisingly difficult. Sometimes the lock order conflicts with logical program structure, e.g., perhaps code module M1 calls module M2, but the lock order requires that a lock in M2 be acquired before a lock in M1. Sometimes the identities of locks aren’t known in advance, perhaps because one lock must be held in order to discover the identity of the lock to be acquired next. This kind of situation arises in the file system as it looks up successive components in a path name, and in the code for wait and exit as they search the table of processes looking for child processes. Finally, the danger of deadlock is often a constraint on how fine-grained one can make a locking scheme, since more locks often means more opportunity for deadlock. The need to avoid deadlock is often a major factor in kernel implementation.