6.3 Code: Using locks
Xv6 uses locks in many places to avoid races.
As described above, kalloc (kernel/kalloc.c:69) and kfree (kernel/kalloc.c:47) form a good example.
Try Exercises 1 and 2 to see what happens if those
functions omit the locks.
You’ll likely find that it’s difficult to trigger incorrect
behavior, suggesting that it’s hard to reliably test whether code
is free from locking errors and races.
Xv6 may well have as-yet-undiscovered races.
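To make the pattern concrete, here is a hedged sketch of a free routine in the style of kfree (kernel/kalloc.c); it follows xv6's naming conventions and its acquire/release primitives, but it is a simplified illustration rather than the exact source.

// Sketch in the style of kernel/kalloc.c; assumes xv6's struct spinlock,
// acquire(), and release() (kernel/spinlock.h, kernel/spinlock.c).
struct run {
  struct run *next;          // each free page holds a pointer to the next
};

struct {
  struct spinlock lock;      // protects freelist
  struct run *freelist;      // head of the list of free physical pages
} kmem;

void
kfree_sketch(void *pa)
{
  struct run *r = (struct run*)pa;

  acquire(&kmem.lock);
  r->next = kmem.freelist;   // without the lock, two CPUs could both read
  kmem.freelist = r;         // the old head and one freed page would be lost
  release(&kmem.lock);
}

The lock makes the read-modify-write of kmem.freelist atomic with respect to other CPUs calling kfree or kalloc at the same time; omitting it is exactly the experiment the exercises above suggest.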
A hard part about using locks is deciding how many locks to use and which data and invariants each lock should protect. There are a few basic principles. First, any time a variable can be written by one CPU at the same time that another CPU can read or write it, a lock should be used to keep the two operations from overlapping. Second, remember that locks protect invariants: if an invariant involves multiple memory locations, typically all of them need to be protected by a single lock to ensure the invariant is maintained.
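As an illustration of the second principle, consider a hypothetical counted list (not from xv6) whose invariant is that count always equals the number of nodes reachable from head. Because the invariant spans two memory locations, both updates must happen while holding the same lock.

// Hypothetical example: the invariant is that list.count equals the
// number of nodes reachable from list.head. Both fields are updated
// under one lock so no other CPU can observe the invariant broken.
struct node {
  struct node *next;
};

struct {
  struct spinlock lock;
  struct node *head;
  int count;
} list;

void
list_push(struct node *n)
{
  acquire(&list.lock);
  n->next = list.head;
  list.head = n;
  list.count += 1;           // invariant restored before releasing
  release(&list.lock);
}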
The rules above say when locks are necessary but say nothing about when locks
are unnecessary, and it is important for efficiency not to lock too much,
because locks reduce parallelism. If parallelism isn’t important, then one
could arrange to have only a single thread and not worry about locks. A simple
kernel can do this on a multiprocessor by having a single lock that must be
acquired on entering the kernel and released on exiting the kernel (though
blocking system calls such as pipe reads or wait would pose a problem). Many uniprocessor operating systems have been converted to
run on multiprocessors using this approach, sometimes called a “big
kernel lock,” but the approach sacrifices parallelism: only one
CPU can execute in the kernel at a time. If the kernel does any heavy
computation, it would be more efficient to use a larger set of more
fine-grained locks, so that the kernel could execute on multiple CPUs
simultaneously.
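A hypothetical sketch of the big-kernel-lock approach (xv6 does not do this): every entry into the kernel takes one global lock, so at most one CPU executes kernel code at a time.

// Hypothetical "big kernel lock"; assumes xv6's spinlock primitives.
struct spinlock big_kernel_lock;

void
trap_entry(void)
{
  acquire(&big_kernel_lock);   // at most one CPU in the kernel at a time
  // ... handle the system call or interrupt ...
  // A blocking call such as a pipe read or wait would have to release
  // the lock while sleeping, which is the complication noted above.
  release(&big_kernel_lock);   // other CPUs may now enter the kernel
}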
As an example of coarse-grained locking, xv6’s kalloc.c
allocator has a single free list protected by a single lock. If
multiple processes on different CPUs try to allocate pages at the same
time, each will have to wait for its turn by spinning in acquire, which wastes CPU time because spinning is not useful work.
If contention for the lock wasted a significant fraction of CPU time,
perhaps performance could be improved by changing the allocator design
to have multiple free lists, each with its own lock, to allow truly
parallel allocation.
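Here is a hedged sketch of that finer-grained design, assuming a hypothetical per-CPU array of free lists; xv6's actual allocator keeps the single list described above. struct run is the free-page node from kernel/kalloc.c, and push_off, pop_off, and cpuid are xv6's existing primitives.

// Hypothetical finer-grained allocator (not xv6's actual design):
// one free list per CPU, each with its own lock, so allocations on
// different CPUs usually do not contend.
struct {
  struct spinlock lock;
  struct run *freelist;
} kmems[NCPU];

void *
kalloc_percpu(void)
{
  push_off();                  // disable interrupts so cpuid() is stable
  int id = cpuid();
  pop_off();

  acquire(&kmems[id].lock);
  struct run *r = kmems[id].freelist;
  if(r)
    kmems[id].freelist = r->next;
  release(&kmems[id].lock);

  // If this CPU's list is empty, a complete design would borrow pages
  // from another CPU's list (omitted here).
  return (void*)r;
}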
As an example of fine-grained locking, xv6 has a separate lock for each file, so that processes that manipulate different files can often proceed without waiting for each other’s locks. The file locking scheme could be made even more fine-grained if one wanted to allow processes to simultaneously write different areas of the same file. Ultimately lock granularity decisions need to be driven by performance measurements as well as complexity considerations.
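The general idiom behind that kind of fine-grained locking is a lock embedded in each object, so that operations on different objects acquire different locks. The sketch below shows the idiom in its simplest form; it is illustrative rather than xv6's exact file-locking code.

// Illustrative per-object locking idiom (not xv6's exact file code):
// each object carries its own lock, so operations on different objects
// do not serialize against each other.
struct obj {
  struct spinlock lock;   // protects only this object's fields
  int data;
};

void
obj_update(struct obj *o, int v)
{
  acquire(&o->lock);      // contends only with users of this same object
  o->data = v;
  release(&o->lock);
}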
As subsequent chapters explain each part of xv6, they will mention examples of xv6’s use of locks to deal with concurrency. As a preview, Figure 6.3 lists all of the locks in xv6.
Figure 6.3: Locks in xv6

Lock              | Description
bcache.lock       | Protects allocation of block buffer cache entries
cons.lock         | Serializes access to console hardware, avoids intermixed output
ftable.lock       | Serializes allocation of a struct file in file table
itable.lock       | Protects allocation of in-memory inode entries
vdisk_lock        | Serializes access to disk hardware and queue of DMA descriptors
kmem.lock         | Serializes allocation of memory
log.lock          | Serializes operations on the transaction log
pipe’s pi->lock   | Serializes operations on each pipe
pid_lock          | Serializes increments of next_pid
proc’s p->lock    | Serializes changes to process’s state
wait_lock         | Helps wait avoid lost wakeups
tickslock         | Serializes operations on the ticks counter
inode’s ip->lock  | Serializes operations on each inode and its content
buf’s b->lock     | Serializes operations on each block buffer