9.1 Locking patterns

Cached items are often a challenge to lock. For example, the file system’s block cache (kernel/bio.c:26) stores copies of up to NBUF disk blocks. It’s vital that a given disk block have at most one copy in the cache; otherwise, different processes might make conflicting changes to different copies of what ought to be the same block. Each cached block is stored in a struct buf (kernel/buf.h:1). A struct buf has a lock field which helps ensure that only one process uses a given disk block at a time. However, that lock is not enough: what if a block is not present in the cache at all, and two processes want to use it at the same time? There is no struct buf (since the block isn’t yet cached), and thus there is nothing to lock. Xv6 deals with this situation by associating an additional lock (bcache.lock) with the set of identities of cached blocks. Code that needs to check if a block is cached (e.g., bget (kernel/bio.c:59)), or change the set of cached blocks, must hold bcache.lock; after that code has found the block and struct buf it needs, it can release bcache.lock and lock just the specific block. This is a common pattern: one lock for the set of items, plus one lock per item.
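A user-level sketch may make the shape of this pattern concrete. The code below is not xv6's: it uses POSIX mutexes rather than xv6's spinlocks and sleep-locks, the names cache_init, cache_get, and struct entry are invented, and keys are assumed to be non-negative integers. The structure, however, mirrors bget: one lock for the set of identities, plus one lock per item.

    // Sketch: one lock for the set of cached items, plus one lock per item.
    #include <pthread.h>

    #define NENTRY 16

    struct entry {
      int key;               // identity of the cached item (-1 = never used)
      int valid;             // has data been filled in for key?
      int refcnt;            // number of outstanding references
      pthread_mutex_t lock;  // serializes use of this item's contents
      char data[512];
    };

    static struct {
      pthread_mutex_t lock;  // protects every entry's key and refcnt (the set)
      struct entry entry[NENTRY];
    } cache;

    void
    cache_init(void)
    {
      pthread_mutex_init(&cache.lock, 0);
      for(int i = 0; i < NENTRY; i++){
        cache.entry[i].key = -1;
        pthread_mutex_init(&cache.entry[i].lock, 0);
      }
    }

    // Return a locked entry for key, or 0 if every entry is in use.
    struct entry*
    cache_get(int key)
    {
      struct entry *e;

      pthread_mutex_lock(&cache.lock);        // lock the set of identities
      for(e = cache.entry; e < cache.entry + NENTRY; e++){
        if(e->key == key){                    // already cached?
          e->refcnt++;
          pthread_mutex_unlock(&cache.lock);  // the set is consistent again
          pthread_mutex_lock(&e->lock);       // now lock just this item
          return e;
        }
      }
      for(e = cache.entry; e < cache.entry + NENTRY; e++){
        if(e->refcnt == 0){                   // not cached: recycle an idle entry
          e->key = key;
          e->valid = 0;
          e->refcnt = 1;
          pthread_mutex_unlock(&cache.lock);
          pthread_mutex_lock(&e->lock);
          return e;
        }
      }
      pthread_mutex_unlock(&cache.lock);
      return 0;
    }

A matching release function would unlock the entry's own lock and then re-acquire cache.lock to decrement refcnt, much as brelse in kernel/bio.c does. Note that, as in bget, the per-item lock is taken only after cache.lock has been released, so a slow operation on one item does not block lookups of unrelated items.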

Ordinarily the same function that acquires a lock will release it. But a more precise way to view things is that a lock is acquired at the start of a sequence that must appear atomic, and released when that sequence ends. If the sequence starts and ends in different functions, or different threads, or on different CPUs, then so must the acquire and release. The function of the lock is to force other users to wait, not to pin a piece of data to a particular agent. One example is the acquire in yield (kernel/proc.c:512), which is released in the scheduler thread rather than in the acquiring process. Another example is the acquiresleep in ilock (kernel/fs.c:293); this code often sleeps while reading the disk, and it may wake up on a different CPU, which means the lock may be acquired and released on different CPUs.
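Condensed and lightly paraphrased from kernel/proc.c (interrupt handling and other details are omitted, and this will not compile outside the xv6 kernel), the hand-off looks like this: each side acquires p->lock before switching, and relies on the other side to release it.

    void
    yield(void)
    {
      struct proc *p = myproc();
      acquire(&p->lock);    // released by the scheduler thread after the switch
      p->state = RUNNABLE;
      sched();              // switch to the scheduler's context
      release(&p->lock);    // releases the acquire the scheduler did before
    }                       // switching back to this process

    void
    scheduler(void)
    {
      struct cpu *c = mycpu();
      for(;;){
        for(struct proc *p = proc; p < &proc[NPROC]; p++){
          acquire(&p->lock);  // if p runs, p releases this acquire after it resumes
          if(p->state == RUNNABLE){
            p->state = RUNNING;
            c->proc = p;
            swtch(&c->context, &p->context);  // p runs until it switches back
            c->proc = 0;
          }
          release(&p->lock);  // releases p's acquire (or this loop's own,
        }                     // if p was not runnable)
      }
    }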

Freeing an object that is protected by a lock embedded in the object is a delicate business, since owning the lock is not enough to guarantee that freeing would be correct. The problem case arises when some other thread is waiting in acquire to use the object; freeing the object implicitly frees the embedded lock, which will cause the waiting thread to malfunction. One solution is to track how many references to the object exist, so that it is only freed when the last reference disappears. See pipeclose (kernel/pipe.c:59) for an example; pi->readopen and pi->writeopen track whether the pipe has file descriptors referring to it.
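A minimal user-level sketch of this idea follows; it is not xv6's code, obj_put is an invented name, and a single counter stands in for pipeclose's pair of readopen/writeopen flags.

    // Sketch: free a lock-protected object only when the last reference is gone.
    #include <pthread.h>
    #include <stdlib.h>

    struct obj {
      pthread_mutex_t lock;  // protects refcnt and the object's other fields
      int refcnt;            // number of outstanding references
      // ... the object's data ...
    };

    // Drop one reference; free the object when no references remain.
    void
    obj_put(struct obj *o)
    {
      pthread_mutex_lock(&o->lock);
      if(--o->refcnt > 0){
        pthread_mutex_unlock(&o->lock);
        return;
      }
      // refcnt is zero: no other thread holds a reference, so no other
      // thread can be waiting in pthread_mutex_lock(&o->lock). It is now
      // safe to destroy the embedded lock and free the object.
      pthread_mutex_unlock(&o->lock);
      pthread_mutex_destroy(&o->lock);
      free(o);
    }

pipeclose follows the same logic: only when both readopen and writeopen have become zero does it release pi->lock and free the pipe, and at that point no file descriptor refers to the pipe, so no other process can be about to use pi->lock.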

Usually one sees locks around sequences of reads and writes to sets of related items; the locks ensure that other threads see only completed sequences of updates (as long as they, too, lock). What about situations where the update is a simple write to a single shared variable? For example, setkilled and killed (kernel/proc.c:619) lock around their simple uses of p->killed. If there were no lock, one thread could write p->killed at the same time that another thread reads it. This is a race, and the C language specification says that a race yields undefined behavior, which means the program may crash or produce incorrect results (see "Threads and data races" in https://en.cppreference.com/w/c/language/memory_model). The locks prevent the race and avoid the undefined behavior.
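Both functions are short; condensed from kernel/proc.c, they look roughly like this, with every access to p->killed performed while holding p->lock:

    void
    setkilled(struct proc *p)
    {
      acquire(&p->lock);
      p->killed = 1;
      release(&p->lock);
    }

    int
    killed(struct proc *p)
    {
      int k;

      acquire(&p->lock);
      k = p->killed;
      release(&p->lock);
      return k;
    }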

One reason races can break programs is that, if there are no locks or equivalent constructs, the compiler may generate machine code that reads and writes memory in ways quite different than the original C code. For example, the machine code of a thread calling killed could copy p->killed to a register and read only that cached value; this would mean that the thread might never see any writes to p->killed. The locks prevent such caching.
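The same hazard can be reproduced at user level. In the hypothetical fragment below (not xv6 code), the unlocked loop is a data race, and the compiler is free to load done once and spin on a register copy forever; in the locked version, the calls to pthread_mutex_lock and pthread_mutex_unlock prevent the compiler from caching the value, so done is re-read on every iteration. The writer is assumed to set done while holding done_lock.

    #include <pthread.h>

    int done;                      // shared flag, set by another thread
    pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;

    // Racy: may be compiled to read done once and loop forever.
    void
    wait_broken(void)
    {
      while(!done)
        ;
    }

    // Locked: done is re-read under the lock on each iteration.
    void
    wait_locked(void)
    {
      for(;;){
        pthread_mutex_lock(&done_lock);
        int d = done;
        pthread_mutex_unlock(&done_lock);
        if(d)
          break;
      }
    }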