8.1 Overview
The xv6 file system implementation is
organized in seven layers, shown in
Figure 8.1.
The disk layer reads and writes blocks on a virtio hard drive.
The buffer cache layer caches disk blocks and synchronizes access to them,
making sure that only one kernel process at a time can modify the
data stored in any particular block. The logging layer allows higher
layers to wrap updates to several blocks in a transaction,
and ensures that the blocks are updated atomically in the
face of crashes (i.e., all of them are updated or none).
The inode layer provides individual files, each represented as an
inode with a unique i-number
and some blocks holding the file’s data. The directory
layer implements each directory as a special kind of
inode whose content is a sequence of directory entries, each of which contains a
file’s name and i-number.
The pathname layer provides
hierarchical path names like /usr/rtm/xv6/fs.c,
and resolves them with recursive lookup.
The file descriptor layer abstracts many Unix resources (e.g., pipes, devices,
files, etc.) using the file system interface, simplifying the lives of
application programmers.
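
To make the directory layer's job concrete, here is a minimal sketch of looking up a name in a sequence of directory entries. It is not xv6's code: the names sketch_dirent, sketch_dirlookup, and SKETCH_DIRSIZ are hypothetical, and the convention that i-number 0 marks an unused entry is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DIRSIZ 14  /* assumed maximum name length, for illustration */

/* Hypothetical directory entry: a file's name paired with its i-number. */
struct sketch_dirent {
  uint16_t inum;             /* 0 marks an unused entry (an assumption) */
  char name[SKETCH_DIRSIZ];  /* NUL-padded; not NUL-terminated when full */
};

/* Scan n entries for name; return its i-number, or 0 if it is absent. */
static uint16_t
sketch_dirlookup(const struct sketch_dirent *ents, int n, const char *name)
{
  assert(ents != 0 && name != 0);
  for (int i = 0; i < n; i++) {
    if (ents[i].inum == 0)
      continue;  /* skip unused slots */
    if (strncmp(ents[i].name, name, SKETCH_DIRSIZ) == 0)
      return ents[i].inum;
  }
  return 0;
}
```

A pathname layer built on such a function could resolve /usr/rtm/xv6/fs.c one component at a time, using the i-number returned for each component to find the directory in which to look up the next.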
Disk hardware traditionally presents the data on the
disk as a numbered sequence of 512-byte blocks (also called sectors):
sector 0 is the first 512 bytes, sector 1 is the next, and so on. The block size
that an operating system uses for its file system may be different from the
sector size that a disk uses, but typically the block size is a multiple of the
sector size. Xv6 holds copies of blocks that it has read into memory
in objects of type struct buf (kernel/buf.h:1).
The data stored in this structure is sometimes out of sync with the disk: it might
not yet have been read in from disk (the disk is working on it but hasn’t returned
the sector’s content yet), or it might have been updated by software
but not yet written to the disk.
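
The relationship between blocks and sectors, and the two ways a cached copy can disagree with the disk, can be sketched as follows. This is a simplified stand-in, not the definition in kernel/buf.h: the 1024-byte block size, the dirty flag, and every sketch_ name are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u  /* traditional sector size, from the text */
#define SKETCH_BSIZE 1024u       /* assumed file system block size */

/* The block size is expected to be a multiple of the sector size. */
_Static_assert(SKETCH_BSIZE % SKETCH_SECTOR_SIZE == 0,
               "block size must be a multiple of the sector size");

/* Simplified stand-in for a cached block; not the real struct buf. */
struct sketch_buf {
  int valid;         /* nonzero once data[] holds the block's disk content */
  int dirty;         /* hypothetical flag: modified but not yet written */
  uint32_t blockno;  /* which file system block this buffer caches */
  unsigned char data[SKETCH_BSIZE];
};

/* First sector occupied by a given file system block. */
static uint64_t
sketch_first_sector(uint32_t blockno)
{
  return (uint64_t)blockno * (SKETCH_BSIZE / SKETCH_SECTOR_SIZE);
}

/* True when the in-memory copy cannot be assumed to match the disk. */
static int
sketch_out_of_sync(const struct sketch_buf *b)
{
  assert(b != 0);
  return !b->valid || b->dirty;
}
```

With these assumed sizes each block spans two sectors, so block 3 begins at sector 6.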
The file system must have a plan for where it stores inodes and
content blocks on the disk.
To do so, xv6 divides the disk into several
sections, as
Figure 8.2 shows.
The file system does not use
block 0 (it holds the boot sector). Block 1 is called the superblock;
it contains metadata about the file system (the file system size in blocks, the
number of data blocks, the number of inodes, and the number of blocks in the
log). Blocks starting at 2 hold the log. After the log are the inodes, with multiple inodes per block. After
those come bitmap blocks tracking which data blocks are in use.
The remaining blocks are data blocks; each is either marked
free in the bitmap block, or holds content for a file or directory.
The superblock is filled in by a separate program, called mkfs,
which builds an initial file system.
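
The layout just described can be worked out with a little arithmetic. The sketch below shows one way a program like mkfs might compute where each section starts; it is not taken from mkfs itself. The struct, its field names, and the function are hypothetical, and the choice to reserve one bitmap bit for every block on the disk is a simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description; field names are illustrative. */
struct sketch_layout {
  uint32_t logstart;    /* first log block */
  uint32_t inodestart;  /* first inode block */
  uint32_t bmapstart;   /* first bitmap block */
  uint32_t datastart;   /* first data block */
  uint32_t ndata;       /* number of data blocks */
};

/* Round-up integer division. */
static uint32_t
sketch_ceil_div(uint32_t a, uint32_t b)
{
  return (a + b - 1) / b;
}

/*
 * Compute section boundaries for a disk of `size` blocks.
 * Block 0 is the boot sector and block 1 the superblock, so the
 * log starts at block 2, followed by inodes, bitmap, and data.
 */
static struct sketch_layout
sketch_compute_layout(uint32_t size, uint32_t nlog, uint32_t ninodes,
                      uint32_t inodes_per_block, uint32_t bits_per_block)
{
  struct sketch_layout l;
  uint32_t ninodeblocks = sketch_ceil_div(ninodes, inodes_per_block);
  /* Assumption: one bitmap bit per block of the whole disk. */
  uint32_t nbitmap = sketch_ceil_div(size, bits_per_block);

  l.logstart = 2;
  l.inodestart = l.logstart + nlog;
  l.bmapstart = l.inodestart + ninodeblocks;
  l.datastart = l.bmapstart + nbitmap;
  assert(l.datastart <= size);  /* metadata must fit on the disk */
  l.ndata = size - l.datastart;
  return l;
}
```

For example, with made-up parameters of a 2000-block disk, a 30-block log, 200 inodes at 16 per block, and 8192 bitmap bits per block, the inodes start at block 32, the bitmap at block 45, and the 1954 data blocks at block 46.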
The rest of this chapter discusses each layer, starting with the buffer cache. Look out for situations where well-chosen abstractions at lower layers ease the design of higher ones.