8.10 Code: Inode content

Figure 8.3: The representation of a file on disk.

The on-disk inode structure, struct dinode, contains a size and an array of block numbers (see Figure 8.3). The inode data is found in the blocks listed in the dinode’s addrs array. The first NDIRECT blocks of data are listed in the first NDIRECT entries in the array; these blocks are called direct blocks. The next NINDIRECT blocks of data are listed not in the inode but in a data block called the indirect block. The last entry in the addrs array gives the address of the indirect block. Thus the first 12 kB (NDIRECT × BSIZE) of a file can be loaded from blocks listed in the inode, while the next 256 kB (NINDIRECT × BSIZE) can only be loaded after consulting the indirect block. This is a good on-disk representation but a complex one for clients. The function bmap manages the representation so that higher-level routines, such as readi and writei, which we will see shortly, do not need to manage this complexity. bmap returns the disk block number of the bn’th data block for the inode ip. If ip does not have such a block yet, bmap allocates one.
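The size arithmetic above can be checked with a small sketch. The constant values (BSIZE of 1024 bytes, 12 direct entries, 4-byte block numbers) are the ones implied by the 12 kB and 256 kB figures in the text; the struct dinode_sketch type and the classify helper are hypothetical names used only for illustration, and the struct shows just the two fields this section discusses.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE     1024                        /* bytes per block */
#define NDIRECT   12                          /* direct entries in the inode */
#define NINDIRECT (BSIZE / sizeof(uint32_t))  /* entries in the indirect block */
#define MAXFILE   (NDIRECT + NINDIRECT)       /* largest file, in blocks */

/* One indirect block holds 256 four-byte block numbers. */
static_assert(NINDIRECT == 256, "indirect block should hold 256 entries");

/* Illustrative subset of the on-disk inode: only the fields discussed here. */
struct dinode_sketch {
  uint32_t size;                 /* file size in bytes */
  uint32_t addrs[NDIRECT + 1];   /* NDIRECT direct blocks + 1 indirect block */
};

enum region { DIRECT, INDIRECT, OUT_OF_RANGE };

/* Report where the block containing byte offset off is listed. */
enum region classify(uint32_t off) {
  uint32_t bn = off / BSIZE;     /* logical block number within the file */
  if (bn < NDIRECT)
    return DIRECT;               /* listed in the inode itself */
  if (bn < MAXFILE)
    return INDIRECT;             /* listed in the indirect block */
  return OUT_OF_RANGE;           /* beyond the maximum file size */
}
```

With these values, byte 12287 is the last byte reachable through a direct block, and byte 274431 (268 × 1024 − 1) is the last byte the representation can address at all.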

The function bmap (kernel/fs.c:383) begins by picking off the easy case: the first NDIRECT blocks are listed in the inode itself (kernel/fs.c:388-396). The next NINDIRECT blocks are listed in the indirect block at ip->addrs[NDIRECT]. bmap reads the indirect block (kernel/fs.c:407) and then reads a block number from the right position within the block (kernel/fs.c:408). If the block number is NDIRECT+NINDIRECT or greater, bmap panics; writei contains the check that prevents this from happening (kernel/fs.c:513).

bmap allocates blocks as needed. An ip->addrs[] or indirect entry of zero indicates that no block is allocated. As bmap encounters zeros, it replaces them with the numbers of fresh blocks, allocated on demand (kernel/fs.c:389-390, kernel/fs.c:401-402).
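The lookup and allocate-on-demand behavior can be sketched against a toy in-memory disk. This is not the kernel's code: toy_bmap, toy_balloc, the disk array, and NBLOCKS are hypothetical stand-ins, there is no buffer cache or logging, and an out-of-range block number returns 0 here where the text says bmap panics.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BSIZE     1024
#define NDIRECT   12
#define NINDIRECT (BSIZE / sizeof(uint32_t))
#define NBLOCKS   512                       /* toy disk size (assumption) */

static uint32_t disk[NBLOCKS][NINDIRECT];   /* each block viewed as 256 words */
static uint32_t next_free = 1;              /* block 0 is never handed out,
                                               so 0 can mean "unallocated" */

struct inode { uint32_t size; uint32_t addrs[NDIRECT + 1]; };

/* Hypothetical allocator: hand out the next block, zeroed. */
static uint32_t toy_balloc(void) {
  assert(next_free < NBLOCKS);
  memset(disk[next_free], 0, BSIZE);
  return next_free++;
}

/* Return the disk block for logical block bn, allocating on demand.
   Returns 0 if bn is out of range (the kernel version panics instead). */
uint32_t toy_bmap(struct inode *ip, uint32_t bn) {
  if (bn < NDIRECT) {                       /* easy case: direct block */
    if (ip->addrs[bn] == 0)
      ip->addrs[bn] = toy_balloc();
    return ip->addrs[bn];
  }
  bn -= NDIRECT;
  if (bn < NINDIRECT) {                     /* listed in the indirect block */
    if (ip->addrs[NDIRECT] == 0)
      ip->addrs[NDIRECT] = toy_balloc();    /* allocate the indirect block */
    uint32_t *slots = disk[ip->addrs[NDIRECT]];
    if (slots[bn] == 0)
      slots[bn] = toy_balloc();             /* allocate the data block */
    return slots[bn];
  }
  return 0;
}
```

Calling toy_bmap twice with the same bn returns the same block number, because the first call fills in the zero entry and the second call finds it already set.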

itrunc frees a file’s blocks, resetting the inode’s size to zero. itrunc (kernel/fs.c:426) starts by freeing the direct blocks (kernel/fs.c:432-437), then the ones listed in the indirect block (kernel/fs.c:442-445), and finally the indirect block itself (kernel/fs.c:447-448).
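The order in which itrunc releases blocks can be illustrated with a similar toy model. The names toy_itrunc, toy_bfree, and in_use are hypothetical, and the in_use array stands in for whatever free-block bookkeeping the real file system uses; this is a sketch of the three steps described above, not the kernel routine.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE     1024
#define NDIRECT   12
#define NINDIRECT (BSIZE / sizeof(uint32_t))
#define NBLOCKS   512                       /* toy disk size (assumption) */

static uint32_t disk[NBLOCKS][NINDIRECT];   /* each block viewed as 256 words */
static uint8_t  in_use[NBLOCKS];            /* 1 while a block is allocated */

struct inode { uint32_t size; uint32_t addrs[NDIRECT + 1]; };

/* Hypothetical free routine: mark block b as available again. */
static void toy_bfree(uint32_t b) {
  assert(b != 0 && b < NBLOCKS && in_use[b]);
  in_use[b] = 0;
}

void toy_itrunc(struct inode *ip) {
  /* 1. Free the direct blocks. */
  for (int i = 0; i < NDIRECT; i++) {
    if (ip->addrs[i]) {
      toy_bfree(ip->addrs[i]);
      ip->addrs[i] = 0;
    }
  }
  if (ip->addrs[NDIRECT]) {
    /* 2. Free every block listed in the indirect block. */
    uint32_t *slots = disk[ip->addrs[NDIRECT]];
    for (uint32_t j = 0; j < NINDIRECT; j++)
      if (slots[j])
        toy_bfree(slots[j]);
    /* 3. Free the indirect block itself. */
    toy_bfree(ip->addrs[NDIRECT]);
    ip->addrs[NDIRECT] = 0;
  }
  ip->size = 0;                             /* the file is now empty */
}
```

The indirect block must be freed last, since its contents are needed to find the data blocks it lists.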

bmap makes it easy for readi and writei to get at an inode’s data. readi (kernel/fs.c:472) starts by making sure that the offset and count are not beyond the end of the file. Reads that start beyond the end of the file return an error (kernel/fs.c:477-478) while reads that start at or cross the end of the file return fewer bytes than requested (kernel/fs.c:479-480). The main loop processes each block of the file, copying data from the buffer into dst (kernel/fs.c:482-494). writei (kernel/fs.c:506) is identical to readi, with three exceptions: writes that start at or cross the end of the file grow the file, up to the maximum file size (kernel/fs.c:513-514); the loop copies data into the buffers instead of out (kernel/fs.c:522); and if the write has extended the file, writei must update its size (kernel/fs.c:530-531).
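The bounds checks and block-by-block loop of readi can be sketched as follows. To keep the example short, the file's blocks are stored contiguously in a hypothetical struct toy_file rather than being found through bmap, and toy_readi returns -1 for the error case described above; these names and the four-block capacity are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BSIZE 1024
#define NBLK  4                        /* toy file capacity in blocks (assumption) */

struct toy_file {
  uint32_t size;                       /* file size in bytes */
  uint8_t  blocks[NBLK][BSIZE];        /* data blocks, already "mapped" */
};

/* Copy up to n bytes starting at offset off into dst.
   Returns the number of bytes copied, or -1 on error. */
int toy_readi(const struct toy_file *f, uint8_t *dst, uint32_t off, uint32_t n) {
  assert(f->size <= NBLK * BSIZE);
  if (off > f->size || off + n < off)  /* starts past EOF, or off + n wraps */
    return -1;
  if (off + n > f->size)               /* read crosses EOF: shorten it */
    n = f->size - off;

  uint32_t tot = 0;
  while (tot < n) {
    uint32_t bn = off / BSIZE;         /* which block holds this offset */
    uint32_t in_block = off % BSIZE;   /* position inside that block */
    uint32_t m = BSIZE - in_block;     /* bytes left in this block */
    if (m > n - tot)
      m = n - tot;                     /* do not copy more than requested */
    memcpy(dst, &f->blocks[bn][in_block], m);
    tot += m;
    off += m;
    dst += m;
  }
  return (int)tot;
}
```

A write routine in the same style would reverse the direction of the memcpy, allow the range to extend past the current size up to the maximum, and update size afterwards if the file grew.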

The function stati (kernel/fs.c:458) copies inode metadata into the stat structure, which is exposed to user programs via the stat system call.
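As a rough illustration, such a copy routine amounts to a field-by-field assignment. The struct layouts and field names below (toy_inode, toy_stat, dev, inum, ino, type, nlink, size) are assumptions modeled loosely on typical inode metadata; the text above does not list the actual fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory inode: only the metadata fields being exported. */
struct toy_inode {
  uint32_t dev;     /* device number */
  uint32_t inum;    /* inode number */
  int16_t  type;    /* file type */
  int16_t  nlink;   /* number of links to the inode */
  uint32_t size;    /* file size in bytes */
};

/* Hypothetical user-visible stat structure. */
struct toy_stat {
  int      dev;
  uint32_t ino;
  int16_t  type;
  int16_t  nlink;
  uint64_t size;
};

/* Copy metadata from the inode into the stat structure. */
void toy_stati(const struct toy_inode *ip, struct toy_stat *st) {
  assert(ip != 0 && st != 0);
  st->dev   = (int)ip->dev;
  st->ino   = ip->inum;
  st->type  = ip->type;
  st->nlink = ip->nlink;
  st->size  = ip->size;
}
```

Keeping a separate stat structure lets the kernel expose metadata without revealing fields such as the block addresses to user programs.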