1.2 I/O and File descriptors
A file descriptor is a small integer representing a kernel-managed object that a process may read from or write to. A process may obtain a file descriptor by opening a file, directory, or device, or by creating a pipe, or by duplicating an existing descriptor. For simplicity we’ll often refer to the object a file descriptor refers to as a “file”; the file descriptor interface abstracts away the differences between files, pipes, and devices, making them all look like streams of bytes. We’ll refer to input and output as I/O.
Internally, the xv6 kernel uses the file descriptor as an index into a per-process table, so that every process has a private space of file descriptors starting at zero. By convention, a process reads from file descriptor 0 (standard input), writes output to file descriptor 1 (standard output), and writes error messages to file descriptor 2 (standard error). As we will see, the shell exploits the convention to implement I/O redirection and pipelines. The shell ensures that it always has three file descriptors open (user/sh.c:152), which are by default file descriptors for the console.
The read and write system calls read bytes from and write bytes to open files named by file descriptors. The call read(fd, buf, n) reads at most n bytes from the file descriptor fd, copies them into buf, and returns the number of bytes read. Each file descriptor that refers to a file has an offset associated with it. read reads data from the current file offset and then advances that offset by the number of bytes read: a subsequent read will return the bytes following the ones returned by the first read. When there are no more bytes to read, read returns zero to indicate the end of the file.
The call write(fd, buf, n) writes n bytes from buf to the file descriptor fd and returns the number of bytes written. Fewer than n bytes are written only when an error occurs. Like read, write writes data at the current file offset and then advances that offset by the number of bytes written: each write picks up where the previous one left off.
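The offset behavior of read and write can be checked with a small experiment. The following is a minimal sketch written for a POSIX system rather than xv6, which is an assumption: the POSIX open takes a third mode argument and spells the create flag O_CREAT. The helper name offset_demo and the scratch path passed to it are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: writes two chunks to `path`, then reads them back in
// two steps. Returns 0 if the offsets behaved as described, -1 otherwise.
int offset_demo(const char *path)
{
    char buf[4];

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    // The second write lands right after the first: the offset advanced by 3.
    if (write(fd, "abc", 3) != 3 || write(fd, "def", 3) != 3) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    // The first read returns bytes 0..2, the next read returns bytes 3..5,
    // and a third read finds nothing left and reports end of file with 0.
    int ok = read(fd, buf, 3) == 3 && memcmp(buf, "abc", 3) == 0 &&
             read(fd, buf, 3) == 3 && memcmp(buf, "def", 3) == 0 &&
             read(fd, buf, 3) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```

Calling offset_demo on a writable scratch path should return 0 if each call continues where the previous one stopped.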
The following program fragment (which forms the essence of the program cat) copies data from its standard input to its standard output. If an error occurs, it writes a message to the standard error.
    char buf[512];
    int n;

    for(;;){
      n = read(0, buf, sizeof buf);
      if(n == 0)
        break;
      if(n < 0){
        fprintf(2, "read error\n");
        exit(1);
      }
      if(write(1, buf, n) != n){
        fprintf(2, "write error\n");
        exit(1);
      }
    }
The important thing to note in the code fragment is that cat doesn’t know whether it is reading from a file, the console, or a pipe. Similarly, cat doesn’t know whether it is printing to the console, a file, or something else. The use of file descriptors and the convention that file descriptor 0 is input and file descriptor 1 is output allows a simple implementation of cat.
The close system call releases a file descriptor, making it free for reuse by a future open, pipe, or dup system call (see below). A newly allocated file descriptor is always the lowest-numbered unused descriptor of the current process.
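The lowest-numbered-descriptor rule is easy to observe directly. This is a minimal sketch for a POSIX system (an assumption; it also assumes /dev/null exists and that the process is single-threaded). The helper name reuses_lowest_fd is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if a freshly opened descriptor reuses the
// number that was just closed, 0 if it does not, and -1 on error.
int reuses_lowest_fd(void)
{
    int a = open("/dev/null", O_RDONLY);   // lowest unused number right now
    int b = open("/dev/null", O_RDONLY);   // next lowest, so b > a
    if (a < 0 || b < 0) {
        if (a >= 0)
            close(a);
        if (b >= 0)
            close(b);
        return -1;
    }

    close(a);                              // a is now the lowest free number
    int c = open("/dev/null", O_RDONLY);   // so the new descriptor should be a

    int result = (c == a);
    if (c >= 0)
        close(c);
    close(b);
    return result;
}
```

Keeping b open while a is closed is what makes the check meaningful: the new descriptor fills the gap below b instead of taking a number above it.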
File descriptors and fork interact to make I/O redirection easy to implement. fork copies the parent’s file descriptor table along with its memory, so that the child starts with exactly the same open files as the parent. The system call exec replaces the calling process’s memory but preserves its file table. This behavior allows the shell to implement I/O redirection by forking, re-opening chosen file descriptors in the child, and then calling exec to run the new program. Here is a simplified version of the code a shell runs for the command cat < input.txt:
    char *argv[2];

    argv[0] = "cat";
    argv[1] = 0;
    if(fork() == 0) {
      close(0);
      open("input.txt", O_RDONLY);
      exec("cat", argv);
    }
After the child closes file descriptor 0, open is guaranteed to use that file descriptor for the newly opened input.txt: 0 will be the smallest available file descriptor. cat then executes with file descriptor 0 (standard input) referring to input.txt. The parent process’s file descriptors are not changed by this sequence, since it modifies only the child’s descriptors.
The code for I/O redirection in the xv6 shell works in exactly this way (user/sh.c:83). Recall that at this point in the code the shell has already forked the child shell and that runcmd will call exec to load the new program.
The second argument to open consists of a set of flags, expressed as bits, that control what open does. The possible values are defined in the file control (fcntl) header (kernel/fcntl.h:1-5): O_RDONLY, O_WRONLY, O_RDWR, O_CREATE, and O_TRUNC, which instruct open to open the file for reading, or for writing, or for both reading and writing, to create the file if it doesn’t exist, and to truncate the file to zero length.
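Because the flags are individual bits, they are combined with bitwise OR. The sketch below shows a typical combination on a POSIX system, which is an assumption: POSIX spells the create flag O_CREAT where xv6 uses O_CREATE, and its open takes a permission mode as a third argument. The helper name overwrite_file is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: writes `len` bytes of `data` to `path`, creating the
// file if needed and discarding any previous contents.
// Returns the resulting file size in bytes, or -1 on error.
long overwrite_file(const char *path, const char *data, size_t len)
{
    // O_WRONLY: open for writing.
    // O_CREAT:  create the file if it does not exist (O_CREATE in xv6).
    // O_TRUNC:  cut an existing file down to zero length first.
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    struct stat st;
    int rc = fstat(fd, &st);   // ask the kernel how large the file is now
    close(fd);
    return rc == 0 ? (long)st.st_size : -1;
}
```

Writing a long string and then a short one to the same path should leave only the short one behind, since O_TRUNC empties the file on the second open.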
Now it should be clear why it is helpful that fork and exec are separate calls: between the two, the shell has a chance to redirect the child’s I/O without disturbing the I/O setup of the main shell. One could instead imagine a hypothetical combined forkexec system call, but the options for doing I/O redirection with such a call seem awkward. The shell could modify its own I/O setup before calling forkexec (and then undo those modifications); or forkexec could take instructions for I/O redirection as arguments; or (least attractively) every program like cat could be taught to do its own I/O redirection.
Although fork copies the file descriptor table, each underlying file offset is shared between parent and child. Consider this example:
    if(fork() == 0) {
      write(1, "hello ", 6);
      exit(0);
    } else {
      wait(0);
      write(1, "world\n", 6);
    }
At the end of this fragment, the file attached to file descriptor 1 will contain the data hello world. The write in the parent (which, thanks to wait, runs only after the child is done) picks up where the child’s write left off. This behavior helps produce sequential output from sequences of shell commands, like (echo hello; echo world) >output.txt.
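The same experiment can be made checkable by writing to a regular file instead of file descriptor 1 and asking where the offset ended up. This is a minimal POSIX sketch, not xv6 code: waitpid, lseek, and _exit are POSIX calls assumed here, and the helper name fork_shared_offset is hypothetical.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child writes "hello ", then the parent writes
// "world\n" through the same inherited descriptor.
// Returns the parent's final file offset, or -1 on error.
long fork_shared_offset(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        // Child: this write advances the offset it shares with the parent.
        _exit(write(fd, "hello ", 6) == 6 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    // Parent: continues at offset 6 rather than overwriting from offset 0.
    if (write(fd, "world\n", 6) != 6) {
        close(fd);
        return -1;
    }
    long end = (long)lseek(fd, 0, SEEK_CUR);   // current offset after both writes
    close(fd);
    return end;
}
```

If the offset were private to each process, the parent would start at 0 and finish at offset 6. With a shared offset it finishes at 12, and the file holds both strings in order.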
The dup system call duplicates an existing file descriptor, returning a new one that refers to the same underlying I/O object. Both file descriptors share an offset, just as the file descriptors duplicated by fork do. This is another way to write hello world into a file:
    fd = dup(1);
    write(1, "hello ", 6);
    write(fd, "world\n", 6);
Two file descriptors share an offset if they were derived from the same original file descriptor by a sequence of fork and dup calls. Otherwise file descriptors do not share offsets, even if they resulted from open calls for the same file.
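The difference between a duplicated descriptor and a second open of the same file can be seen by comparing offsets. The sketch below assumes a POSIX system and uses lseek to query the current offset, a call that xv6 does not provide. The helper name compare_offsets is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: after writing 6 bytes through one descriptor, reports
// the offset seen through a dup'ed descriptor and through a separate open.
// Returns 0 on success, -1 on error.
int compare_offsets(const char *path, long *dup_offset, long *open_offset)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int copy = dup(fd);                 // same open file, shared offset
    int other = open(path, O_RDONLY);   // independent open, its own offset

    int rc = -1;
    if (copy >= 0 && other >= 0 && write(fd, "hello ", 6) == 6) {
        *dup_offset = (long)lseek(copy, 0, SEEK_CUR);    // follows fd
        *open_offset = (long)lseek(other, 0, SEEK_CUR);  // still at the start
        rc = 0;
    }

    if (copy >= 0)
        close(copy);
    if (other >= 0)
        close(other);
    close(fd);
    return rc;
}
```

After a successful call, the dup'ed descriptor should report offset 6 while the separately opened one still reports 0.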
dup allows shells to implement commands like this: ls existing-file non-existing-file > tmp1 2>&1. The 2>&1 tells the shell to give the command a file descriptor 2 that is a duplicate of descriptor 1. Both the name of the existing file and the error message for the non-existing file will show up in the file tmp1. The xv6 shell doesn’t support I/O redirection for the error file descriptor, but now you know how to implement it.
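One way a shell might implement > file 2>&1 is to combine the close-then-open trick with close-then-dup in the child. The following is a sketch for a POSIX system, not code from the xv6 shell. It assumes file descriptor 0 is open in the calling process, and it writes two fixed strings where a real shell would call exec. The helper name merged_output_demo is hypothetical.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs a child whose standard output goes to `path` and
// whose standard error duplicates standard output, then copies the file's
// contents into `out`. Returns the number of bytes read back, or -1 on error.
int merged_output_demo(const char *path, char *out, int outsize)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(1);                                    // "> path"
        if (open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600) != 1)
            _exit(1);
        close(2);                                    // "2>&1"
        if (dup(1) != 2)                             // 2 is the lowest free fd
            _exit(1);
        // A real shell would exec the command here; two writes stand in.
        if (write(1, "out\n", 4) != 4 || write(2, "err\n", 4) != 4)
            _exit(1);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int n = (int)read(fd, out, outsize - 1);
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

Because descriptors 1 and 2 share one offset after the dup, the error text is appended after the regular output instead of overwriting it.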
File descriptors are a powerful abstraction, because they hide the details of what they are connected to: a process writing to file descriptor 1 may be writing to a file, to a device like the console, or to a pipe.