7.8 Code: Wait, exit, and kill
sleep and wakeup can be used for many kinds of waiting.
An interesting example, introduced in Chapter 1,
is the interaction between a child’s exit
and its parent’s wait.
At the time of the child’s death, the parent may already
be sleeping in wait, or may be doing something else;
in the latter case, a subsequent call to wait must
observe the child’s death, perhaps long after the child calls exit.
The way that xv6 records the child’s demise until wait
observes it is for exit to put the caller into the ZOMBIE
state, where it stays until the parent’s wait notices it, changes
the child’s state to UNUSED, copies the child’s exit status,
and returns the child’s process ID to the parent.
If the parent exits before the child, the
parent gives the child to the init process,
which perpetually calls wait;
thus every child has a parent to clean up after it.
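Seen from user space, this means a parent can collect a child's status well after the child has exited. The following is a minimal sketch using the POSIX interface rather than xv6's own (xv6's wait takes only a status address); the helper name run_child_and_reap and the busy loop standing in for "doing something else" are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, let the
   parent do unrelated work, then collect the child's exit status.
   Returns the collected status, or -1 on error. */
int run_child_and_reap(int code) {
  assert(code >= 0 && code <= 255);  /* exit statuses are 8 bits wide */

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit(code);                     /* child: terminate immediately */

  /* Parent: do something else first. The child may already have
     exited; the kernel keeps its status until the parent asks. */
  volatile unsigned long busy = 0;
  for (unsigned long i = 0; i < 1000000UL; i++)
    busy += i;

  int status = 0;
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  if (!WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

Whether the child exits before or after the parent reaches waitpid, the parent should see the same status, which is the property the ZOMBIE state exists to provide.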
A challenge is to avoid races and deadlock between
simultaneous parent and child wait and exit,
as well as simultaneous exit and exit.
wait starts by acquiring wait_lock (kernel/proc.c:391),
which acts as the condition lock that helps ensure that wait
doesn’t miss a wakeup from an exiting child.
Then wait scans the process table.
If it finds a child in ZOMBIE state,
it frees that child’s resources and its proc structure, copies
the child’s exit status to the address supplied to wait
(if it is not 0),
and returns the child’s process ID.
If wait finds children but none have exited,
it calls sleep to wait for any of them to exit
(kernel/proc.c:433), then scans again.
wait often holds two locks, wait_lock and some process’s pp->lock;
the deadlock-avoiding order is first wait_lock and then pp->lock.
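One pass of the scan described above can be modeled in ordinary user-space C. This is a sketch under stated assumptions, not xv6's code: the toy lock is a plain flag, and the names mlock, mproc, wait_scan_once, NPROC_SKETCH, and WOULD_SLEEP are hypothetical. Where the real wait would call sleep and rescan, the model simply returns WOULD_SLEEP.

```c
#include <assert.h>
#include <stddef.h>

enum pstate { UNUSED, RUNNING, ZOMBIE };

struct mlock { int held; };            /* toy lock: just a flag */

struct mproc {
  struct mlock lock;
  enum pstate state;
  int pid;
  int xstate;                          /* exit status, valid once ZOMBIE */
  struct mproc *parent;
};

#define NPROC_SKETCH 4
#define WOULD_SLEEP (-2)  /* hypothetical: the real wait would sleep here */

struct mlock wait_lock_sketch;
struct mproc ptable[NPROC_SKETCH];

void m_acquire(struct mlock *l) { assert(!l->held); l->held = 1; }
void m_release(struct mlock *l) { assert(l->held);  l->held = 0; }

/* One pass of the scan. Returns the reaped child's pid, -1 if the
   caller has no children, or WOULD_SLEEP if it has children but none
   has exited yet. */
int wait_scan_once(struct mproc *p, int *status) {
  int havekids = 0;

  m_acquire(&wait_lock_sketch);        /* first wait_lock ... */
  for (int i = 0; i < NPROC_SKETCH; i++) {
    struct mproc *pp = &ptable[i];
    if (pp->parent != p)
      continue;
    m_acquire(&pp->lock);              /* ... then pp->lock */
    havekids = 1;
    if (pp->state == ZOMBIE) {
      int pid = pp->pid;
      if (status != NULL)              /* a null address means "don't care" */
        *status = pp->xstate;
      pp->state = UNUSED;              /* stands in for freeing the proc */
      pp->parent = NULL;
      m_release(&pp->lock);
      m_release(&wait_lock_sketch);
      return pid;
    }
    m_release(&pp->lock);
  }
  m_release(&wait_lock_sketch);
  return havekids ? WOULD_SLEEP : -1;
}
```

The assertions inside m_acquire and m_release are only a cheap way to catch a double acquire or a stray release in the model; they do not provide mutual exclusion.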
exit (kernel/proc.c:347) records the exit
status, frees some resources, calls reparent to give its
children to the init process, wakes up the parent in case
it is in wait, marks the caller as a zombie, and
permanently yields the CPU.
exit holds both wait_lock and p->lock during this sequence.
It holds wait_lock because it’s the condition
lock for the wakeup(p->parent), preventing a parent in
wait from losing the wakeup.
exit must hold p->lock for this sequence also, to prevent a parent in
wait from seeing that the child is in state ZOMBIE
before the child has finally called swtch.
exit acquires these locks in the same order as wait
to avoid deadlock.
It may look incorrect for exit to wake up the parent
before setting its state to ZOMBIE, but that is safe:
although wakeup may cause the parent to run,
the loop in wait cannot examine the child until the child’s
p->lock is released by scheduler,
so wait can’t look at the exiting process until well after
exit has set its state to ZOMBIE (kernel/proc.c:379).
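The safety argument can be checked with a small single-threaded model. This is a sketch resting on a simplifying assumption: a woken parent can examine the child only at instants when neither wait_lock nor the child's p->lock is held. The struct exit_model, the probe function, and the exact step order are hypothetical, chosen to be consistent with the description above rather than copied from xv6.

```c
#include <assert.h>

enum pstate { RUNNING, ZOMBIE };

struct exit_model {
  int wait_lock_held;
  int child_lock_held;
  int parent_woken;
  enum pstate child_state;
  int premature_view;  /* set if the parent could see a non-ZOMBIE child */
};

/* Called after every step: what could a woken parent observe now? */
void probe(struct exit_model *m) {
  if (!m->parent_woken)
    return;                            /* parent is still asleep */
  if (m->wait_lock_held || m->child_lock_held)
    return;                            /* parent is blocked on a lock */
  if (m->child_state != ZOMBIE)
    m->premature_view = 1;
}

/* Run the steps of exit, with or without the two locks. Returns 1 if
   the parent could have examined the child before it became a ZOMBIE. */
int exit_sequence(int use_locks) {
  assert(use_locks == 0 || use_locks == 1);
  struct exit_model m = { 0, 0, 0, RUNNING, 0 };

  m.wait_lock_held = use_locks;   probe(&m);  /* acquire wait_lock    */
  m.parent_woken = 1;             probe(&m);  /* wakeup(p->parent)    */
  m.child_lock_held = use_locks;  probe(&m);  /* acquire p->lock      */
  m.child_state = ZOMBIE;         probe(&m);  /* mark as zombie       */
  m.wait_lock_held = 0;           probe(&m);  /* release wait_lock    */
  m.child_lock_held = 0;          probe(&m);  /* released after swtch */
  return m.premature_view;
}
```

With the locks, exit_sequence(1) reports no premature view even though the wakeup precedes the state change; without them, exit_sequence(0) reports that the parent could have looked too early.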
While exit allows a process to terminate itself,
kill (kernel/proc.c:598)
lets one process request that another terminate.
It would be too complex for kill
to directly destroy the victim process, since the victim
might be executing on another CPU, perhaps
in the middle of a sensitive sequence of updates to kernel data structures.
Thus kill does very little: it just sets the victim’s
p->killed and, if it is sleeping, wakes it up.
Eventually the victim will enter or leave the kernel,
at which point code in usertrap will call exit
if p->killed is set
(it checks by calling killed (kernel/proc.c:627)).
If the victim is running in user space, it will soon enter
the kernel by making a system call or because the timer (or
some other device) interrupts.
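The division of labor between kill and the trap path can be sketched as follows. This is a user-space model only: the struct layout, the names model_kill and model_trap_boundary, and the exit status of -1 are assumptions for illustration, and locking is omitted.

```c
#include <assert.h>

enum pstate { RUNNING, SLEEPING, RUNNABLE, ZOMBIE };

struct mproc {
  int pid;
  enum pstate state;
  int killed;
  int xstate;
};

/* kill does very little: set the flag and, if the victim is sleeping,
   make it runnable. Returns 0 on success, -1 if no such process. */
int model_kill(struct mproc *tab, int n, int pid) {
  assert(tab && n >= 0);
  for (int i = 0; i < n; i++) {
    if (tab[i].pid == pid && tab[i].state != ZOMBIE) {
      tab[i].killed = 1;
      if (tab[i].state == SLEEPING)
        tab[i].state = RUNNABLE;       /* wake it from sleep */
      return 0;
    }
  }
  return -1;
}

/* Stand-in for the check made when the victim enters or leaves the
   kernel: the victim terminates itself if the flag is set. */
void model_trap_boundary(struct mproc *p) {
  if (p->killed && p->state != ZOMBIE) {
    p->xstate = -1;                    /* assumed status for a killed process */
    p->state = ZOMBIE;
  }
}
```

Note that model_kill never changes a running victim's state; the victim is the one that performs the exit, at a point where it is known not to be in the middle of a sensitive update.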
If the victim process is in sleep,
kill’s call to wakeup
will cause the victim to return from sleep.
This is potentially dangerous because
the condition being waited for may not be true.
However, xv6 calls to sleep are always wrapped in a
while loop that re-tests the condition after sleep returns.
Some calls to sleep also test p->killed
in the loop, and abandon the current activity if it is set.
This is only done when such abandonment would be correct.
For example, the pipe read and write code (kernel/pipe.c:84)
returns if the killed flag is set; eventually the
code will return back to trap, which will again
check p->killed and exit.
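The shape of such a loop can be sketched in user space. This is a model, not the pipe code itself: sleep is replaced by a scripted callback that decides what happens while the process is "asleep", and the names mpipe, read_wait, wake_fn, spurious_then_data, and wake_by_kill are hypothetical.

```c
#include <assert.h>

struct mpipe { int nread, nwrite; };   /* empty when the two are equal */
struct mproc { int killed; };

/* Stand-in for sleep(): called once per sleep, returns on "wakeup". */
typedef void (*wake_fn)(struct mpipe *, struct mproc *, int round);

/* Script 1: the first wakeup is spurious, the second follows a write. */
void spurious_then_data(struct mpipe *pi, struct mproc *pr, int round) {
  (void)pr;
  if (round == 1)
    pi->nwrite += 3;
}

/* Script 2: the wakeup comes from kill; no data has arrived. */
void wake_by_kill(struct mpipe *pi, struct mproc *pr, int round) {
  (void)pi; (void)round;
  pr->killed = 1;
}

/* Wait until the pipe has data. Returns the number of bytes available,
   or -1 if the process was killed while waiting. */
int read_wait(struct mpipe *pi, struct mproc *pr, wake_fn on_sleep,
              int *sleeps) {
  assert(pi && pr && on_sleep && sleeps);
  int round = 0;
  while (pi->nread == pi->nwrite) {    /* re-test after every wakeup */
    if (pr->killed)
      return -1;                       /* abandoning a read is safe */
    on_sleep(pi, pr, round++);         /* may return for any reason */
  }
  *sleeps = round;
  return pi->nwrite - pi->nread;
}
```

With spurious_then_data the loop sleeps twice before it sees data, because the first wakeup leaves the condition false; with wake_by_kill it returns -1 on the next iteration instead of sleeping again.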
Some xv6 sleep loops do not check p->killed
because the code is in the middle of a multi-step
system call that should be atomic.
The virtio driver (kernel/virtio_disk.c:285)
is an example: it does not check p->killed
because a disk operation may be one of a set of
writes that are all needed in order for the file system to
be left in a correct state.
A process that is killed while waiting for disk I/O won’t
exit until it completes the current system call and
usertrap sees the killed flag.