Department of Engineering

IT Services

Fork and Exec

The fork system call in Unix creates a new process. The new process inherits various properties from its parent (environment variables, file descriptors, etc.; see the manual page for details). After a successful fork call, two copies of the original code will be running. In the original process (the parent) the return value of fork will be the process ID of the child. In the new child process the return value of fork will be 0. If fork fails, it returns -1 in the parent and no child is created. Here's a simple example where the child sleeps for 2 seconds while the parent waits for the child process to exit. Note how the return value of fork is used to control which code is run by the parent and which by the child.

[fork]
#include <unistd.h>
#include <sys/wait.h>
#include <cstdlib>   // for exit()
#include <iostream>
using namespace std;
int main(){
  pid_t pid;
  int status, died;
     switch(pid=fork()){
     case -1: cout << "can't fork\n";
              exit(-1);
     case 0 : sleep(2); // this is the code the child runs
              exit(3); 
     default: died= wait(&status); // this is the code the parent runs 
     }
}

In the following annotated example the parent process queries the child process in more detail, determining whether the child exited normally or not. To make things interesting the parent kills the child process if the latter's PID is odd, so if you run the program a few times expect behaviour to vary.

#include <unistd.h>
#include <sys/wait.h>
#include <signal.h>
#include <cstdlib>   // for exit()
#include <iostream>
using namespace std;

int main(){
   pid_t pid;
   int status, died;
   switch(pid=fork()){
   case -1: cout << "can't fork\n";
            exit(-1);
   case 0 : cout << "   I'm the child of PID " << getppid() << ".\n";
            cout << "   My PID is " <<  getpid() << endl;
            sleep(2);
            exit(3);
   default: cout << "I'm the parent.\n";
            cout << "My PID is " <<  getpid() << endl;
            // kill the child if its PID is odd (roughly half of all runs)
            if (pid & 1)
               kill(pid,SIGKILL);
            died= wait(&status);
            if(WIFEXITED(status))
               cout << "The child, pid=" << pid << ", has returned " 
                    << WEXITSTATUS(status) << endl;
            else
               cout << "The child process was sent a " 
                    << WTERMSIG(status) << " signal\n";
  }
}

In the examples above, the new process is running the same program as the parent (though it's running different parts of it). Often, however, you want the new process to run a new program. When, for example, you type "date" on the Unix command line, the command line interpreter (the so-called "shell") forks so that momentarily two shells are running, then the code in the child process is replaced by the code of the "date" program by using one of the family of exec system calls. A successful exec call never returns, because the calling program has been replaced. Here's a simple example of how it's done.

#include <unistd.h>
#include <sys/wait.h>
#include <cstdlib>   // for exit()
#include <iostream>
using namespace std;

int main(){
   pid_t pid;
   int status, died;
   switch(pid=fork()){
   case -1: cout << "can't fork\n";
            exit(-1);
   case 0 : execl("/usr/bin/date","date",(char *)0); // this is the code the child runs 
            _exit(127); // only reached if execl fails
   default: died= wait(&status); // this is the code the parent runs
   }
}

The child process can communicate some information to its parent via the argument to exit, but this is rather restrictive: only a small integer gets through. Richer communication is possible if one takes advantage of the fact that the child inherits the parent's open file descriptors. The popen() function is the tidiest way to do this. The following code uses a more low-level method.

The pipe() system call creates a pipe, returning two file descriptors: the first opened for reading from the pipe and the second opened for writing to it. Both the parent and child process initially have access to both ends of the pipe. The code below closes the ends it doesn't need.

#include <unistd.h>
#include <sys/wait.h>
#include <cstdlib>   // for exit()
#include <iostream>
#include <sys/types.h>
using namespace std;
int main(){
 char str[1024], *cp;
 int pipefd[2];
 pid_t pid;
 int status, died;

  pipe (pipefd);
  switch(pid=fork()){
   case -1: cout << "can't fork\n";
            exit(-1);
   
   case 0 : // this is the code the child runs 
            close(1);      // close stdout
            // pipefd[1] is for writing to the pipe. We want the output
            // that used to go to the standard output (file descriptor 1)
            // to be written to the pipe. The following command does this,
            // creating a new file descriptor 1 (the lowest available) 
            // that writes where pipefd[1] goes.
            dup (pipefd[1]); // file descriptor 1 now refers to the pipe's write end
            // the child isn't going to read from the pipe, so
            // pipefd[0] can be closed; the original pipefd[1] is no
            // longer needed either
            close (pipefd[0]);
            close (pipefd[1]);
            execl ("/usr/bin/date","date",(char *)0);
            _exit(127); // only reached if execl fails
   default: // this is the code the parent runs 

            close(0); // close stdin
            // Set file descriptor 0 (stdin) to read from the pipe
            dup (pipefd[0]);
            // the parent isn't going to write to the pipe, and it now
            // reads from the pipe through file descriptor 0
            close (pipefd[1]);
            close (pipefd[0]);
            // Now read from the pipe
            cin.getline(str, 1023);
            cout << "The date is " << str << endl;
            died= wait(&status);
   }
}

In all these examples the parent process waits for the child to exit. If the parent doesn't wait, but exits before the child process does, then the child is adopted by another process (usually the one with PID 1). After a child exits (but before it's waited for) it becomes a "zombie": it no longer runs, but its entry stays in the list of processes so that its exit status can still be collected. If it's never waited for (because the parent process is hung, for example) it remains a zombie. Once the parent itself exits, its zombies are adopted too and are normally waited for and released, but sometimes they can only be removed from the list of processes by rebooting the machine. Though in small numbers they're harmless enough, avoiding them is a very good idea. Particularly if a process has many children, it's worth using waitpid() rather than wait(), so that the code waits for the right process. Some versions of Unix have wait2(), wait3() and wait4() variants which may be useful.

Double fork

One way to create a new process that is more isolated from the parent is to do the following

[double fork]

The original process doesn't have to wait around for the new process to die, and doesn't need to worry when it does.

Notes

  • The parent and child share the same code, and on many systems they initially share the same data too, read-only. Only when one of the processes tries to change the data is a copy made ("copy-on-write"). Some systems implement this by default in fork(). Elsewhere vfork() can be used to avoid the copy, but the parent is then suspended and the child should do little more than call one of the exec functions or _exit().
  • On some systems (notably Linux) there's a clone() system call. This lets the parent and child share more resources (it's used when implementing threads). Sometimes they may report the same PID and may differ only in their stack segments and processor register values.
  • YoLinux Tutorial
  • "Advanced Programming in the UNIX Environment", W.Richard Stevens, Addison-Wesley, ISBN 0-201-56317-7