Department of Engineering

IT Services

Input/Output

File I/O under Unix

Some file operations work on file pointers and some lower level ones use small integers called file descriptors (an index into a table of information about opened files).

The following code doesn't do anything useful, but it does use most of the file-handling routines. The manual pages describe how each routine reports errors. If a routine sets errno on error, perror can be called to print the error string corresponding to the error number, preceded by a string that the programmer provides as the argument to perror.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>    /* the man pages of the routines say which
                         include files need to be mentioned */
#include <unistd.h>   /* declares read and close */
#define TRUE 1
int bytes_read;
size_t  fp_bytes_read;
int fd;   /* File descriptors */
int fd2;
FILE *fp; /* File pointers */
FILE *fp2; 
char buffer[BUFSIZ]; /* BUFSIZ is set up in stdio.h */

int main(){

  /* Use File descriptors */
  fd = open ("/etc/group", O_RDONLY);
  if (fd == -1){
     perror("Opening /etc/group");
     exit(1);
  }

  while (TRUE){
     bytes_read = read (fd, buffer,BUFSIZ);
     if (bytes_read>0)
        printf("%d bytes read from /etc/group.\n", bytes_read);
     else{ 
        if (bytes_read==0){
          printf("End of file /etc/group reached\n");
          close(fd);
          break;
        }
        else if (bytes_read == -1){
          perror("Reading /etc/group");
          exit(1);
        }
     }
   }


 /* now use file pointers */
 fp = fopen("/etc/passwd","r");
 if (fp == NULL){
    printf("fopen failed to open /etc/passwd\n");
    exit(1);
  }

 while(TRUE){
    fp_bytes_read = fread (buffer, 1, BUFSIZ, fp);
    /* fp_bytes_read is a size_t, so print it with %zu, not %d */
    printf("%zu bytes read from /etc/passwd.\n", fp_bytes_read);
    if (fp_bytes_read==0)
       break;
 }

 rewind(fp); /* go back to the start of the file */

/* Find the descriptor associated with a stream */
  fd2 = fileno (fp);
  if (fd2 == -1)
    printf("fileno failed\n");

/* Find the stream associated with a descriptor */
  fp2 = fdopen (fd2, "r");
  if (fp2 == NULL)
    printf("fdopen failed\n");
  else
    fclose(fp2); /* also closes fd2, the descriptor underlying fp */
  return 0;
 }

To take advantage of Unix's I/O redirection it's often useful to write filters: programs that read from stdin and write to stdout. Under Unix, every process has stdin, stdout and stderr channels, and stdio.h associates these names with file pointers. The following program reads lines from stdin and writes them to stdout, prefixing each line with a line number. Errors are printed on stderr. fprintf takes the same arguments as printf except that you also specify a file pointer first, so fprintf(stdout,....) is equivalent to printf(....).

/* line_nums.c 
   Sample Usage :    line_nums < /etc/group
 */
#include <stdio.h>
#include <stdlib.h>
#define TRUE 1
int lineno = 0;
int error_flag = 0;
char buf[BUFSIZ]; /* BUFSIZ is defined in stdio.h */

int main(void){
  while(TRUE){
    if (fgets(buf,BUFSIZ, stdin) == NULL){
       if (ferror(stdin) != 0){
          fprintf(stderr,"Error during reading\n");
          error_flag = 1;
       }
       if (feof(stdin) != 0)
          fprintf(stderr,"File ended\n");
       clearerr(stdin);
       break; /* exit the while loop */
    }
    else{
       lineno++;
       /* in the next line, "%3d" pads the number to a minimum
          width of 3 characters; longer numbers are not truncated.
       */
       fprintf(stdout,"%3d: ", lineno);
       fputs(buf, stdout);
    }
  }

  fprintf(stderr,"%d lines written\n", lineno);
  exit(error_flag);
}

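ferror() and feof() are intended to clarify ambiguous return values. Here fgets() returns NULL both at end-of-file and on a read error, so the two calls show which occurred. With getw(), for instance, such double checking is essential, because the EOF it returns on failure is also a legitimate integer value that may appear in the data.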

Interactive

Output

For efficiency, writing to files under Unix is usually buffered, so printf(....) might not immediately produce bytes at stdout. Should your program crash soon after a printf() call, you might never see the output. If you want to force synchronous output you can

  • Use stderr (which is usually unbuffered) instead of stdout.
  • Use fflush(stdout) to flush out the standard output buffer.
  • Use setbuf(stdout,NULL) to stop standard output being buffered.

Input

scanf is a useful-looking routine for getting input. It looks for input of the format described in its first argument and puts the input into the variables pointed to by the succeeding arguments. It returns the number of input items successfully matched and assigned, or EOF if the input ends before the first item is converted.

Suppose you wanted the user to type in their surname and then their age. You could do this:

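#include <stdio.h>
#define TRUE 1
int age;
char name[50];
int return_val;
int main(void){
  printf("Type in your surname and age, then hit the Return key\n");
  while(TRUE){
    /* %49s stops scanf overflowing the 50-byte name array */
    return_val= scanf("%49s %d", name, &age);
    if (return_val == 2 || return_val == EOF)
       break;
    else
       printf("Sorry. Try Again\n");  
  }
  return 0;
}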

If you use scanf in this way to read user input directly, and the user types something other than what scanf() is expecting, scanf does not stop at the end of the line: it treats a newline as white space and keeps reading until all its conversions are satisfied, a character fails to match, or EOF is reached. Thus users can become very frustrated in this example if, say, they keep typing their name and then hitting Return. A better scheme is to store user input in an intermediate string and use sscanf(), which is like scanf() except that its first argument is the string to be scanned. E.g. in

...
int ret, x, y, z;
ret = sscanf(str,"x=%d y=%d z=%d", &x, &y, &z);
...

sscanf, given a string `x=3 y=7 z=89', will set the x, y, and z values accordingly and ret will be set to 3, showing that 3 values have been scanned. If str is `x=1 y=4', sscanf will return 2 and leave z unset; if str is `x+1 y=4', the match fails at the `+' and sscanf returns 0. In neither case will it hang, so you can print a useful message to the user.

To read the original string in, fgets() is a safer routine to use than gets(), since gets() has no way of checking whether the input line is too large for the buffer. fgets() has a complication of its own: it stores the newline character in the buffer when the line fits, and omits it when the line was too long for the buffer or the input ended without one. One must therefore make annoying provisions for ends of lines that are not necessary when input is treated as a continuous stream of characters.