Next: Exercises 2 Up: ANSI C for Programmers Previous: Pointers   Contents

Strings

In C a string is just an array of characters, with the end of the string marked by a zero byte. The various string manipulation functions are described in the online manual page called `string', and declared in the string.h include file. The following piece of code illustrates their use and highlights some problems:
/* strings.c */
#include <stdio.h>
#include <string.h>

char str1[10]; /* This reserves space for 10 characters */ 
char str2[10];
char str3[]= "initial text"; /* str3 is set to the right size for you 
                              * and automatically terminated with a 0
                              * byte. You can only initialise
                              * strings this way when defining. 
                              */
char *c_ptr;   /* declares a pointer, but doesn't initialise it. */

unsigned int len;

int main(void)
{
 /* copy "hello" into str1. If str1 isn't big enough, hard luck */ 
 strcpy(str1,"hello");
 /* if you looked at memory location str1 you'd see these byte
    values:  'h','e','l','l','o','\0'
  */

 /* concatenate " sir" onto str1. If str1 is too small, hard luck */ 
 strcat(str1," sir");
 /* values at str1 :  'h','e','l','l','o',' ','s','i','r','\0'
  */

 len = strlen(str1);  /* find the number of characters */
 printf("Length of <%s> is %u characters\n", str1, len);
 
 if(strcmp(str1, str3))
    printf("<%s> and <%s> are different\n", str1, str3);
 else
    printf("<%s> and <%s> are the same\n", str1, str3);

 if (strstr(str1, "boy") == (char*) NULL)
    printf("The string <boy> isn't in <%s>\n", str1);
 else
    printf("The string <boy> is in <%s>\n", str1);

 /* find the first `o' in str1 */
 c_ptr = strchr(str1,'o');

 if (c_ptr == (char*) NULL)
   printf("There is no o in <%s>\n", str1);
 else{
   printf("<%s> is from the first o in <%s> to the end.\n",
                c_ptr, str1); 
   /* Now copy this part of str1 into str2 */
   strcpy(str2, c_ptr);
 }
 return 0;
}
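The `hard luck' comments in the listing deserve to be taken seriously: strcpy and strcat never check the size of the destination, so writing past the end of str1 silently corrupts whatever follows it. The sketch below shows one way to guard against that. The helpers `safe_copy' and `safe_append' are hypothetical names invented for this example, not part of string.h; they simply refuse to write when the result would not fit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for
 * dst_size bytes. Returns 1 on success, or 0 if src plus its
 * terminating zero byte would not fit (dst is then left untouched). */
int safe_copy(char *dst, size_t dst_size, const char *src)
{
    size_t need = strlen(src) + 1;   /* +1 for the '\0' */
    if (need > dst_size)
        return 0;
    memcpy(dst, src, need);
    return 1;
}

/* Hypothetical helper: append src to the string already held in dst.
 * Assumes dst currently holds a properly terminated string.
 * Returns 1 on success, 0 if the result would not fit. */
int safe_append(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);       /* characters already in dst */
    size_t need = strlen(src) + 1;   /* +1 for the '\0' */
    if (used >= dst_size || need > dst_size - used)
        return 0;
    memcpy(dst + used, src, need);
    return 1;
}
```

With `char str1[10];', the call `safe_copy(str1, sizeof str1, "hello")' succeeds and so does appending " sir", which fills the array exactly. Any further append fails and leaves str1 unchanged.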
Usually `str1' would be used instead of `&str1[0]' to refer to the address of the first element of the character array, since in most expressions C treats an array name as the location of its first element. In fact, once you've set c_ptr to str1, the 2 variables behave similarly in most circumstances.

Because the distinction between pointers and arrays often doesn't seem to matter, programmers get surprised when it does. Arrays are not pointers. The array declaration `char str1[10];' requests that space for ten characters be set aside. The pointer declaration `char *c_ptr;' on the other hand, requests a place which holds a pointer. The pointer is to be known by the name c_ptr, and can point to any char (or contiguous array of chars) anywhere. The name str1, by contrast, can't be made to refer to anywhere else: it's where the array begins and where it will always stay.
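A short sketch may make the difference concrete. The function names here are made up for illustration. Applied to the array, sizeof gives the size of the whole array; applied to the pointer, it gives only the size of the pointer itself. The pointer can also be moved, whereas the array name cannot be assigned to.

```c
#include <stddef.h>
#include <string.h>

char str1[10];      /* storage for ten chars */
char *c_ptr;        /* storage for one pointer */

/* Size of the array object: always 10, whatever it contains. */
size_t array_size(void)   { return sizeof str1; }

/* Size of the pointer object: typically 4 or 8 bytes,
 * depending on the machine. */
size_t pointer_size(void) { return sizeof c_ptr; }

/* Returns 1 if the array name and &str1[0] give the same address. */
int name_is_first_element(void)
{
    return str1 == &str1[0];
}

/* The pointer may be re-aimed; "str1 = ..." or "str1++" would not
 * compile. Returns a pointer to the second character of str1. */
const char *rest_after_first(void)
{
    strcpy(str1, "hello");
    c_ptr = str1;       /* c_ptr now points at 'h' */
    c_ptr++;            /* legal: c_ptr now points at 'e' */
    return c_ptr;
}
```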

You can't pass whole arrays to functions, only pointers to them. To declare such pointers correctly you need to be aware of the different ways that multi-dimensional arrays can be stored in memory. Suppose you created a 2D array of characters as follows:-

char fruits[3][10] = {"apple", "banana", "orange"};
This creates space for 3 strings each 10 bytes long. Let's say that `fruits' gets stored at memory location 6000. Then this will be the layout in memory (a `.' marks a byte not named in the initialiser, which C sets to zero):
      6000 a  p  p  l  e \0 .  .  .  .
      6010 b  a  n  a  n  a \0 .  .  . 
      6020 o  r  a  n  g  e \0 .  .  .

If you wanted to write a function that printed these strings out, so that you could call `list_names(fruits)', the following routine would work:

void list_names(char names[][10] ){
  int i;
  for (i=0; i<3; i++){
    printf("%s\n", names[i]);
  }
}
The routine has to be told the size of the things that names points to, otherwise it won't be able to calculate names[i] correctly. So the `10' needs to be provided in the declaration. It doesn't care about how many things are in the array, so the first pair of brackets might just as well be empty. An equivalent declaration is
void list_names(char (*names)[10])
saying that `names' is a pointer to an array of 10 chars.
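To see why the `10' matters, here is a small sketch (the function names `row_stride' and `first_letter' are invented for this example). Because the compiler knows that each thing `names' points to is 10 bytes long, adding 1 to `names' moves it on by exactly one row.

```c
#include <stddef.h>

/* Number of bytes between one row and the next, measured by
 * stepping the pointer-to-array on by one element. */
size_t row_stride(char (*names)[10])
{
    return (size_t)((char *)(names + 1) - (char *)names);
}

/* First character of the i-th string. names[i] is located at
 * i * 10 bytes from the start, which is why the 10 is needed. */
char first_letter(char names[][10], int i)
{
    return names[i][0];
}
```

Given the `fruits' array above, `row_stride(fruits)' yields 10 and `first_letter(fruits, 1)' yields 'b'.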

The above method of creating arrays wastes a lot of space if the strings differ greatly in length. An alternative way to initialise is as follows:-

char *veg[] =  {"artichoke", "beetroot", "carrot"};
Here `veg' is set up as an array of pointer-to-chars. The layout in memory is different too. A possible layout is:-
   Address Value
      6000 9000
      6004 9600
      6008 9700
      ...  

      9000  a  r  t  i  c  h  o  k  e \0
      9600  b  e  e  t  r  o  o  t  \0
      9700  c  a  r  r  o  t  \0
Note that `veg' is the start of an array of pointers. The actual characters are stored elsewhere. If we wanted a function that would print out these strings, then the `list_names()' routine above wouldn't do, since this time the argument `names' wouldn't be pointing to things that are 10 bytes long, but to things the size of a pointer-to-char (4 bytes in the layout shown, though this varies between machines). The declaration needs to say that `names' points to a character pointer.
void list_names(char **names){
  int i;
  for (i=0; i<3; i++){
    printf("%s\n", names[i]);
  }
}
The following declaration would also work:-
void list_names(char *names[]){
Using cdecl (described elsewhere in this document) will help clarify the above declarations.
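Both versions of `list_names()' so far hard-code the number of strings as 3. One way round this, sketched below, is to pass the count as an extra argument. The names `list_n_names' and `total_length' are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Print the first count strings, one per line. */
void list_n_names(char **names, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        printf("%s\n", names[i]);
}

/* Total number of characters in all the strings, not counting
 * the terminating zero bytes. */
size_t total_length(char *names[], size_t count)
{
    size_t i, total = 0;
    for (i = 0; i < count; i++)
        total += strlen(names[i]);
    return total;
}
```

Where `veg' is declared as an array, the caller can work out the count as `sizeof veg / sizeof veg[0]'. This doesn't work inside a function that has received `veg' as an argument, because there it is only a pointer.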

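The program below shows the 2 types of array in action. The functions to print the names out are like the above except that the arrays are terminated by a marker (an empty string in one case, a NULL pointer in the other), so the functions needn't know beforehand how many elements there are, and the loops step the pointer itself along rather than using an index.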

#include <stdio.h>
#include <stdlib.h>

void list_names(char (*names)[10] ){
  for (; names[0][0]; names++){
    printf("%s\n", *names);
  }
}

void list_names2(char *names[] ){
  for (; *names!=NULL; names++){
    printf("%s\n",*names);
  }
}

int main(int argc, char *argv[]){
  char fruits[4][10] = {"apple", "banana", "orange", ""};
  char *veg[] =  {"artichoke", "beetroot", "carrot", (char*) NULL};

  list_names(fruits);
  list_names2(veg);
  exit(0);
}


Tim Love 2010-04-27