Strings
In C a string is just an array of characters. The end of the string is denoted by a zero byte. The various string manipulation functions are described in the online manual page called `string', and declared in the string.h include file. The following piece of code illustrates their use and highlights some problems:
/* strings.c */
#include <stdio.h>
#include <string.h>

char str1[10]; /* This reserves space for 10 characters */
char str2[10];
char str3[]= "initial text"; /* str3 is set to the right size for you
                              * and automatically terminated with a 0
                              * byte. You can only initialise
                              * strings this way when defining. */
char *c_ptr; /* declares a pointer, but doesn't point it at a string. */
unsigned int len;

int main(void)
{
    /* copy "hello" into str1. If str1 isn't big enough, hard luck */
    strcpy(str1,"hello");
    /* if you looked at memory location str1 you'd see these byte
       values: 'h','e','l','l','o','\0' */

    /* concatenate " sir" onto str1. If str1 is too small, hard luck */
    strcat(str1," sir");
    /* values at str1 : 'h','e','l','l','o',' ','s','i','r','\0' */

    len = strlen(str1); /* find the number of characters */
    printf("Length of <%s> is %u characters\n", str1, len);

    /* strcmp returns 0 when the strings match */
    if (strcmp(str1, str3))
        printf("<%s> and <%s> are different\n", str1, str3);
    else
        printf("<%s> and <%s> are the same\n", str1, str3);

    if (strstr(str1, "boy") == (char*) NULL)
        printf("The string <boy> isn't in <%s>\n", str1);
    else
        printf("The string <boy> is in <%s>\n", str1);

    /* find the first `o' in str1 */
    c_ptr = strchr(str1,'o');
    if (c_ptr == (char*) NULL)
        printf("There is no o in <%s>\n", str1);
    else {
        printf("<%s> is from the first o in <%s> to the end.\n", c_ptr, str1);
        /* Now copy this part of str1 into str2 */
        strcpy(str2, c_ptr);
    }
    return 0;
}
Usually `str1' would be used instead of `&str1[0]' to refer to the address of the first element of the character array, since C defines the value of an array name to be the location of the first element. In fact, once you've set c_ptr to str1, the 2 variables behave similarly in most circumstances.
- There is not really any difference in the behaviour of the array subscripting operator [] as it applies to arrays and pointers. The expressions str1[i] and c_ptr[i] are both processed internally using pointers. For instance, str1[i] is equivalent to *((str1)+(i)).
- Array and pointer declarations are interchangeable as function formal parameters. Since arrays decay immediately into pointers, an array is never actually passed to a function. Therefore, any parameter declarations which `look like' arrays, e.g.
  int f(char a[]) { ... }
  are treated by the compiler as if they were pointers, so `char a[]' could be replaced by `char* a'. This conversion holds only within function formal parameter declarations, nowhere else. If this conversion bothers you, avoid it.
Because the distinction between pointers and arrays often doesn't seem to matter, programmers get surprised when it does. Arrays are not pointers. The array declaration `char str1[10];' requests that space for ten characters be set aside. The pointer declaration `char *c_ptr;' on the other hand, requests a place which holds a pointer. The pointer is to be known by the name c_ptr, and can point to any char (or contiguous array of chars) anywhere. str1 can't be changed: it's where the array begins and where it will always stay.
You can't pass whole arrays to functions, only pointers to them. To declare such pointers correctly you need to be aware of the different ways that multi-dimensional arrays can be stored in memory. Suppose you created a 2D array of characters as follows:-
char fruits[3][10] = {"apple", "banana", "orange"};
This creates space for 3 strings each 10 bytes long. Let's say that `fruits' gets stored at memory location 6000. Then this will be the layout in memory:
6000 a p p l e \0 . . . .
6010 b a n a n a \0 . . .
6020 o r a n g e \0 . . .
If you wanted to write a function that printed these strings out so you could do `list_names(fruits)', the following routine will work:
void list_names(char names[][10] ){
    int i;
    for (i=0; i<3; i++){
        printf("%s\n", names[i]);
    }
}
The routine has to be told the size of the things that names points to, otherwise it won't be able to calculate names[i] correctly. So the `10' needs to be provided in the declaration. It doesn't care about how many things are in the array, so the first pair of brackets might just as well be empty. An equivalent declaration is
void list_names(char (*names)[10])
saying that `names' is a pointer to an array of 10 chars, so that each step from names[i] to names[i+1] moves on by 10 bytes.
The above method of creating arrays wastes a lot of space if the strings differ greatly in length. An alternative way to initialise is as follows:-
char *veg[] = {"artichoke", "beetroot", "carrot"};
Here `veg' is set up as an array of pointer-to-chars. The layout in memory is different too. A possible layout is:-
Address  Value
6000     9000
6004     9600
6008     9700
...
9000     a r t i c h o k e \0
9600     b e e t r o o t \0
9700     c a r r o t \0
Note that `veg' is the start of an array of pointers. The actual characters are stored elsewhere. If we wanted a function that would print out these strings, then the `list_names()' routine above wouldn't do, since this time the argument `names' wouldn't be pointing to things that are 10 bytes long, but to things that are 4 bytes long (the size of a pointer-to-char in this layout; on most 64-bit systems it is 8). The declaration needs to say that `names' points to a character pointer.
void list_names(char **names){
    int i;
    for (i=0; i<3; i++){
        printf("%s\n", names[i]);
    }
}
The following declaration would also work:-
void list_names(char *names[]){
Using cdecl (see online) will help clarify the above declarations.
The program below shows the 2 types of array in action. The functions to print the names out are like the above except that
- The arrays are endstopped so that the functions needn't know beforehand how many elements are in the arrays.
- The for loop uses some common contractions.
#include <stdio.h>
#include <stdlib.h>

/* stops at the first row that starts with a 0 byte (the "" row) */
void list_names(char (*names)[10] ){
    for (; names[0][0]; names++){
        printf("%s\n", *names);
    }
}

/* stops at the NULL pointer that ends the array */
void list_names2(char *names[] ){
    for (; *names!=NULL; names++){
        printf("%s\n",*names);
    }
}

int main(int argc, char *argv[]){
    char fruits[4][10] = {"apple", "banana", "orange", ""};
    char *veg[] = {"artichoke", "beetroot", "carrot", (char*) NULL};

    list_names(fruits);
    list_names2(veg);
    exit(0);
}