Department of Engineering

IT Services

C++ Notes for IIA Students

This document starts where 1B C++ teaching ends and illustrates some extra C++ features which can prove useful to local students when trying the GF2 Software project on the IIA year. Last year all the teams used pointers and references, several of the teams used Exceptions and C++ strings, and they all developed classes (though none needed copy constructors). Below, a mixture of fragments and complete programs is included. You'll find it useful to compile and run the programs - they're designed to be run rather than read.

Pointers and References

Pointers and references are both used to deal with the same issue, and they both use the & symbol. This causes confusion. The general rule is to use references rather than pointers unless there's no alternative, but you'll see pointers used in code so even if you don't use them you'll need to understand them. Try running this program -

// Program 1
#include <iostream>
using namespace std;

int main()
{
  int i=3;
  int *pointer_to_integer;   // this is a pointer
  pointer_to_integer=&i;
  cout << "i is stored at memory location " << &i << endl;
  cout << "the value of i is " << *pointer_to_integer << endl;
}

Note that * and & in this context are complementary - given a variable, using "&" finds its location, and if we have a pointer to a variable (i.e. we know the variable's location), then using "*" will give us the variable's value.

Pointers are useful if we want to "call by reference" (rather than "call by value"). Suppose we want to write a function that will triple the value of a given variable. We could try the following

// Program 2
#include <iostream>
using namespace std;

void triple(int i)
{
  i=i*3;
  cout << "in triple, i is " << i << endl;
  cout << "In triple, i is stored at memory location " << &i << endl;
}

int main()
{
  int i=3;
  cout << "In main, i is " << i << endl;
  triple(i);
  cout << "In main, i is now " << i << endl;
  cout << "In main, i is stored at memory location " << &i << endl;
}

But it doesn't work as we wanted. The problem is that triple is never told where main's i is stored so it can't change its value. The i in triple is a different i which only exists in triple. It's stored in a different place to main's i. That the 2 variables have the same name is a coincidence. The only link between the 2 variables is that when triple is called, triple's i gets its initial value from main's i - i is being passed "by value".

If we give triple a pointer to i so that triple knows the location of i, then it can change the value.

// Program 3
#include <iostream>
using namespace std;

// the next line tells "triple" to expect a pointer to an integer
void triple(int *i)   // this line has changed.
{
  *i=*i*3;            // this line has changed
}

int main()
{
  int i=3;
  cout << "i is " << i << endl;
  triple(&i);         // this line has changed too
  cout << "i is now " << i << endl;
}

But this code is becoming cluttered with * and & symbols. C++ has a way to do the same thing with less clutter.

// Program 4
#include <iostream>
using namespace std;

// the next line of code tells "triple" that i is being passed
// by reference.
void triple(int& i)
{
  i=i*3;   // here i is an alias for main's i
}

int main()
{
  int i=3;
  cout << "i is " << i << endl;
  triple(i);
  cout << "i is now " << i << endl;
}

When the compiler sees

triple(int& i)

it knows that i is being passed "by reference" and does all the extra work behind the scenes. Programs 3 and 4 do the same thing. References are preferable to pointers because they lead to tidier code and they're safer. A line like

int *pointer_to_integer;

creates a pointer but doesn't point it to anything. If something like

*pointer_to_integer=3;

is done before the pointer is set to a useful memory location, disaster ensues. In contrast, a reference has to be attached to a variable when it's created, so it's much harder to end up with one that refers to nothing useful.
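The difference between the two styles can be seen in a small sketch. The function names below (swap_by_pointer and swap_by_reference) are made up for illustration. The pointer version has to allow for being handed a pointer that doesn't point anywhere; the reference version has no such case to check.

```cpp
#include <cstddef>   // for NULL

// Pointer version: the caller passes addresses, which have to be checked.
// Returns false (and changes nothing) if either pointer is NULL.
bool swap_by_pointer(int *a, int *b)
{
  if (a == NULL || b == NULL)
    return false;
  int tmp = *a;
  *a = *b;
  *b = tmp;
  return true;
}

// Reference version: a and b are aliases for the caller's variables.
void swap_by_reference(int& a, int& b)
{
  int tmp = a;
  a = b;
  b = tmp;
}
```

Given two ints x and y, the calls would be swap_by_pointer(&x, &y) and swap_by_reference(x, y). As with Programs 3 and 4, the reference version needs no extra symbols at the call.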

Strings

In the 1B course students use arrays of characters to contain text. C++ has an alternative called strings. The general rule is to use strings unless you have no choice. Here's a simple example

// Program 5
#include <iostream>
#include <string>
using namespace std;

int main()
{
  string s;
  s="hello";
  s=s+" world";
  cout << s << endl;
}

Note that the string header file needs to be included. I think that this code is as short and understandable as one could reasonably hope for. Note that the string is "elastic" - it grows as required. Compare this code with the old character-array method of C (though it's also legal in C++).

// Program 6
#include <iostream>
#include <cstring>
using namespace std;

int main()
{
  char s[10];
  strcpy(s,"hello");
  strcat(s," world");
  cout << s << endl;
}

This code does the same as Program 5 but is less readable (strcat isn't a memorable name) and contains a bug - the s array isn't elastic. It has room for only 10 characters, whereas "hello world" needs 12 (11 characters plus the terminating '\0'). Here we're writing off the end of the array, which could have disastrous results. So C++'s more recent features are not only easier to use than the old methods, they're safer too! You can convert between C and C++ strings -

char cstring[10];
strcpy(cstring,"a test");
string a_str;
a_str=string(cstring);
strcpy(cstring, a_str.c_str());

You can also write to strings in the same way that you write to the screen or a file. Try running this

// Program 7
#include <sstream>
#include <string>
#include <iostream>
using namespace std;

int main()
{
  //stringstream s;
  string t;
  for (int i=1;i<20;i++){
    stringstream s;
    s << "I" << i;
    s >> t;
    cout << t << endl;
  }
}
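The same idea can be wrapped up in small helper functions. The sketch below uses ostringstream (a stream that can only be written to) and istringstream (one that can only be read from). The names make_label and parse_int are invented for this example and aren't library functions.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Build a label such as "I7" from a prefix and a number.
string make_label(const string& prefix, int n)
{
  ostringstream out;     // a stream that writes into a string
  out << prefix << n;
  return out.str();      // the text written so far
}

// Read an integer from the start of a string.
// Returns false if the text doesn't begin with a number.
bool parse_int(const string& text, int& value)
{
  istringstream in(text);
  in >> value;
  return !in.fail();
}
```

For example, make_label("I", 7) gives the string "I7", and parse_int("42", n) puts 42 into n and returns true.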

There's one trap to avoid when using C++ strings - you can use [ ... ] to access elements like you can with arrays, but you shouldn't access (for reading or writing) elements that don't yet exist. For example

#include <string>
#include <iostream>
using namespace std;

int main()
{
  string s;
  s[0]='c';
  cout<< s << endl;
}

might not display anything (and might crash) but

#include <string>
#include <iostream>
using namespace std;

int main()
{
  string s;
  s="hello";
  cout<< s << endl;
  s[0]='c';
  cout<< s << endl;
}

works ok.
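One way to stay out of this trap is to check the string's size before using [ ... ], or to use the at() member function, which checks the index and throws an out_of_range exception when it's too big (exceptions are described later in this document). The following is only a sketch; set_first_char and char_at_or are made-up helper names.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

// Replace the first character of s, but only if s has one.
// Returns true if the change was made.
bool set_first_char(string& s, char c)
{
  if (s.empty())
    return false;      // nothing to overwrite
  s[0] = c;
  return true;
}

// Return the character at position i, or fallback if there's no such position.
// at() throws out_of_range rather than touching memory outside the string.
char char_at_or(const string& s, string::size_type i, char fallback)
{
  try {
    return s.at(i);
  }
  catch (const out_of_range&) {
    return fallback;
  }
}
```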

Standard Library

C++ has a library of data structures (lists, vectors, etc) and about 40 algorithms to operate on those structures (sort, etc). In the latest Deitel and Deitel C++ textbook, arrays and vectors are introduced in the same chapter. vectors are no harder to use than arrays, and offer several advantages. Here's a little example

#include <vector>      // needed for vector
#include <algorithm>   // needed for reverse
using namespace std;

int main()
{
  vector<int> v(3);    // Declare a vector of 3 ints
  v[0] = 7;
  v[1] = v[0] + 3;
  v[2] = v[0] + v[1];
  reverse(v.begin(), v.end());
}
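One advantage of a vector over an array is that, like a string, it can grow, and it knows its own size. A minimal sketch (the function names squares_up_to and total are invented for this example):

```cpp
#include <vector>
using namespace std;

// Return the squares 1, 4, 9, ... up to n*n.
vector<int> squares_up_to(int n)
{
  vector<int> v;              // starts empty
  for (int i = 1; i <= n; i++)
    v.push_back(i * i);       // the vector grows as needed
  return v;
}

// Add up the elements, using size() rather than a separate length variable.
int total(const vector<int>& v)
{
  int sum = 0;
  for (vector<int>::size_type i = 0; i < v.size(); i++)
    sum += v[i];
  return sum;
}
```

Note that total takes the vector by reference (to avoid copying it) and marks it const (because it doesn't change it).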

More examples are online. Once you've managed to use one algorithm on a data structure you'll find that other algorithms and data structures are similar to use. Here's a simple extension of an earlier program using the standard library's sort. A string of characters is being sorted, but sorting a list of numbers or strings can be done in a very similar way.

// Program 8
#include <iostream>
#include <string>
#include <algorithm>   // needed for sort
using namespace std;

int main()
{
  string s;
  s="hello";
  s=s+" world";
  cout << s << endl;
  sort(s.begin(),s.end());
  cout << s << endl;
}

It's worth using C++'s off-the-shelf facilities whenever you can.
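To illustrate the claim that other data structures are sorted in much the same way, here is a sketch that sorts a vector of strings and a vector of numbers. The function names sorted_copy and sorted_descending are made up for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// names is passed by value, so the caller's vector is left alone
// and a sorted copy is returned.
vector<string> sorted_copy(vector<string> names)
{
  sort(names.begin(), names.end());
  return names;
}

// Sort into increasing order, then reverse to get decreasing order.
vector<int> sorted_descending(vector<int> numbers)
{
  sort(numbers.begin(), numbers.end());
  reverse(numbers.begin(), numbers.end());
  return numbers;
}
```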

Getting command line arguments

When a program is called from the Unix command line its name is sometimes followed by strings. For example

g++ -v cabbage.cc

calls the g++ program with an argument and a filename. There needs to be a way for g++ to get hold of these arguments and filenames. Also it's useful for g++ to be able to pass back to Unix a return value. There's a standard method for this. The first function called when a C++ program is run is always

int main(int argc, char *argv[])

argc is how many strings were on the command line. argv[0] is the first string (which will be the program name) and the other strings (if any) are argv[1], etc. When main returns an integer, this is passed back to the calling process. To see how this works in practice, compile the following to produce a program called test1

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
  cout << "The command line strings are -\n";
  for (int i=0;i<argc;i++)
    cout << argv[i] << endl;
  return argc;
}

Typing

./test1 -v foo
echo $?

should print out the strings, then print "3" - the value returned to the command line process.

The argument strings arrive into the program as character arrays. If you prefer dealing with C++ strings you can use the Standard Library to convert them into a vector (called args in this example) of C++ strings, using some convenient constructors

#include <vector>
#include <string>
using namespace std;

int main(int argc, char* argv[])
{
  vector<string> args (argv, argv+argc);
}
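Once the arguments are in a vector of strings, the Standard Library's algorithms can be used on them. The sketch below looks for an option such as -v; the function names arguments_after_name and has_flag are invented for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Convert argc/argv into C++ strings, leaving out the program name.
vector<string> arguments_after_name(int argc, char *argv[])
{
  if (argc < 1)
    return vector<string>();   // defensive: nothing to copy
  return vector<string>(argv + 1, argv + argc);
}

// True if flag (for example "-v") appears among the arguments.
bool has_flag(const vector<string>& args, const string& flag)
{
  return find(args.begin(), args.end(), flag) != args.end();
}
```

Inside main this might be used as vector<string> args = arguments_after_name(argc, argv); followed by if (has_flag(args, "-v")) ...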

Understanding code

When trying to understand source code written by others, it helps to look at the include files first to identify the main classes and data structures. Don't try to understand each line of each function - with luck, the function names and comments will tell you enough. Try first to follow the code in top-down fashion, starting at the main routine.

In Unix, grep is a useful command when you have many files. If you're looking for where a function (add_device for example) is mentioned, you can do

     grep add_device *.h *.cc

to print out all the lines in the source code that mention add_device.

C++ code can be very compact. The following line for example is doing quite a lot of work.

eofile = (inf.get(ch)==0);

Here, the inf variable is of type ifstream. Its member function "get" gets the next character from the input filestream and puts it in the ch variable. Strictly speaking, "get" returns the stream itself rather than a number, but with older (pre-C++11) compilers a stream that has failed to read (for example because the end of the file has been reached) compares equal to 0. In that case the expression "(inf.get(ch)==0)" has the value true, so eofile (a variable being used to indicate whether the end of the file's been reached) is set to true. With a C++11 or later compiler the comparison with 0 no longer compiles, and the line would be written as eofile = !inf.get(ch); instead. Another way to write all this is

if (inf.get(ch)==0)
  eofile = true;
else
  eofile = false;
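In newly written code the same test is usually put directly into a loop condition, where a stream counts as true until a read fails. The following sketch counts the characters in a stream; count_characters is a made-up name, and the function takes an istream so that it works with an ifstream or (handy for testing) an istringstream.

```cpp
#include <istream>
#include <sstream>
using namespace std;

// Count the characters in a stream, one get() at a time.
int count_characters(istream& in)
{
  char ch;
  int n = 0;
  while (in.get(ch))   // the stream tests as false once a read fails
    n++;
  return n;
}
```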

Code may use C++ features you've not seen before. For example, the following construction isn't often taught to first years.

sig = (target == low) ? 1 : 2;

This uses a "? ... :" construction. The RHS has the value 1 or 2 depending on whether target == low is true. It's equivalent to

if (target == low)
  sig=1;
else
  sig=2;

There's also a commonly used ploy in include files. When you write big programs you're likely to have several include files. Suppose you have 2 include files like this

// this is myfile.h
int globali;

and

// this is myfile2.h
#include "myfile.h"
int globalj;

Now suppose that in your main file you have

#include "myfile.h"
#include "myfile2.h"

The pre-processor (the first stage of compilation) expands #include directives etc - it's a sort of filter whose output is passed on to the next compilation stage. Typing

   g++ -E main.cc

will just run the pre-processor, letting you see what the compiler receives. In this case the pre-processor's output will be

 int globali;
 int globali;
 int globalj;

which is a bug - you can't create the same variable twice. There's a standard way to guard against double inclusion. If the files are changed to become

// this is myfile.h
#ifndef MYFILE_H
#define MYFILE_H
int globali;
#endif

and

// this is myfile2.h
#ifndef MYFILE2_H
#define MYFILE2_H
#include "myfile.h"
int globalj;
#endif

then when main is processed, the preprocessor will reach #ifndef MYFILE_H (ifndef means "if not defined") so MYFILE_H will be defined as a value inside the preprocessor and int globali; will be let through. Then it will reach #ifndef MYFILE2_H and define MYFILE2_H. At this stage it will read myfile.h again but this time MYFILE_H is already defined, so the contents of myfile.h will be ignored, which solves the duplication problem. For each source file, each include file will be read at most once.

For this method to work, each include file must have a different "guard variable" name. By convention the name used is the filename in upper case with _H instead of .h.
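The effect of the guard can be seen without creating separate files by pasting the guarded contents of myfile.h twice into one source file, which is roughly what the pre-processor has to deal with when the file is included twice. This is only a sketch; read_globali is a made-up function added so that there's something to call.

```cpp
// First copy of myfile.h: MYFILE_H isn't defined yet,
// so everything up to #endif is kept.
#ifndef MYFILE_H
#define MYFILE_H
int globali;
#endif

// Second copy of myfile.h: MYFILE_H is now defined,
// so the pre-processor drops everything up to #endif.
#ifndef MYFILE_H
#define MYFILE_H
int globali;
#endif

// globali has been created exactly once, so this compiles.
int read_globali()
{
  return globali;
}
```

Deleting the #ifndef, #define and #endif lines brings back the duplicate-variable error described above.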

Exceptions

In C, return values of calls had to be checked for error values - which could double the code size. C++ exceptions are an alternative to traditional techniques. They're not always better than using return values, and can be over-used. Three keywords are involved

  • try - specifies an area of code where exceptions will operate
  • catch - deals with the exceptional situation produced in the previous try clause.
  • throw - causes an exceptional situation

When an exception is 'thrown' it will be 'caught' by the local "catch" clause if one exists, otherwise it will be passed up through the call hierarchy until a suitable "catch" clause is found. If no suitable "catch" clause is found anywhere, the program terminates. The example below demonstrates some features. Note that

  • A program can throw whatever's thought useful - a number, a message or an object. There are standard exception objects or you can use one of your own.
  • "catch" routines are like other C++ routines in that there can be several of them taking different arguments
  • Once the exception has been caught and dealt with, the program continues from just after the "catch" routine. If your "try" region surrounds the whole program this means that execution will end, but you can put "try{...}" around a localised region so that execution continues, as in the following example.

#include <iostream>
#include <string>
using namespace std;

class Ball
{
public:
  int number;
  string message;
};

int main()
{
  for (int i=1;i<4;i++) {
    try {
      switch (i) {
        case 1: throw 999;
        case 2: throw "help!";
        case 3: {
          Ball *ball = new Ball;
          ball->number = 999;
          ball->message = "help!";
          throw ball;
        }
      }
    } // end of try
    catch (int errornumber) {
      cerr << "error number is " << errornumber << endl;
    }
    catch (const char* errormessage) {
      cerr << "error message is " << errormessage << endl;
    }
    catch (Ball *b) {
      cerr << "error " << b->number << " - " << b->message << endl;
      delete b;   // the Ball was created with new, so free it here
    }
  } // end of for
}
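The standard exception objects mentioned above live in the <stdexcept> header. A common pattern is to throw one by value and catch it by const reference, which avoids the need for new and delete. The sketch below assumes two made-up functions, safe_divide and describe_division.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

// Throws a standard exception object if asked to divide by zero.
int safe_divide(int a, int b)
{
  if (b == 0)
    throw invalid_argument("division by zero");
  return a / b;
}

// Tries the division and reports what happened as a string.
string describe_division(int a, int b)
{
  try {
    safe_divide(a, b);
    return "ok";
  }
  catch (const exception& e) {   // matches invalid_argument and its relatives
    return string("error: ") + e.what();
  }
}
```

Here the exception is thrown in safe_divide, which has no "catch" clause, so it's passed up to the caller, describe_division, where it's caught.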

extern

If you have

   int speed=3;

in a file outside of all functions, then this variable can be accessed from other files as long as the other files have

   extern int speed;

speed is a global variable. Such variables are considered bad style (they're error-prone) though sometimes they're hard to avoid. There's a complication if extern is used with const. If you have

   const int speed=3;

the variable isn't visible from other files. Here's a table showing some examples of how the contents of 2 files can interact.

Example 1
  File 1:
     int i;
     int main()
     {
       i=5;
     }
  File 2:
     int i;
  Outcome: Fails (multiple definition of 'i') because when the 2 compiled files are linked together they each have an 'i' that is visible to the other.

Example 2
  File 1:
     extern int i;
     int main()
     {
       i=5;
     }
  File 2:
     int j;
  Outcome: Fails (undefined reference to 'i') because the first file expects an 'i' variable to be available from another file.

Example 3
  File 1:
     static int i;
     int main()
     {
       i=5;
     }
  File 2:
     int i;
  Outcome: Compiles. File 1's 'i' is private.

Example 4
  File 1:
     const int i=5;
     int main()
     {
       ;
     }
  File 2:
     int i;
  Outcome: Compiles. File 1's 'i' is private (const vars are static by default).

Example 5
  File 1:
     extern const int i=5;
     int main()
     {
       ;
     }
  File 2:
     int i;
  Outcome: Fails (multiple definition of 'i') because when the 2 compiled files are linked together they each have an 'i' that is visible to the other. Why? Because if a variable is declared as extern and it's initialised, then memory for that variable will be allocated. So in this situation there's an 'i' in each file.

Example 6
  File 1:
     extern const int i=5;
     int main()
     {
       ;
     }
  File 2:
     extern const int i;
  Outcome: Compiles. There's only one 'i' variable in the resulting program - the 'i' in file 2 refers to the 'i' in file 1. If the line in file 2 was extern const int i=2;, linking would fail.

Classes

In C++ you can create ints and floats, etc, but you can also invent more complicated types of things. For example, if your program deals with people, you might want to create objects designed to store information about people. Here's a simple example

class person
{
public:
  float height;
  string name;
};

This piece of code doesn't create a person object, but makes it possible to create one. Just as you can create an integer by doing "int i;" so you can now create a person by doing

person p;

Once you've created a person you can then fill in the details. E.g.

p.height=1.73;
p.name="simon";

As well as having values like height, etc a person can also have actions. For example

class person
{
public:
  float height;
  string name;
  void sayhello() { cout << "hello!\n"; };
};

gives each person an extra ability which can be called by doing

person p;
p.sayhello();

person is an example of a Class, and p is an object of type person. Whenever an object is created, a special action (called a "constructor") is run. If you don't write one yourself, a default one is called. The constructor function has the same name as the class itself so if we want to write our own we could say

class person
{
public:
  float height;
  string name;
  void sayhello() { cout << "hello!\n"; };
  person() { sayhello(); cout << "I've just been created\n";};
};

So now,

person p;
person q;

would produce the output

hello!
I've just been created
hello!
I've just been created

This constructor takes no arguments. We could also provide a constructor that takes one argument

person(string n) { name=n; sayhello(); cout << "I've just been created\n";};

which would give us the chance to name people as we create them by doing something like

person p("eve");

When an object is destroyed, a destructor function is called. For our object the destructor would be called ~person, but before we can show it in action we need to set up a situation where people die.

// Program 9
#include <iostream>
#include <string>
using namespace std;

class person
{
public:
  float height;
  string name;
  void sayhello() { cout << "hello! I'm " << name << "\n"; };
  person(string n) { name=n; sayhello(); cout << "I've just been created\n";};
  person() { name="NOBODY"; cout << "I've just been created\n";};
  ~person() { cout << name << " is about to die\n";};
};

void testfunction()
{
  person p("adam");
}

int main()
{
  person q("eve");
  testfunction();
}

This re-uses much of the earlier code (the original constructor now sets name to NOBODY for safety's sake). Here, a person is created in the main routine then testfunction is called. In testfunction another person is created, but the lifetime of that person is only as long as the lifetime of the testfunction routine. If you compile and run this you'll get

hello! I'm eve
I've just been created
hello! I'm adam
I've just been created
adam is about to die
eve is about to die

It might be useful to know how many persons exist at any particular moment. The variables like name and height are unique to each object but it's possible to create a single variable that all the persons can access. The syntax isn't simple, but the facility's useful. We can use this variable as a counter, adding 1 to it when a person is created and subtracting 1 when a person dies.

Here's the revised code -

// Program 10
#include <iostream>
#include <string>
using namespace std;

class person
{
  static int howmany;   // all persons share this one variable
public:
  float height;
  string name;
  void sayhello() { cout << "hello! I'm " << name << "\n"; };
  person(string n) {
    howmany++;
    name=n;
    sayhello();
    cout << "I've just been created. There are now " << howmany << " of us.\n";
  };
  person() {
    howmany++;
    name="NOBODY";
    cout << "I've just been created. There are now " << howmany << " of us.\n";
  };
  ~person() {
    howmany--;
    cout << name << " is about to die, leaving " << howmany << " of us\n";
  };
};

void testfunction()
{
  person p("adam");
}

int person::howmany=0;

int main()
{
  person q("eve");
  testfunction();
}

This may already look like quite a long and complicated program. On the plus side

  • Its use of classes is already more advanced than is required to cope with the demands of the IIA Software project
  • the longest routine is only 6 lines long
  • the "top level" code (in main and testfunction) hasn't required changing. You'll find that much of the work in C++ programs involves developing the objects so that the "top level" is uncluttered.

On the minus side, the counting functionality isn't finished - there are ways of creating objects that we haven't taken into account. To see this, add the following line to the end of main

person r=q;

If you compile and run this you get additional output

eve is about to die, leaving -1 of us

The death of the "eve" clone has been registered, but not the clone's creation. When a new person is created by copying an existing one, it uses a different constructor function called the "copy constructor". If you add the following copy constructor to the person class, things will be better

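person(const person& other) {
  height=other.height;   // a hand-written copy constructor has to
  name=other.name;       // copy the original's details itself ...
  howmany++;             // ... as well as counting the new person
  sayhello();
  cout << "I've just been created. There are now " << howmany << " of us.\n";
}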
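To check that the count now behaves, here is a cut-down sketch of the class with the messages removed and a way to read the counter from outside. The static member function count() is an addition made for this example and isn't part of Program 10. Note that the copy constructor has to copy the data members itself as well as incrementing the counter.

```cpp
#include <string>
using namespace std;

class person
{
  static int howmany;   // all persons share this one variable
public:
  float height;
  string name;

  person(string n) : height(0), name(n) { howmany++; }
  person() : height(0), name("NOBODY") { howmany++; }

  // Copy constructor: copy the details AND count the new object.
  person(const person& other) : height(other.height), name(other.name)
  {
    howmany++;
  }

  ~person() { howmany--; }

  // Made-up accessor so that code outside the class can read the count.
  static int count() { return howmany; }
};

int person::howmany = 0;
```

With this version, creating person q("eve"); and then person r=q; takes person::count() to 2, and it returns to 0 once both have been destroyed.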

Extra items for local GF2 students

make

These notes are mostly for local GF2 students.

"make" is a unix program that helps with project management. It takes the chore out of recompiling multi-sourcefile programs. It depends on textfiles (by default called "Makefile") that describe the project. This short description of the Makefile's format might be sufficient for you to make minor adjustments to existing files. Makefiles have lines like

SRC = logsim.cc names.cc 

This creates a variable called "SRC" whose value (in this case "logsim.cc names.cc") can be obtained using $(SRC). It also has lines like

myprog: $(SRC)
	g++ -o myprog $(SRC)

This is saying that "myprog" depends on logsim.cc and names.cc. The second line (which has to begin with a TAB) shows how to create "myprog". If a user types "make myprog" while in the same folder as the "Makefile" file, then "make" will check to see if the source files exist and complain if they don't. It will then look at the dates of the files. If "myprog" is newer than the sourcefiles it won't do anything, otherwise it will update "myprog".

In lines like this that have a colon, the second line isn't always required. For example,

logsim.cc: logsim.h

is showing that logsim.cc depends on logsim.h. Also the part after the colon isn't always required. With the following in the Makefile

clean: 
	rm -f *.o myprog

typing "make clean" will remove the named files. Here, the target "clean" isn't the name of a file and doesn't depend on anything, so the action line will always be run.

The pay-off comes when the number of files grows. Here's a complete Makefile. It describes a situation where there are 6 source files, 2 of which include the same header file.

OBJECTS = one.o two.o three.o four.o five.o six.o 
.SUFFIXES:	.o .cc

.cc.o :
	g++ -g -c -o $@ $<

myprog: $(OBJECTS)
	g++ -o myprog $(OBJECTS)

one.o: header.h
three.o: header.h

clean: 
	rm -f *.o myprog

This has some extra lines to say how the *.o files should be generated from the *.cc files (don't worry about the cryptic details). Typing "make myprog" will produce (if necessary) an updated executable. Only "one.cc" and "three.cc" use "header.h", so if "header.h" is changed and "make myprog" typed, the minimum of work is done - only 2 of the source files will be recompiled if no other changes have been made since the last recompilation. This will save you a lot of time and typing.

Linked lists

These notes are mostly for local GF2 students.

Each device in the circuits used in the GF2 project is described in a structure - a block of variables. Beforehand we won't know how many devices there will be, so creating a fixed-size array of structures might be a problem with big circuits. In a linked list the items form a chain that can be extended until memory runs out. The Wikipedia linked lists page shows the basic idea. In C++, linked lists are implemented using pointers - each item has a field (often called "next") that "points to" the next one. The final item's "next" field is often set to a special value (0 or NULL).

To visit all the items in a linked list, you start at the beginning, following the pointers until a pointer is NULL. The code will look something like this

devlink d=first_item;

// while you're not at the end of the list ...
while (d != NULL) {
  // look at the item being pointed to 
  // ...
  // then get ready to visit the next item     
  d = d->next;
}

You'll find several animations of linked-lists online - see for example Linked list.
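A minimal, self-contained sketch of such a list follows. The device structure here is hypothetical (the structure used in the project has more fields), as are the function names add_to_front, count_devices and delete_all; only the devlink and next names are taken from the fragment above.

```cpp
#include <cstddef>   // for NULL
#include <string>
using namespace std;

// A hypothetical device record with just a name and a link.
struct device {
  string name;
  device *next;      // the next item in the chain, or NULL at the end
};
typedef device *devlink;

// Add a new device at the front of the list and return the new first item.
devlink add_to_front(devlink first_item, const string& name)
{
  devlink d = new device;
  d->name = name;
  d->next = first_item;
  return d;
}

// Walk the chain, counting items until the NULL marker is reached.
int count_devices(devlink first_item)
{
  int n = 0;
  for (devlink d = first_item; d != NULL; d = d->next)
    n++;
  return n;
}

// Give the memory back. The next pointer is saved before each delete.
void delete_all(devlink first_item)
{
  while (first_item != NULL) {
    devlink following = first_item->next;
    delete first_item;
    first_item = following;
  }
}
```

An empty list is just a devlink set to NULL. Each call of the form first_item = add_to_front(first_item, "some name"); extends the chain by one item.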

Exercises

  • There's some duplicated code in the person-based code above which (following the 3F6 notes) is a bad thing. Try to factorise the code.
  • Write a program with a function that takes a person as an argument. First pass by value then pass by reference, noting how the count changes.
  • Sometimes it's useful to create "singleton" classes - classes where only one object of that type can exist. Adapt the code so that person becomes a singleton class.