
Debugging

Introduction

Code Complete notes that "the industry average for code production is 8-20 lines of correct code per day". It's not uncommon for over 50% of programming time to be spent on debugging, and yet the same book says that "experience suggests that there are 15-50 errors per 1000 lines of delivered code", and some expensive bugs have been documented. Testers are now important enough to be credited on computer projects (especially computer games).

This document looks at various techniques and utilities (particularly ddd on the Teaching System) that aid debugging. Note that debuggers like ddd can only be used if your program already compiles.

Bug avoidance

The bigger a program becomes, the bigger the benefits of early bug detection. Pessimistic programming and in-built testing procedures save time in the end. Don't leave testing to the last moment - test each component as you go along. Dealing with a sequence of independent bugs is much easier than fixing inter-related bugs.
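As a minimal sketch of pessimistic programming with an in-built test, the function below checks its own assumptions with assert and is paired with a small self-test that can be re-run whenever the code changes. The names mean_of and self_test_mean_of are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: mean of the first count elements of values.
// The asserts document (and check) what the function assumes about its input.
double mean_of(const int values[], std::size_t count)
{
    assert(values != nullptr);   // pessimistic: don't trust the caller
    assert(count > 0);           // avoid dividing by zero

    long total = 0;              // initialised before it is used
    for (std::size_t i = 0; i < count; i++)
        total += values[i];
    return static_cast<double>(total) / count;
}

// In-built test: returns true if mean_of gives the expected answers.
bool self_test_mean_of()
{
    const int a[] = {1, 2, 3};
    const int b[] = {4};
    const int c[] = {1, 2};
    return mean_of(a, 3) == 2.0
        && mean_of(b, 1) == 4.0
        && mean_of(c, 2) == 1.5;
}
```

Calling self_test_mean_of() when the program starts, or from a separate test program, is one way to test each component as you go along. Compiling with -DNDEBUG switches the assert checks off.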

Take advantage of what your computing language offers. More recent versions of Fortran and C/C++ offer facilities designed to replace older, more bug-prone methods, and compilers have options that can warn you about legal though suspicious aspects of your code. Read your compiler's documentation.

General approaches

Each language has its own commonly made errors. See the language sections of this document for details. Some general points are

C++ - an example

The following program (called average.cc) was meant to print out the mean of 1, 2, and 3.
  #include <iostream>
  using namespace std;

  int average(int num[])
  {
    int total;
    for (int i=0;i<=3; i++)
      total=total+num[i];
    cout << "Average is " << total/3 << endl;
  }

  int main()
  {
    int n[3]={1, 2, 3};
    average(n);
  }
When I run this program I get
Average is -403085996
Staring at the code long enough might help you detect the bug, as might adding some cerr statements. Here I'll show how to use the debugger we have on the Teaching System (it's called ddd - other debuggers behave similarly).

ddd

ddd is a debugger for C, C++ and Fortran code that gives you control over execution of a program, allowing you to step through a program line by line and check the values of variables. The ddd commands are the same whichever of the languages the suspect program is written in. Before you can use ddd on a program, you have to compile with a `-g' flag.
 g++ -g -o average average.cc
Then type ddd average. A window will appear showing the source code and some menus. We are going to run the program a line at a time. If you press with the right-button on the word main, you'll get a menu.
Select the "Break at mai[n]" option. This will make execution stop when it reaches the main routine. Then click on "Run" in the DDD floating panel.
A green arrow will appear, showing which line is about to run. Execution has stopped at the breakpoint we set. Click on "Step" in the DDD floating panel.
Notice how the arrow moves onto the average(n) line. Hovering the pointer over the n of this line will display n's current value. "Step" again to go into the average routine.
We're interested in the value of total and i. If you click on each of these with the right button and pick the Display option you'll be shown the value of these variables in the top panel. When you continue Stepping through the code you'll see the arrow going around the loop and the variables' values changing. You'll be able to see when total's value goes wrong (i goes from 0 to 3 inclusive, so 4 elements are added together, though only 3 were set).
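Once the faults are found (total is never given a starting value, and the loop reads num[3], which is outside the array), one possible corrected version of the function is sketched below. Passing the element count as a parameter and returning the result are additions made for this sketch, not part of the original program.

```cpp
#include <cassert>
#include <iostream>
using namespace std;

// Corrected sketch of average(). count is an added parameter, so the
// function no longer hard-codes how many elements the array holds.
int average(const int num[], int count)
{
    assert(count > 0);                // added check: avoids dividing by zero

    int total = 0;                    // was uninitialised in the original
    for (int i = 0; i < count; i++)   // was i<=3, which read num[3]
        total = total + num[i];
    cout << "Average is " << total / count << endl;
    return total / count;             // the original declared int but returned nothing
}
```

Called from main as average(n, 3), with n holding 1, 2 and 3, this prints Average is 2. The division is integer division, so a fractional mean would be truncated.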

The commands already introduced may be all you need for the time being if you're using short programs. Watching how the arrow moves may alert you to situations where the control flow is unexpected, and the ability to observe particular variables saves you having to put extra cout commands in your program. Two tips -

As your programs grow in size, stepwise execution will become more tedious. You need ways of being more selective in your investigations. You can set breakpoints in particular routines, and take bigger steps that don't show what happens in function calls. Browse through the menus to get an idea of the other facilities available. You'll never use most of them, but it's nice to know that they're there. Note that ddd copes happily with multiple source files even if they're in different languages.

C/C++

C++ has all of C's potential for bugs, plus some additions of its own, so this section is also useful for C++ programmers.

Judicious use of print statements can be the best way to work. Remember however that standard output is buffered, so it might not appear on the screen straight away. Using endl in C++ will force output to the screen. Alternatively, use cerr in C++ (fprintf(stderr, ... in C) which is unbuffered. Using the error channel is also useful because it lets you separate the debug output from the normal output. If your program is called foo then on Unix

      ./foo 2>errorlog

will put the error output (but only the error output) into a file called errorlog.
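A minimal sketch of keeping the two kinds of output apart: the result goes to cout and the debug message goes to cerr. The function name percentage is hypothetical and serves only as an illustration.

```cpp
#include <cassert>
#include <iostream>
using namespace std;

// Hypothetical example: normal output on cout, debug output on cerr.
int percentage(int part, int whole)
{
    assert(whole != 0);   // avoid dividing by zero

    // Debug message: sent to the error channel, which is unbuffered.
    cerr << "debug: percentage(" << part << ", " << whole << ")" << endl;

    int result = 100 * part / whole;

    // Normal output: sent to standard output.
    cout << "Result is " << result << "%" << endl;
    return result;
}
```

If main calls percentage(1, 4) and the program is called foo, then running ./foo 2>errorlog should leave Result is 25% on the screen and put the debug line into errorlog.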

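Debug output can become so voluminous that it defeats its purpose. The preprocessor offers some features to control the quantity of output and provide more detail. The code below shows two examples: a message that is printed only when DEBUG is defined, and one that is printed only when DEBUGLEVEL is greater than 50.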

  #include <iostream>
  using namespace std;

  #define DEBUG
  #define DEBUGLEVEL 100

  int main()
  {
    int i = 5;
  #ifdef DEBUG
    cerr << "This is a debug statement on line " << __LINE__ << endl;
  #endif
  #if DEBUGLEVEL > 50
    cerr << "This is an intense debug statement in file " << __FILE__ << endl;
  #endif
  }
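One way to avoid writing an #if block around every message is to wrap the test in a macro. The sketch below is an assumption about how this might be done, not part of the example above. The macro name DEBUG_MSG and the function sum_to are hypothetical.

```cpp
#include <cassert>
#include <iostream>
using namespace std;

// Assumed convention: messages whose level is above DEBUGLEVEL are suppressed.
// The value can be overridden at compile time, e.g. g++ -DDEBUGLEVEL=50
#ifndef DEBUGLEVEL
#define DEBUGLEVEL 100
#endif

// Hypothetical macro: prints msg to cerr, tagged with file and line,
// but only if level <= DEBUGLEVEL. The do/while(0) wrapper makes the
// macro behave like a single statement.
#define DEBUG_MSG(level, msg) \
    do { \
        if ((level) <= DEBUGLEVEL) \
            cerr << __FILE__ << ":" << __LINE__ << ": " << msg << endl; \
    } while (0)

// Hypothetical function used to show the macro in action.
int sum_to(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
        DEBUG_MSG(100, "i=" << i << " total=" << total);     // detailed message
    }
    DEBUG_MSG(10, "sum_to(" << n << ") returns " << total);  // summary message
    return total;
}
```

With DEBUGLEVEL left at 100 both kinds of message appear; compiling with -DDEBUGLEVEL=50 keeps only the summary, and -DDEBUGLEVEL=0 silences both.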

Fortran

Features available will depend on the version you have. The Teaching System's g77 compiler has a -g option (so you can use ddd on the resulting executable) and various options (e.g. -Wall to provide more warnings). Further options are sometimes available too. Mail Pete Clarkson if you have problems. Note that using Fortran 90 may help reduce errors.

Matlab

MathWorks' "Techniques for Debugging MATLAB M-files" is useful.

The Web

Web applications may present difficulties because of the number of components that may be involved. If there's a problem (in particular a performance-related one) the cause could be the network, the database, the web server (with modules for PHP support), the web proxy-server or the browser itself (which comes in several varieties, each with different plug-ins, Java interpreters and so on). See Testing Web applications for some suggestions.

Crashes

Some common run-time errors are

Your program may also crash if it unexpectedly hits a limit of some kind.

The error message won't usually tell you the line that caused the crash, or even the function that the line's in, but if you run the program within a debugger, you'll get more information. For example, if a program tries to write to memory starting at address 0, running it in ddd shows you the last line executed and lets you find out variable values.

If a program runs into serious problems it may crash, giving a Core Dumped message. This means that a big file called core may be created in the directory from where you ran the program. The program may even crash before it reaches the first line of executable code. The resulting core file can be investigated using the adb program, which can be used even if you haven't compiled using the `-g' flag. If a program called testing crashes, typing adb testing, then $c will give a backtrace of routines called, which may help you localize the bug. Get out of adb using $q.

Unreliable bugs

Some bugs cannot be reliably provoked. The effect of some bugs is random - sometimes the program will continue running, sometimes it won't. Sometimes the bug "goes away" if you add some cout statements to debug the code, or if you recompile with the -g flag as preparation for using the debugger. This could be because

Memory Problems

Two common causes of crashes relate to memory. Tools exist to help detect these problems. One example is Valgrind, which works on x86-based Linux systems (CUED's teaching machines, for example). It includes Memcheck, which can detect problems such as writing beyond the end of an allocated block and memory leaks. If the following program is compiled to produce a program called testing
  #include <stdlib.h>
  #include <stdio.h>
  #include <string.h>

  int main()
  {
    char *str1=(char*)malloc(5);
    char *str2=(char*)malloc(5);
    strcpy(str2,"four");
    strcpy(str1,"twelve bytes");
    printf("str1=%s\n", str1);
    printf("str2=%c\n", str2);
  }
then
  valgrind testing
will produce a lot of output, including things like
  Address 0x1BB50034 is 7 bytes after a block of size 5 alloc'd
  ...
  LEAK SUMMARY:
     definitely lost: 10 bytes in 2 blocks.
etc.
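Both problems reported above (writing past the end of a 5-byte block, and never freeing the two blocks) can be avoided in C++ by letting std::string manage the memory. The following is a sketch of that alternative, not the original author's fix. The function name describe is hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <string>
using namespace std;

// Hypothetical rewrite of the string handling above. std::string grows
// to fit its contents and releases its memory automatically, so there is
// no fixed-size block to overrun and nothing to free by hand.
string describe(const char* first, const char* second)
{
    assert(first != nullptr && second != nullptr);

    string str1 = first;    // copies the text, however long it is
    string str2 = second;
    return "str1=" + str1 + "\nstr2=" + str2 + "\n";
}
```

A main that runs cout << describe("twelve bytes", "four"); prints both strings in full, and under valgrind such a program should no longer report the invalid write or the two lost blocks.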

See Also

© Cambridge University Engineering Dept
Information provided by Tim Love (tpl)
Last updated: October 2010