C++ course - Special Topics

I/O and formatting

The way that text and numbers are output can be controlled.

#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;

int main()
{
  int i=10;
  cout << "i = " << i << " (default)\n" ;
  cout << hex << "i = " << i << " (base 16)\n";

  double d = sqrt(7.0);
  cout << "d=sqrt(7.0)=" << d << endl;
  cout.precision(3);
  cout << "After cout.precision(3), d=" << d << endl;

  return 0;
}

C++ offers some surprises. For example the following code doesn't print 111 - what does it print?

#include <iostream>
using namespace std;
int main() {
  int i=001+010+100;
  cout << i << endl;
}

There are many other routines too. See C++ Input/Output for details.

I/O Quiz

More surprises can arise when using I/O. Look at the following code (the line numbers aren't part of the code)

#include <iostream>               [0]
using namespace std;              [1]
int main()                        [2]
     {                            [3]
     int hex=16;                  [4]
     cout << hex << 257 << endl;  [5]
     }                            [6]

I/O Exercises

Write a program that reads integers (one per line) from a text file and sums them up.
Write a program that analyses text files - count the words and sentences; work out the average word length and sentence length; determine the frequency of each letter.

If you're using any of the maths routines (sqrt, sin, etc) remember that you'll need to include cmath to declare the routines. Here's a little example that uses various methods (some better than others) to determine the value of pi -

#include <iostream>
#include <cmath>
using namespace std;

int main()
{
  cout.precision(10);
  cout << M_PI << endl;
  cout << 4*atan(1.0f) << endl;

  float quarter_pi=0;
  int sign=1;
  for (int i=1;i<1000;i++) {
    quarter_pi+= sign/(i*2.0-1);
    sign=sign*-1;
  }
  cout << 4*quarter_pi << endl;

  float sixth_pi_squared=0;
  for (int j=1;j<1000;j++)
    sixth_pi_squared+= 1.0/(j*j);
  cout << sqrt(6*sixth_pi_squared) << endl;
}

The results are all different! There are several reasons for this.

The limits of representation

Floats are stored in a fixed number of bytes, so there's a limit to how accurately pi can be represented. Also many rational numbers can't be represented exactly. Just as 1/3 can't be written exactly in decimal, so values such as 0.1 can't be written exactly in base 2. Try the following

#include <iostream>
using namespace std;
int main() {
  double a=24.45;
  cout.precision(50);
  cout << "a =" << a << endl;
}

For me, it produces this

  a =24.449999999999999289457264239899814128875732421875

The numbers that can be exactly represented are spread out over a very wide range, but not evenly. Representable numbers are densest near 0 (about half of all float values lie between -1.0 and 1.0) and become sparser as the magnitude grows towards infinity. There are range limits too that can be extended by using double or long double rather than float. You can use the information in <limits> or use sizeof(double), etc to compare the size of float, double and long double. Here's the code to print out the difference between 1 and the smallest value greater than 1 that is representable as a float

  #include <iostream>
  #include <limits>
  using namespace std;

  int main() {
    cout << numeric_limits<float>::epsilon() << endl;
  }

Keep in mind that there may be performance penalties when using higher precision. Also, the C++ standard only requires double to be at least as precise as float, so double does not necessarily increase the precision. On IEEE 754 implementations, however, double has 53 significant bits against float's 24, as well as a wider range.

In any case there is no point in using higher precision if the additional bits which will be computed are garbage anyway. Try running the following. On most Intel machines with g++ you can use g++ -m128bit-long-double to store long doubles in 16 bytes; this changes the storage size and alignment rather than the precision of the arithmetic.

   #include <iostream>
   using namespace std;
   int main(int argc, char* argv[]) {
     float f1;
     double d1;
     long double ld1;
     cout << "float length=" << sizeof(f1)
          << " double length=" << sizeof(d1)
          << " long double length=" << sizeof(ld1) << endl;
     f1=d1=ld1=123.45678987654321;
     cout.precision(50);
     cout << "f1=" << f1 << ", f1**2=" << f1*f1 << endl;
     cout << "d1=" << d1 << ", d1**2=" << d1*d1 << endl;
     cout << "ld1=" << ld1 << ", ld1**2=" << ld1*ld1 << endl;
     return 0;
   }

Here's another program that's worth running. On our machines, it produces "Not Equal ..."

   #include <iostream>
   using namespace std;
   int main() {
     float x=3.0/7.0;
     if (x==3.0/7.0)
        cout << "Equal\n";
     else {
        cout << "Not Equal\n";
        cout.precision(20);
        cout << "x=" << x << " but 3.0/7.0 =" << 3.0/7.0 << endl;
     }
   }

The trouble is that the double-precision representation of 3.0/7.0 (which is used on the RHS of the comparison because double-precision is the default) isn't the same as the single-precision form used on the LHS of the comparison. In the middle of a big program such problems can be hard to detect so it's worth being aware of the problems from the start.

Calculation problems

Partly because the representation issue affects intermediate values, calculations don't always produce the expected result. Just as in base 10 to 2 decimal places, 3 * (10/3) doesn't equal 10 and (0.55/100)*100 doesn't equal 0.55, so computer arithmetic has its limitations - 0.3-0.2-0.1 doesn't equal 0 (try it!). Try guessing what the following program will print

#include <iostream>
using namespace std;

int main() {
  double number = 0.0;
  double exact[]={0.0, 0.1, 0.2, 0.3, 0.4 ,0.5, 0.6, 0.7, 0.8, 0.9, 1.0};
  for (int i=0;i<11; i=i+1) {
    if(number == exact[i])
      cout << exact[i] << " matches "<< endl;
    number+=0.1;
  }
}

Even adding or subtracting the same numbers in a different order may lead to a different result. In Matlab, doing

   4/3 - 1/3 - 1

produces 0, but

   4/3 -1 - 1/3

doesn't. Try it in C++!

So, before you do any heavy computation, especially with real numbers, I suggest that you browse through a Numerical Analysis book or look at the University computing service's course on computer arithmetic. Extensive course notes are online - see How Computers Handle Numbers.

Things to avoid are

Finding the difference between very similar numbers (if you're summing an alternating series, add all the positive terms together and all the negative terms together, then combine the two). The following calculations in C++ (or Fortran, or Matlab) on most machines produce the following answers

Calculation              Answer
1.2219+0.003-1.2249      -2.2204e-016
(1.2249-1.2219)-0.003    1.1362e-016
1.2229+0.003-1.2259      0

Dividing by a very small number (try to change the order of operations so that this doesn't happen).
Multiplying by a very big number.


Common problems that you might face are :-

Testing for equality

It's especially risky to test for exact equality. Better is to use something like

  d = max({1.0, fabs(a), fabs(b)});

and then test fabs(a - b) / d against a relative error margin. Useful constants in <cfloat> are FLT_EPSILON, DBL_EPSILON, and LDBL_EPSILON, each defined as the difference between 1 and the next larger representable value of its type, so that

 1.0f + FLT_EPSILON  != 1.0f
 1.0  + DBL_EPSILON  != 1.0 
 1.0L + LDBL_EPSILON != 1.0L

respectively.

Over- and underflow

This is when a calculation produces a number whose magnitude is too big or too small for the type to represent. You can test the operands before performing an operation in order to check whether the operation would work. You should always avoid dividing by zero. For other checks, split up the numbers into fractional and exponent part using the frexp() and ldexp() library functions and compare the resulting values against HUGE_VAL (all in <cmath>).

Infinity

The IEEE standard for floating-point maths recommends a set of functions to be made available. Among these are functions to classify a value as NaN, Infinity, Zero, Denormalized, Normalized, and so on. Since C++11 the standard <cmath> header provides these as isnan(), isinf(), isfinite(), isnormal() and fpclassify(), along with the NAN and INFINITY macros; older implementations often used their own names and predefined identifiers (such as _NaN, _Infinity, etc). With the g++ compiler you can do "d=NAN;" to set a variable to NaN, and use isnan(d) to check a variable.
If x is a floating point variable, then (x != x) will be true if and only if x has the value NaN. Some C++ implementations claim to be IEEE 754 conformant, but if you try the (x!=x) test above with x being a NaN, you'll find that they aren't.

Algorithm problems

There are many algorithms to calculate pi - some work (converge) faster than others. Especially with an iterative method one can reach the stage where rounding errors accumulate to the point that further iterations make the result worse rather than better.

Machine differences

The IEEE standards allow for some flexibility. You shouldn't expect all computers and compilers to produce the same numerical results from the same source code. For example, some chips use 80-bit intermediate values when calculating with 64-bit numbers. Compilers might let you control such behaviour - g++ for example has hardware-dependent options -mno-fancy-math-387, -msse, etc.

Before you start writing much maths-related code, check to see that it hasn't all been done before. Many maths routines, including routines that offer arbitrary precision, are available from netlib. Also try the IEEE 754 Converter and read Computer numbering formats, Signed number representations, Floating-point Basics and, if you're really keen, What Every Computer Scientist Should Know About Floating-Point Arithmetic.

Other resources are listed on CUED's maths page.

Maths libraries based on the Standard Library

Template Numerical Toolkit
Boost Graph Library
Blitz++ (library for scientific computing)

Maths Quiz

Look at the following code (the line numbers aren't part of the code)

#include <iostream>              [0]
using namespace std;             [1]
int main()                       [2]
{                                [3]
  float f= 0.3;                  [4]
  float g = f/2;                 [5]
                                 [6]
  if (0.15 == g)                 [7]
    cout << "g is 0.15" << endl; [8]
  else {                         [9]
    cout.precision(8);          [10]
    cout << "g=" << g << endl;  [11]
  }                             [12]
}                               [13]

Maths Exercises

Write a program that invites the user to type in a number then uses the sqrt() routine on that number. Use the behaviour of sqrt() to report whether the given number is negative.
Write a program that invites the user to type in a complex number then calculates the square root of that number. Use complex<float>.
Which of 117/9 and 11.7/.9 is the bigger?

Speed

The easiest way to make your code faster is to read the compiler documentation. g++ has over 150 optimisation features that can be independently controlled. Using -O turns on several of these features - "the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time". 30% improvements are common. The -O3 option is more aggressive. Using "-Os" optimizes for size. For other more labour-intensive possibilities see

Design

Stroustrup writes that the last section of his book aims

to bridge the gap between would-be language-independent design and programming that is myopically focussed on details. Both ends of this spectrum have their place in a large project, but to avoid disaster and excessive cost, they must be part of a continuum of concerns and techniques.

Because of C++'s O-O and generic abilities, there's a chance that for small programs the design concepts can trivially be translated into class declarations.

Programs vary considerably in their composition. I've seen these figures quoted

Coding takes 10% of the total time. Debugging takes 50%.
The Graphical User Interface is 80% of the code
Error handling can be 50% of the code

Complexity is the enemy, so

Divide and conquer.
Use modules - namespaces or files (helps the optimiser too).

Don't re-invent the wheel

Copy models
Adapt existing parts
When making new parts design them for re-use

Makefiles

If you have many source files you don't need to recompile them all if you only change one of them. If you write a makefile that describes how the executable is produced from the source files, the make command will do the rest of the work for you.

The following makefile says that pgm depends on two files a.o and b.o, and that they in turn depend on their corresponding source files (a.cc and b.cc) and a common file incl.h:

   pgm: a.o b.o
       aCC a.o b.o -o pgm
   a.o: incl.h a.cc
       aCC -Aa -c a.cc
   b.o: incl.h b.cc
       aCC -Aa -c b.cc

Lines with a `:' are of the form

target : dependencies

make updates a target only if it's older than a file it depends on. The way that the target should be updated is described on the line following the dependency line (Note: this line needs to begin with a TAB character).

Alternatively, you can use an IDE (Integrated Development Environment) to help manage your project.
