Search Contact information
University of Cambridge Home Department of Engineering
University of Cambridge >  Engineering Department >  computing help

Faster, longer programs, and optimisation

I'll deal with general issues first, then continue towards the particular. Some of the easier things to do also result in the biggest improvements

How computers work

Memory

There's a memory hierarchy - the smallest, fastest memory is in the CPU. Much more memory is on the disc, but it's slower

Very roughly speaking, the CPU is 8* faster than the L2 cache which is 256* faster than RAM which is 80* faster than Disc.

Best case? (Doing the same operations in a different order may triple the speed of calculation).

Worst case?

CPU

Nowadays there are "dual-core" CPUs, and computers can have more than one CPU (our Linux Servers each have 4). Unless you're a low-level programmer though, you won't be able to take explicit advantage of this potential for parallel programming (unless you're a matlab user - see Matlab Parallelisation). See the Parallel Programming page if you're interested. If you're writing programs that use huge amounts of memory, it's worth knowing whether the CPU's are "32 bit" or "64 bit". The biggest memory address that a "32 bit" CPU can access is about 4G at the very best, so there's no chance of having arrays that big. The memory range of a "64 bit" CPU is far greater - the teaching system machines have 64 bit CPUs.

On the teaching system you can get a rough idea of a programs' CPU and memory usage by running the Computer panel's system monitor screendump

Especially for specialised number-crunching areas, power can be obtained fairly cheaply. Some groups in the department use big clusters of cheap machines to run programs, but unless you belong to those groups you won't be able to use these machines. Alternatively, in the Guardian (Feb 2008) an academic said he could run a simulation 130 times faster on a PlayStation 3 (which has several CPUs/accelerators) than on his desktop PC. Some CUED people use CUDA (a library that uses an nVidia board) to get 10x speed-up. There's a MATLAB plug-in for CUDA

Big Issues

Do estimates of time/space requirements as early as possible. Ask yourself

Identifying the bottleneck

Slow graphics? Slow disc access? Is a particular routine slow? On Unix, typing top will give some statistics about programs that are running (how big they are, how much CPU they're using). To get fine-grain information you need to use a "profiler". On matlab this is easy. If your main program is called snail, then typing the following inside matlab -

profile on 
snail   
profile plot
profile off
will display a histogram showing a function-by-function breakdown of performance. With fortran or C++ on Unix, you'll need to compile using the "+g" option, run your program (called snail for example), then run prof snail. You'll get output looking a little like this
 %Time Seconds Cumsecs  #Calls   msec/call  Name

  32.4    0.35    0.35                      __strlen20
  23.1    0.25    0.60 5770000        0.00  _strlen
  20.4    0.22    0.82   10000        0.02  slow
  11.1    0.12    0.94   10000        0.01  fast
  10.2    0.11    1.05                      _mcount
   1.4    0.01    1.06                      _monstartup

Coding

Trade-offs - Speed/Size/Readability

Space

The smaller you make your data structures (and the more localised the access to them) the better. If your process (i.e. the running program) grows too big, if will run slowly and may become too big to run. Asking for more disk quota won't help. See Memory Optimisation for system-specific and language-specific details.

Running long jobs

See Also

© Cambridge University Engineering Dept
Information provided by Tim Love (tpl)
Last updated: January 2012