Department of Engineering

IT Services

Scientific Computing

This document is about writing computing-intensive (usually numerically oriented) programs.

Languages

Not all of a program is going to be number-crunching: you need to get the data in, call other programs, and display the results. The Computing Service suggests Python as the "glue" language. At CUED we also have MATLAB, which, although not free, has good graphics capabilities and thousands of functions. Both MATLAB and Python may be fast enough to do all but the most intensive parts of your work.
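
As a rough illustration of the "glue" role, the sketch below uses Python's standard subprocess module to pass data to an external program and read back its result. The external program here is only a stand-in (a second Python interpreter that sums squares); SOLVER_CMD and run_solver are hypothetical names, and in practice the command would point at your own compiled Fortran or C++ executable.

```python
import subprocess
import sys

# Hypothetical stand-in for a compiled number-crunching program: a second
# Python interpreter that reads numbers on stdin and prints the sum of squares.
SOLVER_CMD = [
    sys.executable,
    "-c",
    "import sys; print(sum(float(x) ** 2 for x in sys.stdin.read().split()))",
]


def run_solver(values):
    """Send values to the external program and parse the number it prints."""
    text = " ".join(str(v) for v in values)
    completed = subprocess.run(
        SOLVER_CMD,
        input=text,            # data in
        capture_output=True,   # collect what the program prints
        text=True,             # work with str rather than bytes
        check=True,            # raise if the program exits with an error
    )
    return float(completed.stdout)


result = run_solver([1.0, 2.0, 3.0])
print(result)
```

Passing data as text over stdin/stdout is the simplest option; for large data sets a file in a binary format is usually a better interface.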

You may well use Fortran or C++ for the number-crunching work - see

How Computers Work

It helps to know the basics of how computers work. Read up about
  • Memory/cache - doing the same operations in a different order may triple the speed of calculation because of cache effects
  • CPU/cores - our Linux servers each have 4 dual-core CPUs - how can these be exploited? It's hard to write a program that explicitly exploits dual-core, multi-CPU systems. Languages like C Omega and F# are being developed to assist programmers, but for now you'll have to hope that the operating system exploits the possibilities. If you're lucky, newer releases of the program you use might be written to exploit parallelism transparently (newer releases of Photoshop sometimes do). A few programs (e.g. the newest MATLAB) may offer you the chance to exploit parallelism explicitly.
  • Processes/threads - if you're working at a low level, you might be able to explicitly control process/thread behaviour
  • Numbers - how do computers store numbers? See
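
To make the last point concrete, here is a minimal sketch of how a floating-point number is laid out, assuming (as is the case on essentially all current platforms) that a Python float is a 64-bit IEEE 754 double. The helper double_bits is a hypothetical name introduced for illustration.

```python
import struct
import sys


def double_bits(x):
    """Return the 64-bit pattern of a Python float as a string of 0s and 1s."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    return f"{as_int:064b}"


# A double has 1 sign bit, 11 exponent bits and 52 stored fraction bits.
bits = double_bits(1.0)
sign, exponent, fraction = bits[0], bits[1:12], bits[12:]
print(sign, exponent, fraction)

# 0.1 has no exact binary representation, so the nearest double is stored.
print((0.1).hex())          # 0x1.999999999999ap-4
print(0.1 + 0.2 == 0.3)     # False

# With 53 significant bits, not every integer above 2**53 is representable.
print(sys.float_info.mant_dig)             # 53
print(float(2**53 + 1) == float(2**53))    # True
```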

Programming

  • If you're doing lots of calculations you'll benefit from reading about Numerical Analysis. It shows you how to solve problems as quickly and accurately as possible - e.g. adding the same numbers in a different order may give a more accurate answer, because (x+y) + z doesn't necessarily equal x + (y+z) in floating-point arithmetic. In several areas of Numerical Analysis little has changed for decades, though new algorithms have been developed (e.g. FFT), and new issues have emerged (e.g. cache-friendliness, parallelisability). Be prepared to interface with code in other languages - a substantial body of well-tested libraries has been written in older languages and dialects (C and Fortran 77).
    More recently, the IEEE has produced standards for floating-point behaviour, and packages like LAPACK have made routines more easily available.
  • For C/C++ programmers, the Maths section of our C++ course might be useful
  • Read about Basic optimising and profiling
  • Read about Memory optimising
  • The Grid Engine page has some practical tips
  • If you're using Matlab you'll find Matlab - faster scripts useful.
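
The remark above about the order of additions can be checked directly. The following is a small sketch using only the Python standard library; the particular values are chosen for illustration, and math.fsum is shown as one readily available way to reduce rounding error, not as the only remedy.

```python
import math

# Floating-point addition is not associative.
x, y, z = 1e16, -1e16, 1.0
left = (x + y) + z    # 0.0 + 1.0, so 1.0
right = x + (y + z)   # the 1.0 is lost when added to -1e16, so 0.0
print(left, right)

# Order matters in longer sums too: one large term and many tiny ones.
big = 1.0
small = [1e-16] * 10_000

big_first = big
for s in small:
    big_first += s            # each 1e-16 is below half an ulp of 1.0 and vanishes

small_first = sum(small) + big    # tiny terms accumulate before meeting 1.0

accurate = math.fsum([big] + small)   # tracks the lost low-order bits

print(big_first == 1.0)           # True
print(small_first > big_first)    # True
print(accurate)
```

Summing the small terms first is the cheap fix here; compensated summation (which math.fsum provides) helps when the terms cannot easily be ordered.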

Managing Large Amounts of Data

  • Programs like MATLAB can help you visualize your output.
  • A generalised data format like HDF5 might help with storing, sharing, and displaying data.

Available Hardware

Our Teaching System offers Grid Engine. Your group may have its own facilities. Two recent developments are:

  • Use of graphics cards (or even games consoles) to do fast floating-point operations. They work, but they're not IEEE-compliant and they generate a lot of heat. See CUDA for a C language environment that provides access to the processing power of NVIDIA GPUs.
  • Use of clusters - some groups use InfiniBand for fast inter-machine communication

Support

The University provides support. The Scientific Computing series - CS Programming courses - is useful.

The Centre for Scientific Computing offers courses and facilities.