Shell Scripts

This talk offers a quick introduction to "shell scripts".

Interpreted computing languages (languages that don't need to be compiled) are increasingly popular. Examples include Python (used by Google!), Perl, PHP, JavaScript and Ruby. Development times can be short, and it's easier to implement concepts - you don't lose mental focus by having to compile, etc. The resulting programs might not be as fast as those written in C++, but often that doesn't matter.

Some of these languages (PHP for example) are used on web pages.

"The shell" is the program that reads what you type on the Unix command line. As well as dealing with line-editing and wild-card characters, it also has a language of its own. Shell Scripts are text files containing commands that the shell can run. If you've used the Unix command line and you've already written programs in some language, you might not find Shell Scripts too hard. Several shells exist - the Korn shell, the Bourne shell (called sh), the C shell, etc. I'm using bash, the "Bourne-Again SHell". To use any shell you'll need to familiarize yourself with

some common Unix commands (like ls, etc), and be aware that they can take various options.
a few shell constructions (like for, if, etc)
using shell variables, and some facilities that are especially useful when writing scripts (like command substitution, piping, etc)

Some Unix commands that you might not use much from the command line (like cut, wc etc) are especially useful in shell scripts. Once you start combining these ideas you can soon create useful programs. So today I'll look at some commonly used commands then show how to combine them to create little utilities.

Firstly then, here are some quick examples using the ls, head, find and sort commands. Use the man command to find out more - note, however, that these manual pages aren't always easy to read.

ls -t | head -5 (list the 5 newest files in the folder)
ls -l | sort -g -r -k5 | head (list the 10 biggest files in the folder)
find . -maxdepth 3 -atime +2 -type d (list the folders - not regular files - that are 3 levels or less below the current folder and were last accessed more than 2 days ago; use -mtime instead of -atime to test when they were last modified)
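
As a hedged sketch of how pipelines like these behave, the following script builds a throwaway folder and runs two of them on it. The folder comes from mktemp, and the file names and dates are made up purely for illustration.

```shell
#!/bin/bash
# Work in a throwaway folder so that nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create three empty files with invented modification dates (YYYYMMDDhhmm).
touch -t 202001010000 old.txt
touch -t 202101010000 middle.txt
touch -t 202201010000 new.txt

# ls -t sorts by modification time, newest first; head -2 keeps the first 2 lines.
newest=$(ls -t | head -2)
echo "The 2 newest files:"
echo "$newest"

# find with -mtime +2 lists files last modified more than 2 days ago.
# find's output order isn't defined, so sort it to get a stable listing.
oldfiles=$(find . -type f -mtime +2 | sort)
echo "Files modified more than 2 days ago:"
echo "$oldfiles"
```

Because all three files carry dates several years in the past, the find command should list every one of them.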

Basic Ideas

The main material is in the Shell Scripts and Awk handout. Here are some examples that use the basic ideas mentioned above.

Redirection - to put the output of a program into a file you can do something like
date > outputfile
To append to an existing file use >> instead of >
Using variables - to set and display variables you do this
i=6
echo $i
Typing printenv will show the variables that the shell uses.
Arithmetic - if you do
a=2
b=3
c=$a+$b
echo $c
you'll get
2+3
The trick is to use let (or put the expression within (( ... )) brackets).
a=2
b=3
let c=$a+$b
echo $c
will give
5
Piping - the output of a program can be fed into another program rather than displayed on screen. Suppose you wanted to know how many files were in the current folder. You could type ls and count the files, but wc will do the counting for you (wc stands for "word count", but it counts lines and characters too; we'll make it output just the number of lines).
ls | wc -l
will print the answer.
Command Substitution - the output of a command can be stored in a variable. If you want to save the previous example's answer into a variable you could do
numberOfFiles=$(ls | wc -l)
Scripts - suppose you wanted to find out how many people called Smith were on your system. You could do grep Smith /etc/passwd and count the lines of output. Suppose instead you wanted to write a program called howmany, so that typing ./howmany Smith would print out something like "Number of users called Smith=17". How would you do it? Create a file called howmany and in it type
grep $1 /etc/passwd | wc -l
Save the file, and make it executable by doing chmod a+x howmany. When you type ./howmany Smith, the $1 in the script will have the value of the 1st parameter given to the script (namely "Smith"), so a number should be printed showing how many lines in the password file contain the word "Smith". The output can be made more readable by changing the file contents to
num=$(grep $1 /etc/passwd | wc -l)
echo "Number of users called $1=$num"
As a final development we'll change the script so that it can accept more than one name. We'll use a for loop. Here's the code
for name in $*
do
    num=$(grep $name /etc/passwd | wc -l)
    echo "Number of users called $name=$num"
done
$* expands to $1 $2 etc. The first time round the loop, name has the value of $1 (the first argument). The second time round the loop, name has the value of $2, etc, so ./howmany Smith Brown should work ok, though to finish the job we'd need to cope with the possibility of there being no users with those names.
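
The ideas above (variables, arithmetic, command substitution and a for loop) can be combined in one small sketch. It is not the definitive howmany. It assumes a hypothetical function name, howmany_in, and searches a made-up sample file rather than the real /etc/passwd so that the result is predictable. It also swaps in two variations: grep -c counts matching lines directly instead of piping to wc -l, and "$@" replaces $* so that arguments containing spaces stay intact. Arithmetic is done with $(( ... )), a close relative of the (( ... )) form mentioned earlier.

```shell
#!/bin/bash
# howmany_in FILE NAME... : count the lines of FILE that contain each NAME.
# (howmany_in is an illustrative name, not a standard command.)
howmany_in() {
    local file=$1
    shift                      # drop FILE so that "$@" holds only the names
    local total=0 num name
    for name in "$@"
    do
        # grep -c prints the number of matching lines. It exits with a
        # non-zero status when that number is 0, hence the "|| true".
        num=$(grep -c -- "$name" "$file" || true)
        if [ "$num" -eq 0 ]
        then
            echo "No users called $name"
        else
            echo "Number of users called $name=$num"
        fi
        total=$(( total + num ))   # arithmetic expansion
    done
    echo "Total=$total"
}

# Invented sample data in the style of /etc/passwd.
sample=$(mktemp)
printf '%s\n' \
    'asmith:x:1001:1001:Alice Smith:/home/asmith:/bin/bash' \
    'bsmith:x:1002:1002:Bob Smith:/home/bsmith:/bin/bash' \
    'cbrown:x:1003:1003:Carol Brown:/home/cbrown:/bin/bash' > "$sample"

howmany_in "$sample" Smith Brown Jones
```

With this sample file the names Smith, Brown and Jones match 2, 1 and 0 lines respectively, so the last line printed is Total=3.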

Processes

A process is a program - not the lifeless bytes on a DVD but the active, running program. In this section I'll briefly illustrate some features of process creation.

Foreground/Background processes and job control

sleep is a program that does nothing, but it's useful when giving demonstrations. If you do

sleep 1000

you won't get a command-line prompt back for 1000 seconds - the process is run "in the foreground". But often you want several programs running at once. Type

sleep 1000 &

and you'll get a command-line prompt back straight away, along with 2 numbers - something like

[1] 5677

The number in square brackets tells you which "background process" you've just started (in this case it's the 1st one). The other number is the process ID (each process on the system has a unique identity number). You can manipulate processes. If you do

fg %1

you'll move background process 1 (the sleep process) into the foreground. If you then type Ctrl-Z (i.e. hold the Control key down and hit Z) you'll suspend the process. You can push it into the background using

bg %1

and kill it while it's in the background using

kill %1

To kill the foreground process you just do Ctrl-C. You can run many background processes simultaneously.
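
The commands above are meant for interactive use. Inside a script, fg and bg are usually unavailable because bash normally switches job control off there, so a script typically works with the process ID instead. The following is a minimal sketch under that assumption; the variable names pid and status are just illustrative.

```shell
#!/bin/bash
# Start sleep in the background, exactly as on the command line.
sleep 1000 &

# $! holds the process ID of the most recent background command.
pid=$!
echo "Started sleep with process ID $pid"

# kill sends the TERM signal by default, asking the process to stop.
kill "$pid"

# wait collects the exit status of the finished process. For a process
# stopped by a signal, bash reports 128 plus the signal number (TERM is 15).
status=0
wait "$pid" || status=$?
echo "sleep ended with status $status"
```

Storing $! straight after launching the process matters, because its value changes each time another background command is started.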

Environments

Each process has many properties associated with it - an owner, a list of files that it's opened, the current folder, etc. When you start a new process (by typing xclock for example, or running a shell script) it will inherit many properties from its parent, but it's important to remember that the new process is mostly independent of the old. For example, if you write a script called GoToRoot containing the following code

cd /

(/ means "the 'root' folder in the tree of files") then run it from your home directory, you'll find that afterwards you aren't in the root (top) directory - you're still in your home directory. GoToRoot was run in a new process. To see that more clearly, you can change the file so that the process IDs are printed out -

echo "Parent ID is $PPID"
echo "ID is $$"
cd /

Type echo $$ to find the ID of the current shell, then run the script again. The new process moves to the root directory, but then the process dies, leaving the parent process (the original command line shell) untouched. If however you type

source ./GoToRoot

the contents of the file will be run within the current process, so this time you'll end up in the root directory.
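
The difference between running a script and sourcing it can be checked with a short experiment. This is a sketch only: it creates its own copy of GoToRoot in a throwaway folder (made with mktemp), and the variable names are illustrative.

```shell
#!/bin/bash
# Build the one-line GoToRoot script in a throwaway folder.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo 'cd /' > GoToRoot

start=$(pwd)

# Running the script starts a new process; its cd dies with it.
bash ./GoToRoot
after_run=$(pwd)

# Sourcing the script runs its commands in the current process.
source ./GoToRoot
after_source=$(pwd)

echo "Started in:        $start"
echo "After running it:  $after_run"
echo "After sourcing it: $after_source"

# Tidy up the throwaway folder.
rm -r "$workdir"
```

The second line of output should repeat the starting folder, while the third should show / because only the sourced commands affect the current shell.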

And finally ... more worked examples

changesuffix

Write a program that changes all files with a jpeg suffix to a jpg suffix.

People often try

mv *.jpeg *.jpg

It doesn't work - *.jpeg is expanded to a list of filenames, and so is *.jpg. The resulting command isn't useful. First you need to know how to change a particular filename (foo.jpeg, say). The trick is to use the basename command.

basename foo.jpeg .jpeg

will print foo, so if a filename like foo.jpeg is stored in a variable f we could do

newf=$(basename $f .jpeg).jpg

All we need to do is put this in a loop

for f in *.jpeg
do
    # the quotes keep filenames that contain spaces in one piece
    newf=$(basename "$f" .jpeg).jpg
    mv "$f" "$newf"
done
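
As an alternative sketch, bash can strip the suffix itself using the parameter expansion ${f%.jpeg}, which removes a trailing .jpeg from the value of f without running basename. This is a swapped-in technique, not the method used above. The example works in a throwaway folder with made-up filenames, one of which contains a space to show why the quotes matter.

```shell
#!/bin/bash
# Work in a throwaway folder with invented example files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "cat.jpeg" "my holiday.jpeg"

for f in *.jpeg
do
    # ${f%.jpeg} is the value of f with the trailing .jpeg removed.
    newf="${f%.jpeg}.jpg"
    # -- stops mv treating a filename that starts with - as an option.
    mv -- "$f" "$newf"
done

ls
```

One caveat applies to both versions: if no file matches *.jpeg, bash by default leaves the pattern unexpanded, so the loop runs once with f set to the literal text *.jpeg and mv reports an error.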

sizeof

Write a program called sizeof that, given a program name, tells you the size of the program, e.g. "sizeof xclock" might display "xclock is 43344 bytes long".

First we need to create a file called sizeof and make it executable. One way to do that is to type
touch sizeof
chmod a+x sizeof
Next we need to think about what goes into the file. Let's experiment on the command line first. We need to be able to find a program given its name. The which command does this, e.g.
which xclock
displays
/usr/bin/xclock
How can we find the size of /usr/bin/xclock?
ls -l $(which xclock)
displays
-rwxr-xr-x 1 root root 43344 2008-11-21 03:16 /usr/bin/xclock
The 5th column is the size, and that's all we want, so we'll use cut, saying that ' ' (a space) is the separator
ls -l $(which xclock) | cut -f5 -d' '
We want to store the result of this in a variable (s say). We can do that using
s=$(ls -l $(which xclock) | cut -f5 -d' ')
That's solved our main problem.
Now let's get the script working. Inside the script, the name of the program we're looking for is $1 (because it's the first argument), and so the location of that program is $(which $1). So version 1 of our script is
s=$(ls -l $(which $1) | cut -f5 -d' ')
echo $1 is $s bytes long
That works if we type "sizeof xclock", but is messy if we just type "sizeof", forgetting to give an argument. Let's make sure that users provide exactly 1 argument by beginning the script as follows
if [ $# -ne 1 ]
then
    echo "$0 needs exactly 1 argument"
    exit 1    # a non-zero exit status signals failure
fi
We should also deal with situations where the argument isn't a program. The which command returns true if it finds the program and false otherwise. When we call which for this purpose we don't want it to display anything, so we'll send the output and errors to /dev/null (a black-hole for bytes). Our final program is
if [ $# -ne 1 ]
then
    echo "$0 needs exactly 1 argument"
    exit 1    # a non-zero exit status signals failure
fi
if which $1 >/dev/null 2>/dev/null
then
    s=$(ls -l $(which $1) | cut -f5 -d' ')
    echo $1 is $s bytes long
else
    echo "$1 isn't a program"
fi

Note that about half of this program deals with errors - par for the course.
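
One weakness of the approach above is that ls -l may pad its columns with extra spaces, and cut -d' ' treats every single space as a separator, so the size isn't always field 5. The following sketch sidesteps the problem by asking wc -c for the byte count directly. It is a variation under stated assumptions, not a replacement for the script above: it is written as a bash function so that it can be tried without creating a file, and it uses bash's type -P instead of which to look up the program on the PATH.

```shell
#!/bin/bash
# sizeof NAME : report the size in bytes of the program called NAME.
sizeof() {
    if [ $# -ne 1 ]
    then
        echo "sizeof needs exactly 1 argument" >&2
        return 1
    fi
    local location size
    # type -P prints the full path of NAME if it is a file on the PATH,
    # and fails (printing nothing) otherwise.
    if location=$(type -P "$1")
    then
        # wc -c counts bytes; tr removes the padding some systems add.
        size=$(wc -c < "$location" | tr -d ' ')
        echo "$1 is $size bytes long"
    else
        echo "$1 isn't a program"
        return 1
    fi
}

sizeof ls
```

Calling the function with a name that isn't on the PATH (sizeof no_such_program_xyz, say) should print the "isn't a program" message and return a failure status.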

storebyyear

Write a program called storebyyear that looks at all the files in the current folder, putting all those created in 2009 in a folder called "2009", and similarly for other years. It should create the folders if they don't already exist.

Firstly then, how do we get the year of a file? If the output of ls -l is in the following format

-rwxr-xr-x 1 root root 43344 2008-11-21 03:16 /usr/bin/xclock

we can use cut to pick out the "2008-11-21" field (the 6th field, where fields are separated by spaces), then cut again (getting the 1st field, where fields are separated by '-') to pick the "2008" part of this field. Note that the date ls -l shows is when the file was last modified. Then we can check to see if we need to make a new folder before copying the file. Here, without error checking, is a program.

for filename in *
do
    # pick out the date field, then the year part of it
    thedate=$(ls -l "$filename" | cut -f6 -d' ')
    theyear=$(echo "$thedate" | cut -f1 -d'-')
    if [ ! -d "$theyear" ]
    then
        echo "Making a $theyear folder"
        mkdir "$theyear"
    fi
    cp -i "$filename" "$theyear"
done

How could you adjust this so that it only deals with JPEG files?
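
One possible answer, offered as a sketch rather than the intended solution, is to loop over *.jpg *.jpeg instead of *. The version below also replaces the ls and cut steps with date -r, which prints a file's last modification time. That usage assumes GNU date, as found on most Linux systems; other versions of date interpret -r differently. The example files and their dates are invented, and the script works in a throwaway folder.

```shell
#!/bin/bash
# Work in a throwaway folder with invented files and dates (YYYYMMDDhhmm).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -t 200906011200 beach.jpg
touch -t 201003151200 snow.jpeg
touch -t 200906011200 notes.txt

for filename in *.jpg *.jpeg
do
    # If a pattern matches nothing it is left as literal text, so skip
    # anything that isn't a regular file.
    [ -f "$filename" ] || continue

    # Assumption: GNU date, where -r FILE means "use FILE's modification time".
    theyear=$(date -r "$filename" +%Y)

    if [ ! -d "$theyear" ]
    then
        echo "Making a $theyear folder"
        mkdir "$theyear"
    fi
    # Plain cp is used here; add -i to be asked before overwriting a file.
    cp "$filename" "$theyear"
done
```

After this runs, beach.jpg should have a copy in the 2009 folder and snow.jpeg in the 2010 folder, while notes.txt is left alone because it matches neither pattern.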

Department of Engineering

IT Services