Python 1A Mich booster talk
There are several Python features that you might have copied from one example to the next without knowing quite what they're for. This document strives to fill in some of these gaps in understanding and provide overviews. It's particularly aimed at 1st year CUED Mich students, whose recent questions have inspired this document.
Versions
Sooner or later you'll come up against problems caused by Python existing in several versions. The main split is between Python 2 and Python 3, but there are also differences between distributions (e.g. which packages and editors are provided).
If you install Anaconda on a Mac or a Linux machine you'll have at least 2 Python installations on your machine, which can become rather confusing. In particular you might find that installing a package for one Python doesn't make it available to the others.
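One way to find out whether a package is visible to the Python you're currently running is to ask that Python directly. The sketch below uses the standard importlib module. The helper name is_available and the package names are only illustrations, not part of the course material.

```python
import importlib.util
import sys

def is_available(package_name):
    """Return True if the Python that is running this code can find the package."""
    return importlib.util.find_spec(package_name) is not None

# Show which Python program is running, then check a few package names.
print("Running", sys.executable)
for name in ["math", "numpy", "no_such_package_xyz"]:
    print(name, "available:", is_available(name))
```

If the same check gives different answers in a notebook and in a terminal, the two are probably using different Python installations.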
There are also differences in behaviours depending on the operating system used (file-opening for example may differ between Windows and MacOS). So when you report a problem it's worth including information about which Python you're using, and on what type of machine. The following code gives some system information that may help
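    import sys

    print(sys.executable)    # which python program is running
    print(sys.version_info)  # which version it is
    print(sys.path)          # where it looks for packages
    print(sys.platform)      # the type of operating system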
If you're searching for information about (say) the len function in Python 3, it's worth searching for "python 3 len" rather than "python len".
Notebooks
Notebooks are only one way to run Python. Notebook files can be run locally or "in the cloud". In the Mich term you're running the files "in the cloud" (probably on a machine in the States, which might be overloaded in the afternoon). Notebooks have problems (and facilities like "%timeit") that aren't to do with Python. The option that restarts the "kernel" and re-runs the cells in order is the most reliable way to clear such problems.
You need to be aware that the order the cells are run in matters. Functions and variables created in one cell will be available in a later cell.
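The effect of cell order can be imitated in an ordinary program. In this sketch the comments mark where the (imaginary) cells begin, and the variable name radius is just an example.

```python
# "Cell 2" run too early: radius hasn't been created yet.
try:
    print(radius * 2)
    ran_ok = True
except NameError as err:
    ran_ok = False
    print("NameError:", err)

# "Cell 1": creates the variable.
radius = 5

# "Cell 2" run again, this time after cell 1.
print(radius * 2)
```

The second attempt works only because the line that creates radius has been run by then. That's why restarting the kernel and running the cells in order is a good test.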
Software Overview
Here's the software you're likely to use -
- python - a program that, given appropriate text (Python code), will make a computer do things. Most Linux machines have it installed by default, and many Macs do too - just open a terminal window and type python (on some machines the command is python3)
- Jupyter Notebooks - lets you use documents that contain live code, equations, visualizations and text. You can use them "in the cloud" (Google's Colaboratory for example) or install the facility on your own machine.
- Anaconda - a suite of programs related to Python (including python itself and Jupyter Notebooks)
- Visual Studio Code (VS Code) - an editor that can be installed alongside the Anaconda suite. You can use any editor to write Python code, but VS Code has some helpful Python-specific features.
- pip - a program that installs python packages (packages are add-ons, providing extra functions)
How to do the work
Some of the exercises are to give you coding practice, some are more to introduce you to examinable topics. The documentation helps you do the exercises and vice versa - i.e. after having done the exercises it's a good idea to read the documentation again.
When the course was DPO-based, students did 2 sessions a week and had programs marked each session so that they were never behind schedule. You will have to schedule your work and assess your own progress. Especially if you're new to programming, don't assume you can do everything in the last weekend before the deadline.
Checking your code
We've added assert lines to many of the programs. These lines help you check your program. When these lines run in Colab, a green tick will appear beside them if your code seems correct. If you see a red exclamation mark or a cross you probably need to fix something.
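Here's a rough idea of what an assert line does. The function area_of_square and the values tested are made up for this illustration. The asserts in the exercises will check different things.

```python
def area_of_square(side):
    return side * side

# An assert stays silent when its condition is True ...
assert area_of_square(3) == 9
assert area_of_square(0) == 0

# ... and raises an AssertionError when the condition is False.
try:
    assert area_of_square(3) == 6, "area_of_square(3) should be 9"
except AssertionError as err:
    print("Check failed:", err)
```

Passing the asserts shows only that your code gives the right answers for the cases tested, so it's still worth trying a few values of your own.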
But most importantly, read the question! After you've written a program go back and read the question again. If the question asks you to write a function, you need to write a function. If it says use recursion, then use recursion - if you don't know what recursion is, you need to read the coursework material again.
Functions
You're used to using functions like sin in calculators or on spreadsheets. Python lets you write your own functions. Suppose you want to write a function that returns the double of a number. You could try this
    def double():
        return number*2

    number=7
    print(double())
This forces the programmer to assign a value to a variable called number before calling the function. It's horrible. Much better is
    def double(number):
        return number*2

    print(double(7))
Note that when this version of the double function is called, you don't have to first create a variable called number even though the function uses a variable called number. Instead of "7" in this program you could create a variable (called "x", say), set it to a value and use double(x).
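A short sketch of that last point. The variable name x and the values are arbitrary.

```python
def double(number):
    return number * 2

x = 3.5
print(double(x))        # 7.0
print(double(x + 1))    # 9.0
print(double("ab"))     # abab
```

Whatever is put between the brackets in the call is known as number inside the function. The caller's variable can have any name, or there needn't be a variable at all.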
The following program illustrates several points
    def fun1():
        print("hello")

    def fun2():
        return "hello"

    print("Calling fun1 ...")
    fun1()
    print("Calling fun2 ...")
    fun2()
    answer1=fun1()
    answer2=fun2()
    print("Running fun1() returns", answer1)
    print("Running fun2() returns", answer2)
    answer3=fun2
    print("Running fun2 returns",answer3)
When run from the command line, it produces
    Calling fun1 ...
    hello
    Calling fun2 ...
    hello
    Running fun1() returns None
    Running fun2() returns hello
    Running fun2 returns <function fun2 at 0x7f9eec2edea0>
Note that
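- Though fun1 doesn't return anything (so answer1 is None), it prints hello every time it's called. Calling fun2() on its own line prints nothing when the program is run from the command line - the hello after "Calling fun2 ..." comes from the answer1=fun1() line. In a notebook, a bare fun2() on the last line of a cell would display the returned value, which makes printing and returning easy to confuse.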
- If you forget to put brackets on the end of a function name, it's not a syntax error, though you probably won't get the result you expect.
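The difference between a function's name with and without brackets can be probed with the built-in callable function. The variable names here are made up for the illustration.

```python
def fun2():
    return "hello"

without_brackets = fun2      # the function itself, not its result
with_brackets = fun2()       # the value that the function returns

print(callable(without_brackets))   # True
print(callable(with_brackets))      # False
print(without_brackets())           # hello
```

So without_brackets is just another name for fun2, and adding brackets to it calls the function.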
Debugging
If programs aren't working, don't just stare at the code as if it were a maths proof. Treat it more like a circuit whose inputs and outputs you can probe, or like a patient you can ask questions. Print out variables that might help you diagnose the problem. If you don't know whether something (x, for example) is a dictionary or a list, just ask Python by using print(type(x)). Print, don't think.
For example, if your program is trying to sum a series but the answer's wrong, don't just stare at the maths. Print out the 1st term. Is it right? Print out the 2nd term. Print out the sum of the 1st 2 terms. Soon you'll see what's wrong. It's a good idea to write the program in little steps, checking at each stage. A little program I looked at recently had 3 bugs in its 10 lines, which made debugging hard - fixing one bug didn't seem to make a difference, which was disappointing.
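As a sketch of this approach, here is a function that sums the first few terms of the series 1/n². The series, the function name partial_sum and the verbose option are all invented for this example. They aren't taken from the exercises.

```python
def partial_sum(nterms, verbose=False):
    """Sum 1/n**2 for n = 1 .. nterms, optionally printing each step."""
    total = 0.0
    for n in range(1, nterms + 1):
        term = 1 / n**2
        total += term
        if verbose:
            # Probe the program: is each term what you expect?
            print("n =", n, "term =", term, "running total =", total)
    return total

partial_sum(3, verbose=True)
```

You can check the first couple of lines of output by hand (1, then 1 + 1/4). Once they're right, switch the printing off.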
An old exam question showed students some code, asking them to list the syntax errors and bad features. See what you think about this -
max=9; if max=7 print("max is wrong"
Another strategy is to pretend you're the computer and go through your program line by line, keeping a note of variables' values, seeing if your estimates match what the computer says.
Understanding python error messages
The error messages that Python prints aren't always easy to understand. At least you'll be told the line and file in which the error was detected (though the error may be earlier). You may also be able to deduce what the problem is, but you'll need to understand the jargon. You could use a search engine to see what the message means. Here are examples of common bugs.
- Suppose this is in ex1.py

      def trysorting():
          numbers=[1,3,2]
          sortednumbers=numbers.sort()
          print(sortednumbers[1])


      trysorting()

  This defines a function trysorting then calls it. Running this file produces the following error message

      Traceback (most recent call last):
        File "ex1.py", line 7, in <module>
          trysorting()
        File "ex1.py", line 4, in trysorting
          print(sortednumbers[1])
      TypeError: 'NoneType' object is not subscriptable

  Note that this is a traceback - it shows you not just the line where the error was detected, but any function calls that led to the error. In this case trysorting() was run, which eventually caused line 4 to be run. Even though line 4 is singled out, the real cause might be earlier. The error message 'NoneType' object is not subscriptable sounds rather cryptic. A "subscriptable" object is an object that can be subscripted (i.e. something like [1] can be mentioned after its name). sortednumbers isn't a list (which is subscriptable). It's None, whose type is NoneType.
  The programmer probably thought that the numbers.sort() method would put a sorted list of numbers into sortednumbers. It doesn't. It sorts the values in numbers "in place" - i.e. the order of the values in numbers is changed - and returns None. Replacing numbers.sort() on line 3 by sorted(numbers) solves the problem - i.e. there was nothing wrong with line 4 after all.
- Suppose you ran a cell with max([1,3,2]) in it and you got an error message TypeError: 'int' object is not callable. What went wrong? The message means that you're using something as a function even though it's an integer. In an earlier cell you have something like max=9, so that the word max no longer refers to the standard max function (n.b. be careful when choosing variable and function names so that they don't clash with existing ones). Typing del max will cure this particular bug, removing the max variable so that the max function becomes visible again.
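Both bugs can be reproduced, and cured, in a few lines. This sketch is meant to be run at the top level of a script or in a notebook cell, not inside a function. The variable name result_of_sort is made up for the illustration.

```python
numbers = [1, 3, 2]
result_of_sort = numbers.sort()     # sorts numbers in place and returns None
print(result_of_sort)               # None
print(numbers)                      # [1, 2, 3]

numbers = [1, 3, 2]
sortednumbers = sorted(numbers)     # returns a new, sorted list
print(sortednumbers[1])             # 2
print(numbers)                      # [1, 3, 2]

max = 9                             # hides the standard max function
try:
    max([1, 3, 2])
except TypeError as err:
    print("TypeError:", err)
del max                             # remove our variable ...
print(max([1, 3, 2]))               # ... so that this prints 3
```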
What is self?
Let's suppose you wanted to create a new type of thing in Python - not a number or a string but a person. We'll give each person a name. Here's some code to create the class (the new type of object), then create a variable of that type.
    class person:
        def __init__(self,thename):
            self.name=thename

    p1=person("ali")
Unlike the functions you've written so far, you have no choice about the name of __init__. It's called automatically when an object is created. Note the use of self. It refers to the object that's being used - in this case p1. So after this code is run, p1.name exists.
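Creating a second object may make the role of self clearer. The second name ("bo") is just an example.

```python
class person:
    def __init__(self, thename):
        # self is whichever object is being created at the time
        self.name = thename

p1 = person("ali")   # while this runs, self refers to the new p1
p2 = person("bo")    # while this runs, self refers to the new p2

print(p1.name)   # ali
print(p2.name)   # bo
```

Each object ends up with its own name, even though the code in __init__ mentions only self.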
Suppose now that we wanted to write a function to print a particular person's name. There are 2 ways we can go about this
- Write a function that is given the person as an input parameter -

      def printname(theperson):
          print(theperson.name)

      printname(p1)
- Write a method - a function that belongs to the class. Note that the method goes inside the class definition and has in the definition an input parameter self, which refers to the person whose printname method is being called. So in the example below the class has been modified. When p1.printname() runs, self will refer to p1

      class person:
          def __init__(self,thename):
              self.name=thename
          def printname(self):
              print(self.name)

      p1=person("ali")
      p1.printname()
Which is preferable? It's not always clear. In the Lent exercise you're told when to write a method and when to write a function.
Another method with a fixed name is __lt__. Whenever 2 things are compared using <, Python calls the appropriate __lt__ method (for strings it would be the one in the string class). Python also uses the __lt__ method when the sort function is used. When it sorts strings, it uses the one in the string class. In Exercise 12 you create a new StudentEntry class. If you want to use Python's sort function with it, you'll need to supply the class with its own __lt__ method. Note that if s and t are StudentEntrys, then s<t calls StudentEntry.__lt__(s,t) (the same as s.__lt__(t)), so __lt__ should return True when s (aka self) is less than t (aka other).
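Here's a minimal sketch of the idea. The name and mark fields, and the decision to order students by mark, are assumptions made for this illustration. The StudentEntry class in Exercise 12 may store and compare different things.

```python
class StudentEntry:
    def __init__(self, name, mark):
        self.name = name
        self.mark = mark

    def __lt__(self, other):
        # True when this entry (self) should come before the other one
        return self.mark < other.mark

entries = [StudentEntry("ali", 62), StudentEntry("bo", 48), StudentEntry("cy", 75)]

print(entries[1] < entries[0])   # True, because 48 < 62

entries.sort()                   # sort uses __lt__ to compare entries
names = []
for entry in entries:
    names.append(entry.name)
print(names)                     # ['bo', 'ali', 'cy']
```

Without the __lt__ method, entries.sort() would fail with a TypeError because Python wouldn't know how to compare two StudentEntrys.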
Comprehensions
Friends might tell you a clever solution to a problem, a solution you don't understand. Such solutions often involve comprehensions. They're not necessary, but they shorten programs. Many tasks involve going through a list and picking out certain items of interest, creating a list of those items.
Here's a simplified example, putting odd numbers into a list called answers

    numbers=range(10)
    answers=[]
    for i in numbers:
        if i%2 == 1:
            answers.append(i)
Comprehensions were designed to deal with this kind of common task. You don't have to use them, but they'll make your code shorter and often a little faster. The following code does the same as the fragment above.
    numbers=range(10)
    answers=[ i for i in numbers if i%2 == 1 ]
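A comprehension can also change the items as it picks them out. As a further sketch, here are the squares of the odd numbers, written both ways so that the two can be compared. The variable names are made up for this example.

```python
numbers = range(10)

# The long way, using a loop
squares_loop = []
for i in numbers:
    if i % 2 == 1:
        squares_loop.append(i * i)

# The same thing as a comprehension
squares_comp = [i * i for i in numbers if i % 2 == 1]

print(squares_comp)                    # [1, 9, 25, 49, 81]
print(squares_loop == squares_comp)    # True
```

Reading a comprehension from the middle outwards often helps: first the "for i in numbers", then the "if" that filters, then the expression at the front that says what goes into the list.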
Learning more
See the Advanced Python page.