Python
Python is a great ‘glue’ language, fast for prototyping and great for stitching together
libraries for machine learning and data science. You're Cambridge computer scientists, so you know how to code — and Python is
so intuitive that you can just about pick it all up by looking at example code.
If you're new to Python, start from the beginning and read up to List comprehensions.
If you already know some Python, start with iterating over two lists together
to make sure you know Python's shortcuts. Then read to the end,
to see how Python compares to languages from other IA courses, OCaml and to Java.
Syntax
Basics
As you go through this tutorial, some pages have mini exercises. Click ‘Show me’ if you get stuck
following the instructions.
Try evaluating this code, and look at the output.
Each code block returns the value of the final expression,
in this case line 6. If we want to see other
values, we can use print
statements.
x1 = "hello "
y = 2
print(x1 * y)
x2 = 1.5
x2 * y
Multiple returns and assignments
Another way to see multiple outputs is with tuples (pairs, triples, etc.).
In Python it's easy to create tuples ‘on the fly’.
Try adding a line at the end, to return a tuple of two values.
(x1*y, x2*y)
We can also assign several variables in one go, using tuples. Add the line
(v1, v2) = (x1*y, x2*y)
and then return v1 + str(v2)
x1 = "hello "
x2 = 1.5
y = 2
x1 = "hello "
x2 = 1.5
y = 2
(v1, v2) = (x1*y, x2*y)
v1 + str(v2)
Indentation and control flow
Python has all the usual control flow statements, if
, else
, while
,
continue
, break
.
We'll come to for
loops later, in the section on iterating over lists.
Python uses indentation to mark out blocks of statements,
rather than the delimiters used by most other languages.
Modify the code snippet here so it prints appropriate messages for the three cases, x > y
,
x < y
, and x == y
.
Note: while you could do this with a nested if
block, there's another structure that's easier to read:
if cond1:
...
elif cond2:
...
else:
....
Control structures like if
must be followed by an indented block.
If we're forced to include a block but our code doesn't need any statements to be executed,
use the no-op statement, pass
.
(x,y) = (7,15)
if x > y:
print("x larger")
else:
print("x smaller")
(x,y) = (7,15)
if x > y:
print("x larger")
elif x < y:
print("x smaller")
else:
print("they're equal")
Expressions
Reading error messages
One of the most important skills in any language is being able to read error messages!
Run the code here, and have a look at the error message.
- Start with the bottom line, which tells us it's a
TypeError
to add
a string to an integer.
- Then work up. The second-bottom line says the error occured on
line 2
of our code.
- It gives a full stack trace, going all the way back into library code, but we can ignore those parts
of the trace.
x = 'hello'
y = x + 5
y
Logic
Python has the usual logical operators, though the syntax is a bit wordier than other languages.
Python's truth values are True
and False
(note the capital letter).
Use ==
to test for equality (versus =
for assignment).
If / then expression
It's handy to have concise notation to return one value if a condition is met, a second value otherwise.
Try these lines:
res = "x is " + ("lower" if x < y else "higher")
print(res)
(This is called a ‘ternary operator’. In Java it's (x < y) ? "lower" : "higher"
)
Precedence of and, or, not
What precedence do Python logical operators have? Try this code and find out:
5 > 4 and not 0 > 10
5 > (4 and not 0) > 10
(5 > 4) and (not (0 > 10))
(x,y) = (5,12)
print(y - x == 1) # False
print(x * y == 60) # True
(x,y) = (5,12)
print(y - x == 1) # False
print(x * y == 60) # True
res = "x is " + ("lower" if x < y else "higher")
print(res)
5 > 4 and not 0 > 10 # True
(5 > 4) and (not (0 > 10)) # True (same expression as above)
5 > (4 and not 0) > 10 # False (involves typecasting between int and bool)
None: the null value
Python has a special value None
. It's a reserved keyword,
and it denotes ‘nothing’. It's a bit like Java's null.
The Python convention is that if a function doesn't have any value to return, it returns None
.
When we run a chunk of code, if the final value is None
then the notebook won't
display any output.
How would you display the value of x
?
To test if a value is None
, write “if x is None
” rather than “if x == None
”.
The difference between is
and ==
will be discussed later,
when we come to equality and identity of objects.
x = None
x
x = None
print(x)
Maths
All the usual maths operators work — though watch out for division which uses a different syntax to Java.
Try out these expressions and see what they give.
7 / 3
7 // 3
7 % 3
3**2
min(3,4)
abs(-10)
round(1.618)
round(1.618,2)
print("/ floating point division", 7 / 3)
print("// integer division", 7 // 3)
print("% modulus", 7 % 3)
print("** is power", 3**2)
print("min-imum", min(3,4))
print("abs-olute value", abs(-10))
print("round to integer", round(1.618))
Importing modules
In Python we organize functions etc. into modules. There are built-in modules for maths,
random numbers, and many other functions.
Add these lines at the top, to make these modules available.
import math
import random
(It's common to put import statements at the top of a notebook or script,
as they only need to be run once per session, but they can actually appear anywhere.
In this tutorial, each page is a new session.)
math.floor(-3.4)
math.exp(2)
math.log(101, 10)
math.sin(math.pi)
random.random()
import math
import random
math.floor(-3.4)
math.exp(2)
math.log(101, 10)
math.sin(math.pi)
random.random()
Strings
Python strings can be enclosed by either single quotes or double quotes or triple-quotes.
This is handy if we want a string containing quote marks.
Strings, like everything else in Python, are objects, and they have methods for various string-processing tasks.
Make the “shout” string upper-case, and replace substrings in “hitchhiker”, with these methods.
(Here, ¤ stands for an arbitrary string.)
¤.upper()
¤.replace('hi', 'ma')
Here's a full list of methods: String Methods documentation
Remember, if you want to see multiple outputs, you need to either call print
, or return them as a tuple.
x = '''
Triple-quoted strings are allowed
to span multiple lines
'''
"shout"
'hitchhiker'
x = '''
Triple-quoted strings are allowed
to span multiple lines
'''
"shout".upper()
'hitchhiker'.replace('hi', 'ma')
Splicing values into strings
A handy way to splice values into strings is with f-strings, i.e. strings with f before the opening quote.
Each
chunk of the string encosed in { }
is evaluated, and the result is spliced back into the string.
For example,
x = 'world'
f"hello {x}"
Try it yourself:
Change the first string to splice in name
and age+1
The chunk can also specify the output format:
In the second string, replace 3.14
by {math.pi:.3}
The documentation
describes more format specifiers.
(name, age) = ('Zaphod', 27)
s1 = "My name is Zaphod and I will be 28 next year"
print(s1)
import math
s2 = "The value of π to 3 significant figures is 3.14"
print(s2)
(name, age) = ('Zaphod', 27)
s1 = f"My name is {name} and I will be {age+1} next year"
print(s1)
import math
s2 = f"The value of π to 3 significant figures is {math.pi:.3}"
print(s2)
String processing
The code here shows some basic functions for string processing.
Python's syntax for joining together a list of strings, line 4,
is unusual. Most other languages think that ‘join’ should be a method of the array object,
but Python thinks it should be a method of the string object.
(We'll come to lists and arrays later in this tutorial.)
Can you get rid of the line breaks in x
, leaving it as words separated by single spaces?
Regular expressions
If you do any serious data processing in Python, you will likely find yourself needing
regular expressions.
Here's an example. What do you think it produces?
import re
s = 'In 2024 there will be an election'
re.search(r'(\\d+)', s)[0]
re.sub(r'a(n?) (\\w+)ion', 'a calamity', s)
s1 = "To remove blank space at the end ".strip()
print("_" + s1 + "_")
s2 = '---'.join(["join", "several", "strings"])
print(s2)
a1 = s1.split() # split at blank space
print(a1)
x = '''
Will no one rid me
of these meddlesome
line breaks?
'''
s1 = "To remove blank space at the end ".strip()
print("_" + s1 + "_")
s2 = '---'.join(["join", "several", "strings"])
print(s2)
a1 = s1.split() # split at blank space
print(a1)
x = '''
Will no one rid me
of these meddlesome
line breaks?
'''
' '.join(x.split())
Collections
Collection types
Python has four common types for storing collections of values:
- lists, which also behave like arrays
- tuples, which also behave like arrays
- dictionaries (called HashMaps in Java)
- sets
They can store heterogeneous collections of objects, e.g. integers and strings.
In IA courses on OCaml and Java we learnt about lists versus arrays.
In those courses, and in IA Algorithms, we study the efficiency of various implementation choices.
In Python, you shouldn’t think about these things, at least not in the first instance.
The Pythonic style is to just go ahead and code, and only worry about efficiency after we have working code.
As the famous computer scientist Donald Knuth said,
Programmers waste enormous amounts of time thinking about, or worrying about,
the speed of noncritical parts of their programs, and these attempts at efficiency
actually have a strong negative impact when debugging and maintenance are considered.
We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
Only when we have special requirements should we switch to a dedicated collection type, such as a
deque or a
heap
or the specialized numerical types we’ll learn about in section 2.
my_list = [1, 2, 'buckle my shoe']
my_tuple = (3, 4, 'knock at the door')
my_dict = {'Adrian': None, 'Laura': 32, 'Guarav': 19}
my_set = {'Adrian', 'Laura', 'Guarav'}
Lists and tuples
Python lists and Python tuples are both used to store sequences of elements.
They both support iterating over the elements, concatenation, random access, and so on.
They're a bit like lists, and a bit like arrays.
len(ℓ) # length
x in ℓ # does ℓ contain item x?
ℓ.index(x) # find position of item x
ℓ[n] # nth item (n=0 is first)
ℓ[-n] # nth from end (n=-1 is last)
ℓ1+ℓ2 # concatenate
The difference is that lists are mutable, whereas tuples are immutable.
Add these two lines, and run the code. You'll get an error message. What do you think it means?
x[0] = 0
x.append(3)
Modify the code so those lines work.
x = (1, 2, 'buckle my shoe')
# length
print(f"length={len(x)}")
# concatenation
y = (3, 4, 'knock at the door')
print(x + y)
x = [1, 2, 'buckle my shoe']
# length
print(f"length={len(x)}")
# concatenation
y = (3, 4, 'knock at the door')
print(x + list(y))
# This creates a list out of the tuple y.
# Or, we could have made y be a list, y = [3, 4, 'knock at the door']
x[0] = 0
x.append(3)
x
Exercise
What is the difference between the two commented lines? Do they give the same result?"
a = [1, 2, 'buckle my shoe']
b = (a, 3, 4, 'knock at the door')
# b[0].append('then')
# b[0] = b[0] + ['then']
print("a:", a)
print("b:", b)
a = [1, 2, 'buckle my shoe']
b = (a, 3, 4, 'knock at the door')
# This line modifies the 'a' list
# b[0] is still the 'a' list, same as before -- but the 'a' list has changed
b[0].append('then')
# This line causes an error
# b is a tuple, so we can't modify it.
# b[0] = b[0] + ['then']
print("a:", a)
print("b:", b)
Operations on lists
Since lists are mutable, we have a choice between modifying them in-place
versus returning a new list without changing the original.
The code on the right illustrates the difference between the in-place method
ℓ.sort()
and the pure function sorted(ℓ)
.
The operation names.sort()
modifies the list in-place.
What value does it return?
Is the output window not showing what you expect?
Look back at the notes about None.
names = ['Bethe', 'Alpher', 'Gamov']
# return a new list
names2 = sorted(names)
names2 = names2 + ['Dalton','Edison']
print(names, "→", names2)
# modify the list in-place
names.sort()
names.extend(['Dalton','Edison'])
print(names)
names = ['Bethe', 'Alpher', 'Gamov']
# In-place operations return None.
# This is so we don't confuse ourselves by using an in-place operation
# when we actually meant to return a new value
res = names.sort()
print(res)
Iterating over lists
To iterate over items in a list,
for x in my_list:
... # do something with x
To just do something n times, we can create the list [0,1,…,n-1] using range
, and iterate
over it.
If we don't care about the loop variable, it's conventional to call it _
.
for _ in range(n):
... # do something
Technically, range(n)
produces a lazy list-like object, which
only uses O(1) storage, rather than O(n). To create the actual list, list(range(n))
.
To iterate over items in a list together with their indexes,
for (i, x) in enumerate(my_list):
... # do something with i, x
Consider the sequence 1,3,6,10,..., where the nth item xn is given by
xn = xn-1+n. Can you compute
x1+x2+⋯+xn using iteration?
for i, name in enumerate(['Bethe', 'Alpher', 'Gamov']):
print(f"Person {name} is in position {i}")
n = 5
# Let x be the list as described. Compute x1 + x2 + ... + xn.
for i, name in enumerate(['Bethe', 'Alpher', 'Gamov']):
print(f"Person {name} is in position {i}")
n = 5
(x, s) = (1, 0)
for i in range(n):
(x, s) = (x+i+2, s+x)
s
Iterating over two lists together
To iterate over two lists simultaneously, zip
them.
for x1,x2 in zip(list1,list2):
... # do something with x1,x2
(We can in fact zip together as many lists as we like — unlike zips on clothes!)
Can you modify the code to produce this printout?
Course 1: apple with cheddar
Course 2: orange with wensleydale
Course 3: grape with brie
fruits = ['apple', 'orange', 'grape']
cheeses = ['cheddar', 'wensleydale', 'brie']
for fruit,cheese in zip(fruits, cheeses):
print(f"{fruit} goes with {cheese}")
fruits = ['apple', 'orange', 'grape']
cheeses = ['cheddar', 'wensleydale', 'brie']
for i,(fruit,cheese) in enumerate(zip(fruits, cheeses)):
print(f"Course {i+1}: {fruit} with {cheese}")
# Or, less elegantly:
# for i,fruit,cheese in zip(range(len(fruits)), fruits, cheeses)
Slice notation
We can pick out subsequences of a list or tuple using the slice notation,
x[start:end]
.
x[:n] # first n items
x[n:] # all except first n
x[-n:] # last n items
x[:-n] # all except last n
x[m:n] # x[m], ..., x[n-1]
We can also assign into slices:
x[slice
] = list_of_replacement_items
There's also the step notation x[start:end:step]
:
x[::d] # x[0], x[d], ...
x[::-d] # x[-1], x[-1-d], ...
x[m::d] # x[m], x[m+d], ...
Strings behave like lists of characters, and the slice notation works on them also.
Given the string 'once upon a time', can you use slice notation to (a) reverse the characters, (b) reverse the words?
x = 'once upon a time'
# reverse characters
# reverse words
x = 'once upon a time'
# reverse characters
x[::-1]
# reverse words
' '.join(x.split()[::-1])
Dictionaries
The other very useful data type is the dictionary (what Java calls a Map or HashMap).
d = {} # create an empty dictionary
d[k] # get value for key k
d.get(k, v0) # default to v0 if key missing
k in d
d[k] = v # set or update value for key k
del d[k] # delete entry for key k
for k,v in d.items():
... # do something with key k, value v
Given a dictionary keyed by person, saying which room they're in,
can you create a new dictionary keyed by room, containing a list of its occupants?
room = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,
'alexis': 11, 'rebecca': 10, 'zubin': 5}
occupants = {}
# ???
for room, occupants_here in occupants.items():
ns = ', '.join(occupants_here)
print(f'Room {room} has {ns}')
room = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,
'alexis': 11, 'rebecca': 10, 'zubin': 5}
occupants = {}
for name, room in room.items():
if room not in occupants:
occupants[room] = []
occupants[room].append(name)
for room, occupants_here in occupants.items():
ns = ', '.join(occupants_here)
print(f'Room {room} has {ns}')
List comprehensions
There's one unusual piece of syntax that we see all over the place in Python code, called comprehension.
It's a succinct way to write code for creating new lists by transforming other lists or dictionaries.
In this code, list
represents an arbitrary list,
dict
represents an arbitrary dictionary,
and expr
represents any expression at all.
[expr
for x in list
]
[expr
for k,v in dict
.items()]
We can also specify a filter condition, to only keep some of the items. Here test_expr
represents a boolean expression.
[expr
for x in list
if test_expr
]
Can you select the squares of the even numbers in my_list
?
Hint: look at the maths operators.
my_list = range(10)
# Squares of the items in the list
[x**2 for x in my_list]
# Select the squares of even numbers in my_list
# Should get answer [0,4,16,36,64]
my_list = range(10)
# Squares of the items in the list
[x**2 for x in my_list]
# Select the squares of even numbers in my_list
[x**2 for x in my_list if x % 2 == 0]
Dictionary comprehensions
We can also create a new dictionary using comprehensions.
{key_expr
: val_expr
for x in list
}
{key_expr
: val_expr
for x in dict
.items()}
Here, key_expr
and val_expr
represent arbitrary expressions.
We can also filter with a cond_expr
.
What do you think happens if we specify the same key twice? Evaluate this expression, and explain what you get.
{x**2: x for x in range(-5,5)
my_list = range(10)
# A dict to map an item to its square
{x: x**2 for x in my_list}
my_list = range(10)
# A dict to map an item to its square
{x: x**2 for x in my_list}
# For x in [-5,-4,-3,...,3,4],
# set the key to be x**2 and the val to be x
# and when we set the same key twice, the later value overrides the earlier
{x**2: x for x in range(-5,5)}
Exercise
Give a one-line expression to sort names
by length, breaking ties alphabetically.
Hint:
- Make a list of
(len(name),name)
using list comprehension
- Sort your list — when Python sorts a list of tuples, it uses lexicographic ordering
- Use another list comprehension to pick out the second element of each pair in your sorted list
names = ['adrian', 'chloe', 'guarav', 'shay', 'alexis', 'rebecca', 'zubin']
# sorted by length, breaking ties alphabetically
names = ['adrian', 'chloe', 'guarav', 'shay', 'alexis', 'rebecca', 'zubin']
# sorted by length, breaking ties alphabetically
[v for _,v in sorted([(len(n),n) for n in names])]
Exercise
Give a one-line expression that counts the number of distinct elements in a list,
using only list and dictionary comprehensions.
Hint:
- Create a dictionary whose keys are the elements of
x
and whose values are anything, e.g. True
- Find the length of this dictionary using
len
In practice, a better way to express our intent is by creating a set
from
x
,
len(set(x))
import random
random.seed(1618)
x = [random.randint(0,10) for _ in range(20)]
# number of distinct elements in x
import random
random.seed(1618)
x = [random.randint(0,10) for _ in range(20)]
# number of distinct elements in x
len({y:True for y in x})
Double comprehensions
To iterate over nested lists, we can use a double comprehension
[¤ for ℓ in my_list for item in ℓ]
which is shortand for
for ℓ in my_list:
for item in ℓ:
# produce ¤
Similar syntax works for dictionaries.
Flatten the dictionary d into a single list of all the tasks to be done.
my_list = [[0,1,2], [4,5]]
flattened = [x for s in my_list for x in s]
print(flattened)
d = {'clean':['kitchen','bathroom'], 'wash':['clothes','dishes','face']}
# Flatten this into a single list
# ['clean 1: kitchen', 'clean 2: bathroom', 'wash 1: clothes', ...]
my_list = [[0,1,2], [4,5]]
flattened = [x for s in my_list for x in s]
print(flattened)
d = {'clean':['kitchen','bathroom'], 'wash':['clothes','dishes','face']}
# Flatten this into a single list
# ['clean 1: kitchen', 'clean 2: bathroom', 'wash 1: clothes', ...]
[f"{act} {i+1}: {v}" for act,things in d.items() for i,v in enumerate(things)]
Unpacking with * and **
The unpacking syntax is advanced Python.
- If ℓ is a list, then
*ℓ
is the sequence of items in ℓ
- If δ is a dictionary, then
**δ
is the sequence of keys and values in δ
This sounds crazy. What is a sequence of items, if not a list?
Don't think of the sequence in terms of what it is, think of what it can be used for.
-
The arguments to a function are a sequence. So, if we want to call
f(ℓ[0], ℓ[1], ...)
,
we can do it with f(*ℓ)
- When we create a list, the things inside
[ ]
are a sequence.
This gives a simple way to concatenate and append to lists.
Similarly for dictionaries.
How would you create a new dictionary that combines d1 and d2?
x = [2,3,True]
print(x)
print(*x)
print([*x, "yes", range(2)])
# create a new dictionary that combines these two
d1 = {'a':3, 'b':10}
d2 = {'b':5, 'c':12}
x = [2,3,True]
print(x)
print(*x)
print([*x, "yes", range(2)])
# create a new dictionary that combines these two
d1 = {'a':3, 'b':10}
d2 = {'b':5, 'c':12}
{**d1, **d2}
Programming
Assertions
One of the most useful statements in scientific computing is assert
.
assert testexpr
, errmessage
When we're exploring an idea there may be all sorts of corner cases that we want to leave to one side —
our big idea might not even work in the first place, so there's no point implementing all the corner cases
until we've decided the idea works.
But on the other hand, in case it does work, we don't want to leave ‘logic mines’ in our code
that will explode under our feet when we're looking elsewhere.
An AssertionError
is a message to our
future self saying “you'll need to think this through” and it's much more helpful than silently
returning a wrong answer.
import math
def myidea(x):
z = math.sin(2*x) + 0.5
assert z >= 0.2, "TODO: the maths behind this eqn is wrong for z < 0.2"
return math.exp(z - 0.2)
myidea(1.1) # works
myidea(2.3) # fails with AssertionError
Defining functions
This is what functions look like in Python.
This particular function solves the quadratic equation a + b x + c x2=0
for x, and returns a list with 0, 1, or 2 solutions.
Try calling the function in these three ways. What do you see?
roots(2,3,0)
roots(2,3)
roots(b=3, a=2)
Python has a nice system for named arguments and default arguments.
We can provide the arguments in order. Or we can provide them in any order, as long as we give their names.
And we can skip arguments with default values, in this case c=0
.
This function returns a single value, namely a list (or it throws an exception). In general,
- If we want to return several variables, return them in a tuple,
and unpack the tuple using multiple assignment.
- If our function finishes without an explicit return statement, it will return
None
.
- It's possible for different branches of a function to return values of different types —
at risk to our sanity.
import math
def roots(a, b, c=0):
"""Return a list with the real roots of a + b*x + c*(x**2) == 0"""
if b == 0 and c == 0:
raise Exception("This polynomial is constant")
if c == 0:
return [-a/b]
elif a == 0:
return [0] + roots(b=c, a=b)
else:
discr = b**2 - 4*c*a
if discr < 0:
return []
else:
return [(-b+s*math.sqrt(discr))/2/c for s in [-1,1]]
Returning nothing
It's often handy for functions to be able to return either a value, or a marker that there is no value.
For example, head(ℓ)
should return the first item in list ℓ unless the list is empty in which
case there's nothing to return.
In Python, the convention is to return None
when you have nothing to return, and a value otherwise.
The Python style is to assume we're all adults.
Trust that the person who called you won't just blindly assume there is
a value.
This is in contrast to other languages like OCaml, in which we'd define head
to return an enumerated
‘maybe’ datatype
None | Some['a]
. This forces everyone who uses the function to check whether or not the answer is None
.
Rewrite this function using the more idiomatic ternary expression a if cond else b
.
When we run this code, it displays (No output)
. Most Python development environments,
when they evaluate a block and get the result None
, will suppress the output.
def head(a_list):
if len(a_list) > 0:
return a_list[0]
else:
return None
head([])
def head(a_list):
return a_list[0] if len(a_list) > 0 else None
Dynamic typing
Python uses dynamic typing, which means that values are tagged with their types during execution
and checked only then.
To illustrate, consider the functions goodfunc
and badfunc
.
We won't be told of errors until badfunc()
is invoked, even though it's clear when we
define it that it will fail.
def double_items(xs):
return [x*2 for x in xs]
def goodfunc():
return double_items([1,2,[3,4]]) + double_items("hello world")
def badfunc():
return double_items(10)
badfunc()
Duck typing
Python programmers are encouraged to use duck typing, which means that you should test values for
what they can do rather than what they’re tagged as.
“If it walks like a duck, and it quacks like a duck, then it’s a duck”.
In this example, double_items(xs)
iterates through xs
and applies *2
to every element,
so it should apply to any xs that supports iteration and whose elements all support *2
.
These operations mean different things to different types: iterating over a list returns its elements,
while iterating over a string returns its characters; doubling a number is an arithmetical operation,
doubling a string or list repeats it.
Why does double_items("hello")
not cause an error,
whereas double_items(10)
does?
Python does allow you to test the type of a value
but
programmers are encouraged not to do this.
Python’s philosophy is that library designers are providing a service, and programmers are adults.
If a library function uses comparison and addition, and if the end-user programmer invents a
new class that supports comparison and addition, then why on earth shouldn’t the programmer be allowed to
use the library function?
I’ve found this useful for simulators: I replaced ‘numerical timestamp’ with
‘rich timestamp class that supports auditing, listing which events depended on which other events’,
and I didn’t have to change a single line of the simulator body.)
Some statically typed languages like Haskell and Scala support this via dynamic type classes,
but their syntax is rather heavy.
def double_items(xs):
return [x*2 for x in xs]
double_items([5,3,1,8])
double_items(['yes','of course'])
double_items('hello')
def double_items(xs):
return [x*2 for x in xs]
double_items([5,3,1,8])
double_items(['yes','of course'])
# note: strings behave like lists of characters
double_items('hello')
Objects and types
Python is an object-oriented programming language. Every value is an object, and it has a class.
Python supports inheritance and multiple inheritance, and static methods, and class variables, and so on.
# get the class of x
type(x)
# is x an instance of class C,
# or does x's class inherit from C?
isinstance(x, C)
Here's example code, a Tree class. Each Tree object has a list of its children,
and this may include other Trees.
- To create an object of class
C
, use myobj = C(...)
- Every method takes a first argument
self
referring to the current object, this
in Java.
We don't supply self
when we call methods, Python fills it in for us.
- The constructor is called
__init__
, and it too takes a first argument self
- Use
self
inside the class definition to call other methods or to set
member variables
- To define a class that inherits from another class C, start with
class NewClass(C):
Implement a function flatten(t)
which returns a list consisting of
all the leaves of tree t.
class Tree:
def __init__(self, children):
self.children = children
t1 = Tree([3,2])
print(t1.children)
t2 = Tree([10,t1,"hello"])
print(t2.children)
# Implement a function flatten(t)
# flatten(t2) should return [10,3,2,"hello"]
class Tree:
def __init__(self, children):
self.children = children
t1 = Tree([3,2])
print(t1.children)
t2 = Tree([10,t1,"hello"])
print(t2.children)
# Implement a function flatten(t)
# flatten(t2) should return [10,3,2,"hello"]
def flatten(x):
if isinstance(x, Tree):
return [y for child in x.children for y in flatten(child)]
else:
return [x]
flatten(t2)
Object attributes
Set member attributes of an object using
obj.attr = val
This can be done in any method of the class. They don't need to be declared in the class
definition or in the constructor.
Member attributes can also be set outside the class,
what's called ‘monkey patching’.
Like so many language features in Python, this is sometimes tremendously handy,
and sometimes the source of infuriating bugs.
class C:
def __init__(self, x):
self.zing = x
myobj = C(10)
myobj.zong = 11
print(myobj.zing, myobj.zong)
Interfaces
Python doesn't support interfaces, because they don't make sense in a
duck typing language.
It's more Pythonic to just charge ahead and assume the caller gave you
objects with the right methods.
Here's an example, flattening a tree. We've seen how to achieve this
with a Tree class. But we don't actually need a class at all!
The code shown here will work on any objects the caller gives us.
It uses duck typing.
The flatten
function assumes that branch nodes are iterable objects e.g. lists or tuples,
and it simply charges ahead and tries to iterate.
If x isn't iterable then the for
will raise a TypeError
exception,
telling us that x must be a leaf.
Dunder methods
Python lets us define custom classes that hook into the usual Python syntax.
For example, if we define a new class with the method
__iter__
then objects of our class can be iterated over with for
syntax, just like a list.
There's a long list of
special method names
that we can use. They're sometimes called ‘dunder methods’, for ‘double underline’.
def flatten(x):
try:
return [y for child in x for y in flatten(child)]
except TypeError as e:
return [x]
x = [1,[[2,4,3],9],[5,[6,7],8]]
flatten(x)
Equal value (==) or identity (is)
When we ask if two objects are equal, there are two things we might mean:
x is y # x is the same object as y
x == y # x and y have the same value
Objects can customize the “same value” test.
For lists, the test is: do the two lists have have the same length, and are all their values equal?
To customize this test for your own classes, use the dunder method
__eq__
.
x = [1, 5, 3]
y = [1, 5, 3]
print(x is y, x == y)
Functional programming
In Python as in OCaml, functions can be returned as results, assigned, put into lists, passed as arguments,
and so on.
In this example, noisifier
is a function that returns another function.
The inner function ‘remembers’ the value of σ under which it was defined; this is known as a closure.
import random
def noisifier(σ):
def add_noise(x):
return x + random.uniform(-σ, σ)
return add_noise
σlist = [0.1, 1, 10]
fs = [noisifier(σ) for σ in σlist]
for f,σ in zip(fs, σlist):
print(f"1.5 plus noise σ={σ} gives {f(1.5):.3}, {f(1.5):.3}, {f(1.5):.3}")
Anonymous (lambda) functions
We can use lambda
to define anonymous functions, i.e. functions without names.
def illustrate_func(f, xs):
for x in xs:
print(f"f({x}) = {f(x)}")
illustrate_func(lambda x: x+1, xs = range(5))
print()
illustrate_func(lambda x: x*2, xs = range(5))
Generators and lazy lists
A generator (or lazy list) is a sequence where the elements are only
computed on demand. We can create them by simply
defining a function that has the yield
statement somewhere inside.
def f():
....
yield ¤
g = f()
next(g), next(g)
When we call next(g)
, it runs through f
until it reaches the next
yield statement, then it emits a value and pauses.
Think of g
as an execution pointer plus call stack:
it remembers where it is inside the f
function, and calling next
tells
it to resume executing until the next time it hits yield.
The range
function,
which we've seen used for iteration, is actually a generator.
Generators also let us implement infinite sequences.
Generator comprehensions
We can write comprehensions for generators, using the same sort of syntax
as for list and dictionary comprehensions. Use ( )
for generator comprehensions,
as opposed to
[ ]
for lists and { }
for dictionaries.
(¤ for x in g if ¤)
Get the first 10 even Fibonacci numbers, using the filtered generator
even_fibs = (x for x in fib() if x % 2 == 0)
def fib():
x,y = 1,1
while True:
yield x
x,y = (y, x+y)
fibs = fib()
[next(fibs) for _ in range(10)]
def fib():
x,y = 1,1
while True:
yield x
x,y = (y, x+y)
fibs = fib()
[next(fibs) for _ in range(10)]
even_fibs = (x for x in fib() if x % 2 == 0)
[next(even_fibs) for _ in range(10)]
Next steps
Notebooks
Well done!
You've finished this tutorial about Python.
The next step is to move on to longer programs!
Jupyter notebooks are great for building up longer programs interactively, trying out
ideas and visualizing the results. This makes them a good fit for scientific computing — for machine
learning a data science. Though they do make it easy to fall into spaghetti coding …
Have a look at Exercise 0.
You'll pick up some tips for organizing your code in Jupyter notebooks. And you'll also
be introduced to the Autograder, the system for online ticking used in this Scientific Computing course.
x = "Well done! "
for i in range(len(x)+1):
print(x[i:] + x[:i])