Python is a great ‘glue’ language, fast for prototyping and great for stitching together libraries for machine learning and data science. You're Cambridge computer scientists, so you know how to code — and Python is so intuitive that you can just about pick it all up by looking at example code.

If you're new to Python, start from the beginning and read up to List comprehensions.

If you already know some Python, start with iterating over two lists together to make sure you know Python's shortcuts. Then read to the end, to see how Python compares to the languages from other IA courses, OCaml and Java.

Syntax

Basics

As you go through this tutorial, some pages have mini exercises. Click ‘Show me’ if you get stuck following the instructions.

Try evaluating this code, and look at the output.

Each code block returns the value of the final expression, in this case line 6. If we want to see other values, we can use print statements.

x1 = "hello "
y = 2
print(x1 * y)

x2 = 1.5
x2 * y

Multiple returns and assignments

Another way to see multiple outputs is with tuples (pairs, triples, etc.). In Python it's easy to create tuples ‘on the fly’.

Try adding a line at the end, to return a tuple of two values.

(x1*y, x2*y)

We can also assign several variables in one go, using tuples. Add the line

(v1, v2) = (x1*y, x2*y)

and then return v1 + str(v2)

x1 = "hello "
x2 = 1.5
y = 2

x1 = "hello "
x2 = 1.5
y = 2
(v1, v2) = (x1*y, x2*y)
v1 + str(v2)

Indentation and control flow

Python has all the usual control flow statements, if, else, while, continue, break.

We'll come to for loops later, in the section on iterating over lists.

Python uses indentation to mark out blocks of statements, rather than the delimiters used by most other languages.

Modify the code snippet here so it prints appropriate messages for the three cases, x > y, x < y, and x == y.

Note: while you could do this with a nested if block, there's another structure that's easier to read:

if cond1:
    ...
elif cond2:
    ...
else:
    ...

Control structures like if must be followed by an indented block. If we're forced to include a block but our code doesn't need any statements to be executed, use the no-op statement, pass.
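As a minimal sketch of pass (the variable names and the threshold are made up for illustration), here is an if block whose first branch deliberately does nothing:

```python
# Hypothetical example: we haven't decided what to do for large values,
# so that branch is only a placeholder.
x = 7
if x > 10:
    pass  # no-op: Python requires an indented block here
else:
    message = "x is small"
print(message)
```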

(x,y) = (7,15)

if x > y:
    print("x larger")
else:
    print("x smaller")

(x,y) = (7,15)

if x > y:
    print("x larger")
elif x < y:
    print("x smaller")
else:
    print("they're equal")

Expressions

Reading error messages

One of the most important skills in any language is being able to read error messages!

Run the code here, and have a look at the error message.

Start with the bottom line, which tells us it's a TypeError to add a string to an integer.
Then work up. The second-bottom line says the error occurred on line 2 of our code.
It gives a full stack trace, going all the way back into library code, but we can ignore those parts of the trace.

x = 'hello'
y = x + 5
y

Logic

Python has the usual logical operators, though the syntax is a bit wordier than other languages. Python's truth values are True and False (note the capital letter).

Use == to test for equality (versus = for assignment).
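Here is a short sketch of the operators and, or and not, written out as words rather than symbols. The values of x and y are arbitrary.

```python
x, y = 5, 12

both = (x < y) and (y < 20)      # True only if both conditions hold
either = (x > y) or (y == 12)    # True if at least one condition holds
negated = not (x == y)           # flips a truth value

print(both, either, negated)
```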

If / then expression

It's handy to have concise notation to return one value if a condition is met, a second value otherwise.

Try these lines:

res = "x is " + ("lower" if x < y else "higher")
print(res)

(This is called a ‘ternary operator’. In Java it's (x < y) ? "lower" : "higher")

Precedence of `and, or, not`

What precedence do Python logical operators have? Try this code and find out:

5 > 4 and not 0 > 10
5 > (4 and not 0) > 10
(5 > 4) and (not (0 > 10))

(x,y) = (5,12)
print(y - x == 1)     # False
print(x * y == 60)    # True

(x,y) = (5,12)
print(y - x == 1)     # False
print(x * y == 60)    # True

res = "x is " + ("lower" if x < y else "higher")
print(res)

5 > 4 and not 0 > 10        # True
(5 > 4) and (not (0 > 10))  # True (same expression as above)
5 > (4 and not 0) > 10      # False (involves typecasting between int and bool)

None: the null value

Python has a special value None. It's a reserved keyword, and it denotes ‘nothing’. It's a bit like Java's null. The Python convention is that if a function doesn't have any value to return, it returns None.

When we run a chunk of code, if the final value is None then the notebook won't display any output.

How would you display the value of x?

To test if a value is None, write “if x is None” rather than “if x == None”. The difference between is and == will be discussed later, when we come to equality and identity of objects.
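To illustrate both points, here is a small sketch with a made-up function log_message that has no return statement. Calling it gives None, which we then test for with is.

```python
def log_message(msg):
    # Hypothetical helper: it prints, but has no return statement
    print("LOG:", msg)

result = log_message("starting up")

if result is None:
    outcome = "the function returned None"
else:
    outcome = f"the function returned {result}"
print(outcome)
```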

x = None
x

x = None
print(x)

Maths

All the usual maths operators work — though watch out for division which uses a different syntax to Java.

Try out these expressions and see what they give.

7 / 3
7 // 3
7 % 3
3**2
min(3,4)
abs(-10)
round(1.618)
round(1.618,2)

print("/ floating point division", 7 / 3)
print("// integer division", 7 // 3)
print("% modulus", 7 % 3)
print("** is power", 3**2)
print("min-imum", min(3,4))
print("abs-olute value", abs(-10))
print("round to integer", round(1.618))

Importing modules

In Python we organize functions etc. into modules. There are built-in modules for maths, random numbers, and many other functions.

Add these lines at the top, to make these modules available.

import math
import random

(It's common to put import statements at the top of a notebook or script, as they only need to be run once per session, but they can actually appear anywhere. In this tutorial, each page is a new session.)

math.floor(-3.4)
math.exp(2)
math.log(101, 10)
math.sin(math.pi)

random.random()

import math
import random

math.floor(-3.4)
math.exp(2)
math.log(101, 10)
math.sin(math.pi)

random.random()

Strings

Python strings can be enclosed in single quotes, double quotes, or triple quotes. This is handy if we want a string containing quote marks.
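For instance (the example sentences are invented), choosing the other kind of quote as the delimiter avoids any need for escaping:

```python
s1 = "It's easy to include an apostrophe inside double quotes"
s2 = 'She said "hello" and left'
s3 = '''Triple quotes allow both ' and " in the same string'''
print(s1)
print(s2)
print(s3)
```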

Strings, like everything else in Python, are objects, and they have methods for various string-processing tasks.

Make the “shout” string upper-case, and replace substrings in “hitchhiker”, with these methods. (Here, ¤ stands for an arbitrary string.)

¤.upper()
¤.replace('hi', 'ma')

Here's a full list of methods: String Methods documentation

Remember, if you want to see multiple outputs, you need to either call print, or return them as a tuple.

x = '''
Triple-quoted strings are allowed
to span multiple lines
'''

"shout"
'hitchhiker'

x = '''
Triple-quoted strings are allowed
to span multiple lines
'''

"shout".upper()
'hitchhiker'.replace('hi', 'ma')

Splicing values into strings

A handy way to splice values into strings is with f-strings, i.e. strings with f before the opening quote. Each chunk of the string enclosed in { } is evaluated, and the result is spliced back into the string. For example,

x = 'world'
f"hello {x}"

Try it yourself:

Change the first string to splice in name and age+1

The chunk can also specify the output format:

In the second string, replace 3.14 by {math.pi:.3}

The documentation describes more format specifiers.

(name, age) = ('Zaphod', 27)
s1 = "My name is Zaphod and I will be 28 next year"
print(s1)

import math
s2 = "The value of π to 3 significant figures is 3.14"
print(s2)

(name, age) = ('Zaphod', 27)
s1 = f"My name is {name} and I will be {age+1} next year"
print(s1)

import math
s2 = f"The value of π to 3 significant figures is {math.pi:.3}"
print(s2)

String processing

The code here shows some basic functions for string processing.

Python's syntax for joining together a list of strings, line 4, is unusual. Most other languages think that ‘join’ should be a method of the array object, but Python thinks it should be a method of the string object. (We'll come to lists and arrays later in this tutorial.)

Can you get rid of the line breaks in x, leaving it as words separated by single spaces?

Regular expressions

If you do any serious data processing in Python, you will likely find yourself needing regular expressions. Here's an example. What do you think it produces?

import re
s = 'In 2024 there will be an election'
re.search(r'(\d+)', s)[0]
re.sub(r'a(n?) (\w+)ion', 'a calamity', s)

s1 = "To remove blank space at the end  ".strip()
print("_" + s1 + "_")

s2 = '---'.join(["join", "several", "strings"])
print(s2)

a1 = s1.split() # split at blank space
print(a1)

x = '''
Will no  one  rid me
of these  meddlesome
line breaks?
'''

s1 = "To remove blank space at the end  ".strip()
print("_" + s1 + "_")

s2 = '---'.join(["join", "several", "strings"])
print(s2)

a1 = s1.split() # split at blank space
print(a1)

x = '''
Will no  one  rid me
of these  meddlesome
line breaks?
'''

' '.join(x.split())

Collections

Collection types

Python has four common types for storing collections of values:

lists, which also behave like arrays
tuples, which also behave like arrays
dictionaries (called HashMaps in Java)
sets

They can store heterogeneous collections of objects, e.g. integers and strings.

In IA courses on OCaml and Java we learnt about lists versus arrays. In those courses, and in IA Algorithms, we study the efficiency of various implementation choices. In Python, you shouldn’t think about these things, at least not in the first instance. The Pythonic style is to just go ahead and code, and only worry about efficiency after we have working code. As the famous computer scientist Donald Knuth said,

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Only when we have special requirements should we switch to a dedicated collection type, such as a deque or a heap or the specialized numerical types we’ll learn about in section 2.
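As a rough sketch of what such dedicated types look like, the standard library provides collections.deque and the heapq module. The numbers below are arbitrary, and this is only a taste rather than a recommendation to reach for them early.

```python
from collections import deque
import heapq

# A deque supports fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)       # add at the front
queue.append(4)           # add at the back
first = queue.popleft()   # remove from the front
print(first, list(queue))

# heapq maintains an ordinary list as a min-heap
heap = [5, 1, 4]
heapq.heapify(heap)
heapq.heappush(heap, 2)
smallest = heapq.heappop(heap)
print(smallest)
```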

my_list = [1, 2, 'buckle my shoe']
my_tuple = (3, 4, 'knock at the door')
my_dict = {'Adrian': None, 'Laura': 32, 'Guarav': 19}
my_set = {'Adrian', 'Laura', 'Guarav'}

Lists and tuples

Python lists and Python tuples are both used to store sequences of elements. They both support iterating over the elements, concatenation, random access, and so on. They're a bit like lists, and a bit like arrays.

len(ℓ)     # length
x in ℓ     # does ℓ contain item x?
ℓ.index(x) # find position of item x
ℓ[n]       # nth item (n=0 is first)
ℓ[-n]      # nth from end (n=1 is last)
ℓ1+ℓ2      # concatenate

The difference is that lists are mutable, whereas tuples are immutable.

Add these two lines, and run the code. You'll get an error message. What do you think it means?

x[0] = 0
x.append(3)

Modify the code so those lines work.

x = (1, 2, 'buckle my shoe')

# length
print(f"length={len(x)}")

# concatenation
y = (3, 4, 'knock at the door')
print(x + y)

x = [1, 2, 'buckle my shoe']

# length
print(f"length={len(x)}")

# concatenation
y = (3, 4, 'knock at the door')
print(x + list(y))
# This creates a list out of the tuple y.
# Or, we could have made y be a list, y = [3, 4, 'knock at the door']

x[0] = 0
x.append(3)
x

Exercise

What is the difference between the two commented lines? Do they give the same result?

a = [1, 2, 'buckle my shoe']
b = (a, 3, 4, 'knock at the door')
# b[0].append('then')
# b[0] = b[0] + ['then']
print("a:", a)
print("b:", b)

a = [1, 2, 'buckle my shoe']
b = (a, 3, 4, 'knock at the door')

# This line modifies the 'a' list
# b[0] is still the 'a' list, same as before -- but the 'a' list has changed
b[0].append('then')

# This line causes an error
# b is a tuple, so we can't modify it.
# b[0] = b[0] + ['then']

print("a:", a)
print("b:", b)

Operations on lists

Since lists are mutable, we have a choice between modifying them in-place versus returning a new list without changing the original.

The code here illustrates the difference between the in-place method ℓ.sort() and the pure function sorted(ℓ).

The operation names.sort() modifies the list in-place. What value does it return?

Is the output window not showing what you expect? Look back at the notes about None.

names = ['Bethe', 'Alpher', 'Gamov']

# return a new list
names2 = sorted(names)
names2 = names2 + ['Dalton','Edison']
print(names, "→", names2)

# modify the list in-place
names.sort()
names.extend(['Dalton','Edison'])
print(names)

names = ['Bethe', 'Alpher', 'Gamov']

# In-place operations return None.
# This is so we don't confuse ourselves by using an in-place operation
# when we actually meant to return a new value

res = names.sort()
print(res)

Iterating over lists

To iterate over items in a list,

for x in my_list:
    ... # do something with x

To just do something n times, we can create the list [0,1,…,n-1] using range, and iterate over it. If we don't care about the loop variable, it's conventional to call it _.

for _ in range(n):
    ... # do something

Technically, range(n) produces a lazy list-like object, which uses only O(1) storage rather than O(n). To create the actual list, use list(range(n)).
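A quick sketch of this laziness (the size is arbitrary): a range object can report its length and be indexed without ever building the full list.

```python
r = range(1_000_000)

# The range object knows its length and supports indexing
# without storing a million integers
print(len(r), r[10], r[-1])

# Converting a small range to an actual list
small = list(range(5))
print(small)
```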

To iterate over items in a list together with their indexes,

for (i, x) in enumerate(my_list):
    ... # do something with i, x

Consider the sequence 1, 3, 6, 10, ..., where the nth item xₙ is given by xₙ = xₙ₋₁ + n. Can you compute x₁ + x₂ + ⋯ + xₙ using iteration?

for i, name in enumerate(['Bethe', 'Alpher', 'Gamov']):
    print(f"Person {name} is in position {i}")

n = 5
# Let x be the list as described. Compute x1 + x2 + ... + xn.

for i, name in enumerate(['Bethe', 'Alpher', 'Gamov']):
    print(f"Person {name} is in position {i}")

n = 5
(x, s) = (1, 0)
for i in range(n):
    (x, s) = (x+i+2, s+x)
s

Iterating over two lists together

To iterate over two lists simultaneously, zip them.

for x1,x2 in zip(list1,list2):
    ... # do something with x1,x2

(We can in fact zip together as many lists as we like — unlike zips on clothes!)
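As a small sketch, here is zip applied to three lists. The drinks list is made up for illustration. The second part shows that zip stops as soon as the shortest input runs out.

```python
fruits = ['apple', 'orange', 'grape']
cheeses = ['cheddar', 'wensleydale', 'brie']
drinks = ['cider', 'tea', 'juice']   # made-up third list for illustration

menu = []
for fruit, cheese, drink in zip(fruits, cheeses, drinks):
    menu.append(f"{fruit}, {cheese}, {drink}")
print(menu)

# zip stops at the end of the shortest input
pairs = list(zip([1, 2, 3], ['a', 'b']))
print(pairs)
```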

Can you modify the code to produce this printout?

Course 1: apple with cheddar
Course 2: orange with wensleydale
Course 3: grape with brie

fruits = ['apple', 'orange', 'grape']
cheeses = ['cheddar', 'wensleydale', 'brie']

for fruit,cheese in zip(fruits, cheeses):
    print(f"{fruit} goes with {cheese}")

fruits = ['apple', 'orange', 'grape']
cheeses = ['cheddar', 'wensleydale', 'brie']

for i,(fruit,cheese) in enumerate(zip(fruits, cheeses)):
    print(f"Course {i+1}: {fruit} with {cheese}")

# Or, less elegantly:
# for i,fruit,cheese in zip(range(len(fruits)), fruits, cheeses):

Slice notation

We can pick out subsequences of a list or tuple using the slice notation, x[start:end].

x[:n]   # first n items
x[n:]   # all except first n
x[-n:]  # last n items
x[:-n]  # all except last n
x[m:n]  # x[m], ..., x[n-1]

We can also assign into slices:

x[slice] = list_of_replacement_items

There's also the step notation x[start:end:step]:

x[::d]  # x[0], x[d], ...
x[::-d] # x[-1], x[-1-d], ...
x[m::d] # x[m], x[m+d], ...

Strings behave like lists of characters, and the slice notation works on them also.
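The following sketch tries out slice assignment, the step notation, and slicing a string. All the values are arbitrary examples.

```python
x = [0, 1, 2, 3, 4, 5]

# Slice assignment: the replacement may have a different length
x[1:3] = ['a', 'b', 'c']
print(x)

# Step notation
y = list(range(10))
evens = y[::2]       # every second item, starting at index 0
odds = y[1::2]       # every second item, starting at index 1
print(evens, odds)

# Slices work on strings too
word = 'tutorial'
print(word[:3], word[-3:])
```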

Given the string 'once upon a time', can you use slice notation to (a) reverse the characters, (b) reverse the words?

x = 'once upon a time'

# reverse characters

# reverse words

x = 'once upon a time'

# reverse characters
x[::-1]

# reverse words
' '.join(x.split()[::-1])

Dictionaries

The other very useful data type is the dictionary (what Java calls a Map or HashMap).

d = {}        # create an empty dictionary

d[k]          # get value for key k
d.get(k, v0)  # default to v0 if key missing
k in d

d[k] = v      # set or update value for key k
del d[k]      # delete entry for key k

for k,v in d.items():
    ... # do something with key k, value v
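Here is a brief sketch that exercises each of these operations on a small dictionary of invented names and ages:

```python
age = {'Laura': 32, 'Guarav': 19}

age['Adrian'] = 41               # add a new entry
age['Laura'] = 33                # update an existing entry
has_zubin = 'Zubin' in age       # membership tests look at keys
zubin_age = age.get('Zubin', 0)  # default when key is missing
del age['Guarav']                # remove an entry

for name, years in age.items():
    print(f"{name} is {years}")
```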

Given a dictionary keyed by person, saying which room they're in, can you create a new dictionary keyed by room, containing a list of its occupants?

room = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,
        'alexis': 11, 'rebecca': 10, 'zubin': 5}

occupants = {}
# ???

for room, occupants_here in occupants.items():
    ns = ', '.join(occupants_here)
    print(f'Room {room} has {ns}')

room = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,
        'alexis': 11, 'rebecca': 10, 'zubin': 5}

occupants = {}
for name, room in room.items():
    if room not in occupants:
        occupants[room] = []
    occupants[room].append(name)

for room, occupants_here in occupants.items():
    ns = ', '.join(occupants_here)
    print(f'Room {room} has {ns}')

List comprehensions

There's one unusual piece of syntax that we see all over the place in Python code, called comprehension. It's a succinct way to write code for creating new lists by transforming other lists or dictionaries.

In this code, list represents an arbitrary list, dict represents an arbitrary dictionary, and expr represents any expression at all.

[expr for x in list]
[expr for k,v in dict.items()]

We can also specify a filter condition, to only keep some of the items. Here test_expr represents a boolean expression.

[expr for x in list if test_expr]
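As a sketch of the dictionary form, with and without a filter (the room dictionary is a made-up example):

```python
room = {'adrian': 10, 'chloe': 5, 'guarav': 10}

# transform each (key, value) pair into a string
labels = [f"{name} is in room {r}" for name, r in room.items()]

# keep only the names of people in room 10
in_room_10 = [name for name, r in room.items() if r == 10]
print(labels)
print(in_room_10)
```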

Can you select the squares of the even numbers in my_list?

Hint: look at the maths operators.

my_list = range(10)

# Squares of the items in the list
[x**2 for x in my_list]

# Select the squares of even numbers in my_list
# Should get answer [0,4,16,36,64]

my_list = range(10)

# Squares of the items in the list
[x**2 for x in my_list]

# Select the squares of even numbers in my_list
[x**2 for x in my_list if x % 2 == 0]

Dictionary comprehensions

We can also create a new dictionary using comprehensions.

{key_expr: val_expr for x in list}
{key_expr: val_expr for x in dict.items()}

Here, key_expr and val_expr represent arbitrary expressions. We can also filter with a cond_expr.
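For example, a filtered dictionary comprehension might look like this sketch, where the names and the length threshold are arbitrary:

```python
names = ['adrian', 'chloe', 'shay', 'rebecca']

# key_expr is the name, val_expr is its length,
# and the condition keeps only names longer than 4 characters
long_names = {name: len(name) for name in names if len(name) > 4}
print(long_names)
```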

What do you think happens if we specify the same key twice? Evaluate this expression, and explain what you get.

{x**2: x for x in range(-5,5)}

my_list = range(10)

# A dict to map an item to its square
{x: x**2 for x in my_list}

my_list = range(10)

# A dict to map an item to its square
{x: x**2 for x in my_list}

# For x in [-5,-4,-3,...,3,4], 
# set the key to be x**2 and the val to be x
# and when we set the same key twice, the later value overrides the earlier
{x**2: x for x in range(-5,5)}

Exercise

Give a one-line expression to sort names by length, breaking ties alphabetically.

Hint:

Make a list of (len(name),name) using list comprehension
Sort your list — when Python sorts a list of tuples, it uses lexicographic ordering
Use another list comprehension to pick out the second element of each pair in your sorted list

names = ['adrian', 'chloe', 'guarav', 'shay', 'alexis', 'rebecca', 'zubin']

# sorted by length, breaking ties alphabetically

names = ['adrian', 'chloe', 'guarav', 'shay', 'alexis', 'rebecca', 'zubin']

# sorted by length, breaking ties alphabetically
[v for _,v in sorted([(len(n),n) for n in names])]

Exercise

Give a one-line expression that counts the number of distinct elements in a list, using only list and dictionary comprehensions.

Hint:

Create a dictionary whose keys are the elements of x and whose values are anything, e.g. True
Find the length of this dictionary using len

In practice, a better way to express our intent is by creating a set from x,

len(set(x))

import random
random.seed(1618)
x = [random.randint(0,10) for _ in range(20)]

# number of distinct elements in x

import random
random.seed(1618)
x = [random.randint(0,10) for _ in range(20)]

# number of distinct elements in x
len({y:True for y in x})

Double comprehensions

To iterate over nested lists, we can use a double comprehension

[¤ for ℓ in my_list for item in ℓ]

which is shorthand for

for ℓ in my_list:
    for item in ℓ:
        # produce ¤

Similar syntax works for dictionaries.

Flatten the dictionary d into a single list of all the tasks to be done.

my_list = [[0,1,2], [4,5]]
flattened = [x for s in my_list for x in s]
print(flattened)

d = {'clean':['kitchen','bathroom'], 'wash':['clothes','dishes','face']}
# Flatten this into a single list
# ['clean 1: kitchen', 'clean 2: bathroom', 'wash 1: clothes', ...]

my_list = [[0,1,2], [4,5]]
flattened = [x for s in my_list for x in s]
print(flattened)

d = {'clean':['kitchen','bathroom'], 'wash':['clothes','dishes','face']}
# Flatten this into a single list
# ['clean 1: kitchen', 'clean 2: bathroom', 'wash 1: clothes', ...]

[f"{act} {i+1}: {v}" for act,things in d.items() for i,v in enumerate(things)]

Unpacking with * and **

The unpacking syntax is advanced Python.

If ℓ is a list, then *ℓ is the sequence of items in ℓ
If δ is a dictionary, then **δ is the sequence of keys and values in δ

This sounds crazy. What is a sequence of items, if not a list?

Don't think of the sequence in terms of what it is, think of what it can be used for.

The arguments to a function are a sequence. So, if we want to call f(ℓ[0], ℓ[1], ...), we can do it with f(*ℓ)
When we create a list, the things inside [ ] are a sequence. This gives a simple way to concatenate and append to lists. Similarly for dictionaries.
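The two uses above can be sketched as follows. The function describe and its arguments are hypothetical, invented only to show how unpacked items are matched to parameters. In a function call, **δ supplies keyword arguments.

```python
def describe(name, age, city="Cambridge"):
    # Hypothetical function, used only to show how arguments are filled in
    return f"{name}, {age}, {city}"

args = ['Zaphod', 27]
kwargs = {'city': 'Betelgeuse'}

a = describe(*args)             # same as describe('Zaphod', 27)
b = describe(*args, **kwargs)   # same as describe('Zaphod', 27, city='Betelgeuse')

# Inside [ ], unpacking gives concatenation and appending
xs = [1, 2]
ys = [*xs, 3, *xs]
print(a)
print(b)
print(ys)
```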

How would you create a new dictionary that combines d1 and d2?

x = [2,3,True]
print(x)
print(*x)

print([*x, "yes", range(2)])

# create a new dictionary that combines these two
d1 = {'a':3, 'b':10}
d2 = {'b':5, 'c':12}

x = [2,3,True]
print(x)
print(*x)

print([*x, "yes", range(2)])

# create a new dictionary that combines these two
d1 = {'a':3, 'b':10}
d2 = {'b':5, 'c':12}
{**d1, **d2}

Programming

Assertions

One of the most useful statements in scientific computing is assert.

assert testexpr, errmessage

When we're exploring an idea there may be all sorts of corner cases that we want to leave to one side — our big idea might not even work in the first place, so there's no point implementing all the corner cases until we've decided the idea works.

But on the other hand, in case it does work, we don't want to leave ‘logic mines’ in our code that will explode under our feet when we're looking elsewhere.

An AssertionError is a message to our future self saying “you'll need to think this through” and it's much more helpful than silently returning a wrong answer.

import math

def myidea(x):
    z = math.sin(2*x) + 0.5
    assert z >= 0.2, "TODO: the maths behind this eqn is wrong for z < 0.2"
    return math.exp(z - 0.2)

myidea(1.1)  # works
myidea(2.3)  # fails with AssertionError

Defining functions

This is what functions look like in Python. This particular function solves the quadratic equation a + b x + c x²=0 for x, and returns a list with 0, 1, or 2 solutions.

Try calling the function in these three ways. What do you see?

roots(2,3,0)
roots(2,3)
roots(b=3, a=2)

Python has a nice system for named arguments and default arguments. We can provide the arguments in order. Or we can provide them in any order, as long as we give their names. And we can skip arguments with default values, in this case c=0.

This function returns a single value, namely a list (or it throws an exception). In general,

If we want to return several variables, return them in a tuple, and unpack the tuple using multiple assignment.
If our function finishes without an explicit return statement, it will return None.
It's possible for different branches of a function to return values of different types — at risk to our sanity.
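The first two points can be sketched with a pair of made-up functions, one returning a tuple and one with no return statement at all:

```python
def min_and_max(xs):
    # Hypothetical helper: returns two values packed in a tuple
    return (min(xs), max(xs))

def say_hello():
    # No return statement, so calling this gives None
    print("hello")

(lo, hi) = min_and_max([3, 1, 4, 1, 5])   # unpack with multiple assignment
nothing = say_hello()
print(lo, hi, nothing)
```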

import math

def roots(a, b, c=0):
    """Return a list with the real roots of a + b*x + c*(x**2) == 0"""
    if b == 0 and c == 0:
        raise Exception("This polynomial is constant")
    if c == 0:
        return [-a/b]
    elif a == 0:
        return [0] + roots(b=c, a=b)
    else:
        discr = b**2 - 4*c*a
        if discr < 0:
            return []
        else:
            return [(-b+s*math.sqrt(discr))/2/c for s in [-1,1]]

Returning nothing

It's often handy for functions to be able to return either a value, or a marker that there is no value. For example, head(ℓ) should return the first item in list ℓ unless the list is empty in which case there's nothing to return.

In Python, the convention is to return None when you have nothing to return, and a value otherwise. The Python style is to assume we're all adults. Trust that the person who called you won't just blindly assume there is a value.

This is in contrast to other languages like OCaml, in which we'd define head to return an option type, None | Some of 'a. This forces everyone who uses the function to check whether or not the answer is None.

Rewrite this function using the more idiomatic ternary expression a if cond else b.

When we run this code, it displays (No output). Most Python development environments, when they evaluate a block and get the result None, will suppress the output.

def head(a_list):
    if len(a_list) > 0:
        return a_list[0] 
    else:
        return None
        
head([])

def head(a_list):
    return a_list[0] if len(a_list) > 0 else None

Dynamic typing

Python uses dynamic typing, which means that values are tagged with their types during execution and checked only then.

To illustrate, consider the functions goodfunc and badfunc. We won't be told of errors until badfunc() is invoked, even though it's clear when we define it that it will fail.

def double_items(xs):
    return [x*2 for x in xs]

def goodfunc():
    return double_items([1,2,[3,4]]) + double_items("hello world")

def badfunc():
    return double_items(10)

badfunc()

Duck typing

Python programmers are encouraged to use duck typing, which means that you should test values for what they can do rather than what they’re tagged as. “If it walks like a duck, and it quacks like a duck, then it’s a duck”.

In this example, double_items(xs) iterates through xs and applies *2 to every element, so it should apply to any xs that supports iteration and whose elements all support *2. These operations mean different things to different types: iterating over a list returns its elements, while iterating over a string returns its characters; doubling a number is an arithmetical operation, doubling a string or list repeats it.

Why does double_items("hello") not cause an error, whereas double_items(10) does?

Python does allow you to test the type of a value but programmers are encouraged not to do this.

Python’s philosophy is that library designers are providing a service, and programmers are adults. If a library function uses comparison and addition, and if the end-user programmer invents a new class that supports comparison and addition, then why on earth shouldn’t the programmer be allowed to use the library function?

(I’ve found this useful for simulators: I replaced ‘numerical timestamp’ with ‘rich timestamp class that supports auditing, listing which events depended on which other events’, and I didn’t have to change a single line of the simulator body.) Some statically typed languages like Haskell and Scala support this via dynamic type classes, but their syntax is rather heavy.
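To make this concrete, here is a sketch under stated assumptions. total_and_largest stands in for a library function that uses only + and >, and Duration is an invented user class. It supports those two operations through the special methods __add__ and __gt__ (see the later section on dunder methods). The same function then works on plain numbers and on Duration objects, with no type tests anywhere.

```python
def total_and_largest(items):
    # Stand-in for a "library function": it only uses + and >
    total = items[0]
    largest = items[0]
    for item in items[1:]:
        total = total + item
        if item > largest:
            largest = item
    return total, largest

class Duration:
    # Hypothetical user-defined class that supports + and >
    def __init__(self, seconds):
        self.seconds = seconds
    def __add__(self, other):
        return Duration(self.seconds + other.seconds)
    def __gt__(self, other):
        return self.seconds > other.seconds

num_total, num_largest = total_and_largest([3, 8, 5])
dur_total, dur_largest = total_and_largest([Duration(30), Duration(90), Duration(45)])
print(num_total, num_largest, dur_total.seconds, dur_largest.seconds)
```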

def double_items(xs):
    return [x*2 for x in xs]

double_items([5,3,1,8])
double_items(['yes','of course'])
double_items('hello')

def double_items(xs):
    return [x*2 for x in xs]

double_items([5,3,1,8])
double_items(['yes','of course'])
# note: strings behave like lists of characters
double_items('hello')

Objects and types

Python is an object-oriented programming language. Every value is an object, and it has a class. Python supports inheritance and multiple inheritance, and static methods, and class variables, and so on.

# get the class of x
type(x)

# is x an instance of class C,
# or does x's class inherit from C?
isinstance(x, C)

Here's example code, a Tree class. Each Tree object has a list of its children, and this may include other Trees.

To create an object of class C, use myobj = C(...)
Every method takes a first argument self referring to the current object (like this in Java). We don't supply self when we call methods; Python fills it in for us.
The constructor is called __init__, and it too takes a first argument self
Use self inside the class definition to call other methods or to set member variables
To define a class that inherits from another class C, start with class NewClass(C):
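The points above can be sketched as follows. LabelledTree and the method num_children are hypothetical additions for illustration, not part of the Tree class used in the exercise. super() is the built-in way to reach the parent class.

```python
class Tree:
    def __init__(self, children):
        self.children = children

    def num_children(self):
        return len(self.children)

class LabelledTree(Tree):
    # Hypothetical subclass: a Tree that also carries a label
    def __init__(self, label, children):
        super().__init__(children)   # run the parent constructor
        self.label = label

    def describe(self):
        # self lets us call an inherited method
        return f"{self.label} with {self.num_children()} children"

t = LabelledTree("root", [3, 2, "hello"])
print(t.describe())
print(isinstance(t, Tree), type(t).__name__)
```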

Implement a function flatten(t) which returns a list consisting of all the leaves of tree t.

class Tree:
    def __init__(self, children):
        self.children = children

t1 = Tree([3,2])
print(t1.children)

t2 = Tree([10,t1,"hello"])
print(t2.children)

# Implement a function flatten(t)
# flatten(t2) should return [10,3,2,"hello"]

class Tree:
    def __init__(self, children):
        self.children = children

t1 = Tree([3,2])
print(t1.children)

t2 = Tree([10,t1,"hello"])
print(t2.children)

# Implement a function flatten(t)
# flatten(t2) should return [10,3,2,"hello"]

def flatten(x):
    if isinstance(x, Tree):
        return [y for child in x.children for y in flatten(child)]
    else:
        return [x]

flatten(t2)

Object attributes

Set member attributes of an object using

obj.attr = val

This can be done in any method of the class. They don't need to be declared in the class definition or in the constructor.

Member attributes can also be set outside the class, what's called ‘monkey patching’. Like so many language features in Python, this is sometimes tremendously handy, and sometimes the source of infuriating bugs.

class C:
    def __init__(self, x):
        self.zing = x

myobj = C(10)
myobj.zong = 11

print(myobj.zing, myobj.zong)

Interfaces

Python doesn't have Java-style interfaces, because they aren't needed in a duck-typed language.

It's more Pythonic to just charge ahead and assume the caller gave you objects with the right methods.

Here's an example, flattening a tree. We've seen how to achieve this with a Tree class. But we don't actually need a class at all! The code shown here will work on any objects the caller gives us. It uses duck typing.

The flatten function assumes that branch nodes are iterable objects e.g. lists or tuples, and it simply charges ahead and tries to iterate. If x isn't iterable then the for will raise a TypeError exception, telling us that x must be a leaf.

Dunder methods

Python lets us define custom classes that hook into the usual Python syntax.

For example, if we define a new class with the method __iter__ then objects of our class can be iterated over with for syntax, just like a list.

There's a long list of special method names that we can use. They're sometimes called ‘dunder methods’, for ‘double underscore’.
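As a minimal sketch, the invented class Countdown below defines __iter__ and __len__, so its objects work with for syntax, with comprehensions, and with the built-in len.

```python
class Countdown:
    # Hypothetical class, made iterable by defining __iter__
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return an iterator over start, start-1, ..., 1
        return iter(range(self.start, 0, -1))

    def __len__(self):
        # Hooks into the built-in len()
        return self.start

c = Countdown(3)
items = [x for x in c]
print(items, len(c))
```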

def flatten(x):
    try:
        return [y for child in x for y in flatten(child)]
    except TypeError as e:
        return [x]

x = [1,[[2,4,3],9],[5,[6,7],8]]
flatten(x)

Equal value (==) or identity (is)

When we ask if two objects are equal, there are two things we might mean:

x is y   # x is the same object as y
x == y   # x and y have the same value

Objects can customize the “same value” test. For lists, the test is: do the two lists have the same length, and are all their values equal?

To customize this test for your own classes, use the dunder method __eq__.
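A sketch of what this might look like for a made-up Point class, where two points count as equal when their coordinates match:

```python
class Point:
    # Hypothetical class: two points are equal if their coordinates match
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

p = Point(1, 2)
q = Point(1, 2)
print(p is q, p == q)   # different objects, same value
```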

x = [1, 5, 3]
y = [1, 5, 3]

print(x is y, x == y)

Functional programming

In Python as in OCaml, functions can be returned as results, assigned, put into lists, passed as arguments, and so on.

In this example, noisifier is a function that returns another function. The inner function ‘remembers’ the value of σ under which it was defined; this is known as a closure.

import random

def noisifier(σ):
    def add_noise(x):
        return x + random.uniform(-σ, σ)
    return add_noise

σlist = [0.1, 1, 10]
fs = [noisifier(σ) for σ in σlist]
for f,σ in zip(fs, σlist):
    print(f"1.5 plus noise σ={σ} gives {f(1.5):.3}, {f(1.5):.3}, {f(1.5):.3}")

Anonymous (lambda) functions

We can use lambda to define anonymous functions, i.e. functions without names.

def illustrate_func(f, xs):
    for x in xs:
        print(f"f({x}) = {f(x)}")

illustrate_func(lambda x: x+1, xs = range(5))
print()
illustrate_func(lambda x: x*2, xs = range(5))
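A common place to meet lambda is the key argument of sorted, which takes a function saying what to sort by. This sketch sorts names by length and breaks ties alphabetically, using the fact that tuples are compared lexicographically.

```python
names = ['adrian', 'chloe', 'guarav', 'shay', 'alexis', 'rebecca', 'zubin']

# The lambda maps each name to a (length, name) tuple, which is used as the sort key
by_length = sorted(names, key=lambda n: (len(n), n))
print(by_length)
```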

Generators and lazy lists

A generator (or lazy list) is a sequence where the elements are only computed on demand. We can create them by simply defining a function that has the yield statement somewhere inside.

def f():
    ...
    yield ¤

g = f()
next(g), next(g)

When we call next(g), it runs through f until it reaches the next yield statement, then it emits a value and pauses. Think of g as an execution pointer plus call stack: it remembers where it is inside the f function, and calling next tells it to resume executing until the next time it hits yield.

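The range function, which we've seen used for iteration, returns a similar lazy object (strictly speaking it is a lazy sequence rather than a generator, since it also supports len and indexing). Generators also let us implement infinite sequences.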
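Here is a small sketch of an infinite generator. The name count_up is made up. Each call to next resumes the function body where it last paused. A for loop can also consume the generator, though we have to break out ourselves.

```python
def count_up(start):
    # Infinite sequence start, start+1, start+2, ...
    n = start
    while True:
        yield n
        n += 1

g = count_up(10)
a = next(g)   # runs count_up as far as the first yield
b = next(g)   # resumes after the yield, loops round, yields again
print(a, b)

# A generator can be used in a for loop; it never ends, so we break explicitly
squares = []
for n in count_up(1):
    if n > 4:
        break
    squares.append(n * n)
print(squares)
```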

Generator comprehensions

We can write comprehensions for generators, using the same sort of syntax as for list and dictionary comprehensions. Use ( ) for generator comprehensions, as opposed to [ ] for lists and { } for dictionaries.

(¤ for x in g if ¤)

Get the first 10 even Fibonacci numbers, using the filtered generator

even_fibs = (x for x in fib() if x % 2 == 0)

def fib():
    x,y = 1,1
    while True:
        yield x
        x,y = (y, x+y)

fibs = fib()
[next(fibs) for _ in range(10)]

def fib():
    x,y = 1,1
    while True:
        yield x
        x,y = (y, x+y)

fibs = fib()
[next(fibs) for _ in range(10)]

even_fibs = (x for x in fib() if x % 2 == 0)
[next(even_fibs) for _ in range(10)]

Next steps

Notebooks

Well done!

You've finished this tutorial about Python.

The next step is to move on to longer programs! Jupyter notebooks are great for building up longer programs interactively, trying out ideas and visualizing the results. This makes them a good fit for scientific computing, including machine learning and data science. Though they do make it easy to fall into spaghetti coding …

Have a look at Exercise 0. You'll pick up some tips for organizing your code in Jupyter notebooks. And you'll also be introduced to the Autograder, the system for online ticking used in this Scientific Computing course.

x = "Well done! "
for i in range(len(x)+1):
    print(x[i:] + x[:i])
