ex2 practical exercises for Example Sheet 2.

The example sheet asks you to implement the four functions given below: exp_poisson_confint, exp_uniform_confint, climate_inc_confint, and climate_pred_confint. The skeletons are given. To test your answers on Moodle, please upload either a Jupyter notebook called ex2.ipynb or a plain Python file called ex2.py.

These functions are meant to compute 95% confidence intervals, using computational approximation based on sampling. This means that your answers might not be exact. The tester will calculate the true confidence of the interval you propose, and will require that it be reasonably close to 95%. Around 300,000 samples should be enough to pass the tester, and your code should be able to handle that many with no more than a few seconds of runtime. If it can't, refactor your code to make it faster (using numpy vectorized commands) and more accurate (using the log-sum-exp trick).
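As a rough illustration of the two techniques just mentioned, here is a minimal sketch that turns weighted samples into an equal-tailed interval. It is deliberately run on a toy model that is not one of the exercises ($\Theta\sim N(0,10^2)$ prior, $y_i\sim N(\Theta,1)$ data). The helper name `weighted_interval` and the toy data are assumptions made for this example, and an equal-tailed interval is only one of several reasonable choices.

```python
import numpy as np

def weighted_interval(samples, logw, p):
    """Equal-tailed p-interval from samples with unnormalised log-weights."""
    # log-sum-exp trick: subtract the largest log-weight before exponentiating,
    # so the biggest weight is exp(0)=1 and the weights cannot all underflow to 0
    w = np.exp(logw - np.max(logw))
    w = w / np.sum(w)
    # sort the samples and accumulate their weights to get a weighted CDF
    order = np.argsort(samples)
    s, cdf = samples[order], np.cumsum(w[order])
    # first sample at which the CDF reaches each tail probability
    # (clipped in case rounding leaves cdf[-1] slightly below 1)
    i_lo = min(np.searchsorted(cdf, (1 - p) / 2), len(s) - 1)
    i_hi = min(np.searchsorted(cdf, (1 + p) / 2), len(s) - 1)
    return (s[i_lo], s[i_hi])

# Toy model (hypothetical, not from the example sheet)
rng = np.random.default_rng(0)
y = np.array([1.2, 0.7, 1.9, 1.4])
theta = rng.normal(loc=0, scale=10, size=300_000)   # samples from the prior
# log-likelihood of every sample at once, by broadcasting (300000,1) against (1,4)
logw = -0.5 * np.sum((y[None, :] - theta[:, None])**2, axis=1)
lo, hi = weighted_interval(theta, logw, 0.95)
print(lo, hi)
```

There is no Python loop over samples here, so 300,000 samples are cheap. For this toy model the posterior is known exactly, which makes it a handy check before moving on to models where it is not.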

The tester will always use the same arguments when it calls your functions. The function arguments are only passed in for your convenience. If you'd prefer to precompute all the answers and simply provide dummy functions that return your precomputed answers, you may.

exp_poisson_confint. We have a dataset $x_1,\dots,x_n$ distributed as $\operatorname{Poisson}(\Theta)$. Find a 95% confidence interval for $\Theta$, assuming an $\operatorname{Exp}(1)$ prior.

def exp_poisson_confint(p, x):
    # Input: p=0.95 and x=np.array([4,0,0,0,3,1,1,0,1,1,1,0,1,0,1,2,3,0,1,1])
    # TODO: compute a p-confidence interval for Θ
    return (lo,hi)
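One ingredient you are likely to need is the Poisson log-likelihood, evaluated for many candidate values of $\Theta$ at once. The sketch below is one possible way to write it, not the required one. The name `poisson_loglik` is an assumption, and the term $-\sum_i \log(x_i!)$ is dropped on purpose because it does not depend on $\Theta$ and so cancels when weights are normalised.

```python
import numpy as np

def poisson_loglik(theta, x):
    """log Pr(x | Θ=theta), up to an additive constant, for each entry of theta."""
    theta = np.asarray(theta, dtype=float)
    x = np.asarray(x)
    # sum_i [ x_i*log(θ) - θ ] collapses to sum(x)*log(θ) - n*θ,
    # so there is no need to loop over the data points
    return np.sum(x) * np.log(theta) - len(x) * theta

x = np.array([4,0,0,0,3,1,1,0,1,1,1,0,1,0,1,2,3,0,1,1])
rng = np.random.default_rng(0)
theta = rng.exponential(scale=1.0, size=5)   # a few draws from the Exp(1) prior
print(poisson_loglik(theta, x))
```

Because the data enter only through `np.sum(x)` and `len(x)`, the cost per sample does not grow with the size of the dataset.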

exp_uniform_confint. We have a dataset $x_1,\dots,x_n$ distributed as $\operatorname{Uniform}[A,B]$. Find a 95% confidence interval for $B$, assuming an $\operatorname{Exp}(\lambda_0)$ prior for $A$ and an $\operatorname{Exp}(\mu_0)$ prior for $B$.

def exp_uniform_confint(p, x, λ0, μ0):
    # Input: p=0.95, λ0=0.5, μ0=1.0, x=np.array([2, 3, 2.1, 2.4, 3.14, 1.8])
    # TODO: compute a p-confidence interval for B
    return (lo,hi)
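The uniform likelihood has a hard constraint: it is $(b-a)^{-n}$ when $a\le\min_i x_i$ and $\max_i x_i\le b$, and zero otherwise. The sketch below shows one way to handle that in log space. The name `uniform_loglik` is an assumption, and so is the reading of $\operatorname{Exp}(\lambda)$ as an exponential with rate $\lambda$; numpy's `exponential` takes a scale, which is one over the rate.

```python
import numpy as np

def uniform_loglik(a, b, x):
    """log density of the dataset x under Uniform[a,b], elementwise over a and b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    # parameter pairs that could actually have produced the data
    ok = (a <= np.min(x)) & (np.max(x) <= b)
    # log(b-a) is undefined for b<=a; silence the warning, those entries are masked below
    with np.errstate(divide='ignore', invalid='ignore'):
        ll = -len(x) * np.log(b - a)
    # log(0) = -inf for infeasible pairs, which becomes weight 0 after exponentiating
    return np.where(ok, ll, -np.inf)

x = np.array([2, 3, 2.1, 2.4, 3.14, 1.8])
# first pair is feasible; the second has a > min(x); the third has b < max(x)
example = uniform_loglik([1.0, 1.9, 0.5], [4.0, 4.0, 3.0], x)
print(example)

# Prior samples, ASSUMING Exp(λ) means rate λ (numpy wants scale = 1/rate)
λ0, μ0 = 0.5, 1.0
rng = np.random.default_rng(0)
a = rng.exponential(scale=1/λ0, size=300_000)
b = rng.exponential(scale=1/μ0, size=300_000)
logw = uniform_loglik(a, b, x)
print(np.mean(np.isfinite(logw)))   # fraction of prior draws with non-zero weight
```

Under these assumptions only a few percent of the prior draws are feasible, so the effective number of samples is much smaller than 300,000. That is worth keeping in mind when judging how accurate your interval is.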

climate_inc_confint and climate_pred_confint. This question uses a dataset of UK temperatures, restricted to Cambridge temperatures from 2010 onwards. It assumes the model $$ \text{Temp} \sim A + b \sin(2\pi(t+\phi)) + C(t-2000) + N(0,\sigma^2) $$ where $A$ and $C$ are unknown and the other parameters are known. We'll use $A\sim N(a,(\sigma_a)^2)$ and $C\sim N(c,(\sigma_c)^2)$ as priors. [NOTE: An earlier version of this text stated $B\sim N(b,\sigma_b^2)$. Fixed 2023-10-31.]
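For this model, a naive vectorized log-likelihood builds an array with one row per sample and one column per observation, which can get large with 300,000 samples. The sketch below expands the sum of squared residuals so that only a handful of sums over the data are needed. The name `climate_loglik` is an assumption, the timestamps and temperatures are synthetic stand-ins for the real dataset, and the `ClimateModel` tuple mirrors the one used in the test cells.

```python
import numpy as np
import collections

ClimateModel = collections.namedtuple('ClimateModel', ['a','b','φ','c','σ','σa','σc'])

def climate_loglik(A, C, t, temp, m):
    """log-likelihood, up to an additive constant, for each pair (A[k], C[k])."""
    r = temp - m.b * np.sin(2 * np.pi * (t + m.φ))   # data minus the known seasonal term
    u = t - 2000
    # sum_i (r_i - A - C*u_i)^2, expanded so the data appear only through five sums
    ss = (np.sum(r**2) + len(t) * A**2 + C**2 * np.sum(u**2)
          - 2 * A * np.sum(r) - 2 * C * np.sum(u * r) + 2 * A * C * np.sum(u))
    return -0.5 * ss / m.σ**2

m = ClimateModel(a=10, b=6.6826, φ=-0.27731, c=0, σ=1.4183, σa=5, σc=0.1)
rng = np.random.default_rng(0)
# SYNTHETIC data for illustration only: two years of monthly readings
t = 2010 + np.arange(24) / 12
temp = (10 + m.b * np.sin(2 * np.pi * (t + m.φ)) + 0.03 * (t - 2000)
        + rng.normal(0, m.σ, size=len(t)))
A = rng.normal(m.a, m.σa, size=1000)   # samples from the prior for A
C = rng.normal(m.c, m.σc, size=1000)   # samples from the prior for C
logw = climate_loglik(A, C, t, temp, m)
```

The result has one entry per sample, and memory use does not depend on the number of observations. Any function of the samples, such as `A + C*(newt-2000)` for a chosen `newt`, can then be weighted in the same way as `C` itself.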

def climate_inc_confint(p, t, temp, climatemodel):
    # Input: p=0.95, t and temp taken from the dataset, climatemodel an object
    # with attributes (a,b,φ,c,σ,σa,σc) specifying the model parameters.
    # TODO: compute a p-confidence interval for C
    return (lo,hi)

def climate_pred_confint(p, newt, t, temp, climatemodel):
    # Input: as above, plus newt which is the timepoint we want to predict
    # TODO: compute a p-confidence interval for A+C*(newt-2000)
    return (lo,hi)

TEST

The Moodle checker will look for a markdown cell with the contents # TEST, and ignore everything beneath it. Put your working code above this cell, and put any experiments and tests below.

In [ ]:
import numpy as np
import pandas
In [ ]:
x = np.array([4,0,0,0,3,1,1,0,1,1,1,0,1,0,1,2,3,0,1,1])
exp_poisson_confint(0.95, x)
In [ ]:
λ0,μ0 = .5,1.0
x = np.array([2, 3, 2.1, 2.4, 3.14, 1.8])
exp_uniform_confint(.95, x=x, λ0=λ0, μ0=μ0)
In [ ]:
url = 'https://www.cl.cam.ac.uk/teaching/current/DataSci/data/climate_202309.csv'
climate = pandas.read_csv(url)
df = climate.loc[(climate.station=='Cambridge') & (climate.yyyy>=2010)]
t,temp = df.t.values, df.temp.values

import collections
ClimateModel = collections.namedtuple('ClimateModel', ['a','b','φ','c','σ','σa','σc'])
climatemodel = ClimateModel(a=10, b=6.6826, φ=-0.27731, c=0, σ=1.4183, σa=5, σc=0.1)

print(climate_inc_confint(0.95, t=t, temp=temp, climatemodel=climatemodel))
print(climate_pred_confint(0.95, 2050, t=t, temp=temp, climatemodel=climatemodel))