Research
NOTE: this page will be updated rather a lot throughout September 2012...

1 Current research interests

1.1 Theoretical models for supervised learning

I have a long-standing research interest in computational learning theory, a body of work which aims to prove bounds on the performance of learning systems. My primary interest here has been in results applying in some way to the learning curves of supervised learners; that is, graphs of generalization performance against training set size, either in the worst case (with respect, for example, to the statistics of the examples) or in the average case. For worst-case results, proofs tend to depend on recent work in mathematical statistics; for average-case results, techniques developed in statistical mechanics are often employed.
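To make the notion of a learning curve concrete, here is a minimal empirical sketch in Python. It is an illustration only, not one of the theoretical results referred to above: the target concept (a threshold on the unit interval), the uniform input distribution, the midpoint learner and the names `learn_threshold` and `learning_curve` are all assumptions chosen for simplicity. The sketch estimates average-case generalization error for several training set sizes.

```python
import random


def learn_threshold(sample):
    """Return a threshold consistent with a labelled sample of (x, y) pairs, y in {0, 1}."""
    # Largest example labelled 0 and smallest example labelled 1;
    # the defaults cover samples in which one class is absent.
    lo = max((x for x, y in sample if y == 0), default=0.0)
    hi = min((x for x, y in sample if y == 1), default=1.0)
    return (lo + hi) / 2.0


def learning_curve(sizes, target=0.5, trials=200, seed=0):
    """Estimate mean generalization error for each training set size in `sizes`."""
    rng = random.Random(seed)
    curve = []
    for m in sizes:
        total = 0.0
        for _ in range(trials):
            xs = [rng.random() for _ in range(m)]
            sample = [(x, int(x >= target)) for x in xs]
            # For uniformly distributed inputs, the error of a threshold
            # hypothesis is the distance between learned and true thresholds.
            total += abs(learn_threshold(sample) - target)
        curve.append(total / trials)
    return curve


sizes = [5, 20, 80, 320]
curve = learning_curve(sizes)
for m, err in zip(sizes, curve):
    print(f"m={m:4d}  mean generalization error={err:.4f}")
```

Plotting `curve` against `sizes` gives an average-case learning curve for this toy problem; a worst-case analysis would instead bound the error over all input distributions.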

The following subjects are at present of particular interest to me:

1.2 Bayesian inference

I am interested in the general problem of Bayesian inference. Contrary to what some might argue, this interest is not something I consider to be in conflict with my interest in computational learning theory. Philosophically speaking, the two subjects tend to fall into two camps regarding interpretation: the Bayesian camp and the Frequentist camp. Both have much to offer. Each asks its own questions. Each provides answers of interest within its own context. Most importantly, as far as I'm concerned, each is fun to think about.

Specifically, I am interested in the following issues:

1.3 Boosting

Boosting is a thoroughly intriguing procedure allowing several weak learners to be combined to form a single more powerful classifier. It has its roots in computational learning theory, but most recent work has focussed on explaining its apparent, rather counterintuitive, ability to resist overfitting, by seeking to analyze its operation within a number of different frameworks.

I am interested in Boosting in general, and in particular in alternative explanations for the effectiveness of this procedure in its applied forms.
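As a small illustration of how weak learners can be combined, the following Python sketch implements AdaBoost, one well-known boosting algorithm, using threshold "stumps" on one-dimensional data. The choice of AdaBoost, the toy data set and the function names are assumptions made purely for illustration; the sketch says nothing about the overfitting behaviour discussed above.

```python
import math


def stump(x, t, s):
    """Weak learner: predict s when x >= t, and -s otherwise (s is +1 or -1)."""
    return s if x >= t else -s


def best_stump(xs, ys, w):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for t in sorted(set(xs)):
        for s in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w) if stump(x, t, s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best


def adaboost(xs, ys, rounds):
    """Run AdaBoost for a fixed number of rounds; labels must be +1 or -1."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, s = best_stump(xs, ys, w)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, s))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * y * stump(x, t, s))
             for x, y, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble


def predict(ensemble, x):
    """Classify x by the sign of the weighted vote of all stumps."""
    score = sum(a * stump(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data: no single stump can separate the middle block from both ends.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this data the best single stump misclassifies a third of the examples, whereas the weighted vote of three stumps fits the training set.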

1.4 Learning to prove

A long-standing and very successful research programme in this laboratory is that of automated theorem proving. I am working in collaboration with Larry Paulson and James Bridge on ways in which machine learning might be incorporated into such systems.

2 Research students

At present I supervise:

Past research students are:

3 Things that I've researched in the past but no longer pursue

3.1 Support Vector Machines for QSAR

How do we design good drugs?

Prior to around 1980 the approach was simple: a compound was designed specifically - by one or more clever chemists - to interact in a desired way with a target of interest. This process was, unsurprisingly, both expensive and time-consuming.

More recently the process has changed. Instead of designing a single compound of interest we search chemical space by elimination for compounds with desirable properties such as low toxicity or "drug-likeness". This can be cast as a classic, although very hard, supervised learning problem wherein compounds are described by attribute vectors and labelled with the presence or otherwise of the property of interest. Typically the resulting learning problem has highly non-symmetric priors (as most compounds are not interesting) and non-symmetric costs (as the cost of rejecting a compound as inactive when in fact it is a perfect cure for HIV is extremely high).
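The effect of non-symmetric costs can be sketched with the standard minimum-expected-cost decision rule. This is a generic illustration in Python, not the method used in the project described here: the function names and the example costs are hypothetical, and the probability `p_active` is assumed to be a posterior that already reflects the skewed priors.

```python
def decision_threshold(cost_false_positive, cost_false_negative):
    """Probability above which predicting "active" has lower expected cost.

    Predicting "inactive" costs p * cost_false_negative on average, and
    predicting "active" costs (1 - p) * cost_false_positive, so "active"
    is the cheaper choice when p exceeds C_FP / (C_FP + C_FN).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def classify(p_active, cost_false_positive, cost_false_negative):
    """Return True when the compound should be kept as a candidate."""
    return p_active > decision_threshold(cost_false_positive, cost_false_negative)


# Hypothetical costs: missing an active compound is 99 times worse than
# wasting effort on an inactive one.
print(decision_threshold(1, 99))
print(classify(0.05, 1, 99))   # kept under the asymmetric costs
print(classify(0.05, 1, 1))    # rejected when the costs are symmetric
```

With costs this lopsided, a compound with only a small estimated probability of being active is still worth keeping, which is why a plain 0.5 threshold is a poor fit for problems of this kind.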

Problems of this kind have traditionally been addressed using various statistical and other intelligent systems techniques. As part of the Rocket project we applied support vector machines to this problem, in collaboration with GlaxoSmithKline.

3.2 Quantum computation applied to machine learning

Quantum computation is a very intriguing concept indeed. And that's putting it mildly. We are at present living in very interesting times: although there is evidence that quantum phenomena might be harnessed to perform computations considered intractable (in the formal sense) for any standard (non-quantum) computer, we are not at present aware of exactly how the complexity of quantum algorithms relates to the non-quantum complexity classes. Also, it is not clear whether a large-scale quantum computer can be built, and the design of quantum algorithms is not as straightforward (?!) as the design of algorithms in the classical sense.

I am interested in the way in which the advantages potentially offered by quantum computers might be of benefit within the general field of machine learning.