Reading prior to PhD study




General books

There are some truly wonderful books that deal with machine learning and AI in general. For an introduction to a large part of the field at a reasonably easy level, try:

Machine Learning
Tom Mitchell
McGraw Hill 1997

For more depth try:

Pattern Classification
Second Edition
Richard O. Duda, Peter E. Hart and David G. Stork
Wiley-Interscience 2000

The Elements of Statistical Learning
T. Hastie, R. Tibshirani and J. H. Friedman
Springer-Verlag 2001


Neural Networks

There are many books available on neural networks. Of everything I've seen to date, the best two to take a look at are:

Neural Networks for Pattern Recognition
Christopher M. Bishop
Oxford University Press 1995

Neural Networks: A Comprehensive Foundation
Second Edition
Simon Haykin
Prentice Hall 1998

If your background is more towards physics and/or you want to see some further material with that kind of flavour, try:

Introduction to the Theory of Neural Computation
John Hertz, Anders Krogh and Richard G. Palmer
Perseus Books Group 1991


General AI

For general AI the best thing you can read is:

Artificial Intelligence: A Modern Approach
Second edition
Stuart Russell and Peter Norvig

This is long, and it will take you a while to get through, but its presentation and its general approach are outstanding.


Computational learning theory

There is an excellent web site at:

www.learningtheory.org

containing numerous resources. For a readable introduction to probably approximately correct (PAC) learning and related topics, try:

Computational Learning Theory
Martin Anthony and Norman Biggs
Cambridge University Press

and for more in-depth coverage of more recent and advanced material, try:

Neural Network Learning: Theoretical Foundations
Martin Anthony and Peter L. Bartlett
Cambridge University Press 1999

Another introductory book, with a different feel to it and covering some further areas, is:

An Introduction to Computational Learning Theory
Michael J. Kearns and Umesh V. Vazirani
The MIT Press 1994

An excellent and readable account of support vector machines is:

An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
Nello Cristianini and John Shawe-Taylor
Cambridge University Press 2000

For in-depth coverage of some more recent material, in particular connections between the learning-theoretic approach and the Bayesian approach, take a look at:

Learning Kernel Classifiers: Theory and Algorithms
Ralf Herbrich
The MIT Press 2002


Boosting

Again, there is a nice web site with many resources at:

www.boosting.org

A nice introduction (you should read something on general machine learning from the suggestions above first) is:

The Boosting Approach to Machine Learning: An Overview
Robert E. Schapire
MSRI Workshop on Nonlinear Estimation and Classification 2002

The following two papers propose explanations for how boosting works:

Additive Logistic Regression: a Statistical View of Boosting
J. Friedman, T. Hastie and R. Tibshirani
The Annals of Statistics
Volume 28, number 2, pages 337-374, 2000

Boosting the margin: A new explanation for the effectiveness of voting methods
Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee
The Annals of Statistics
Volume 26, number 5, pages 1651-1686, 1998


Bayesian inference

An excellent introduction to a very wide range of modern techniques can be found in:

Pattern Recognition and Machine Learning
Christopher M. Bishop
Springer, 2006

A good introduction to many of the issues can be found in:

Information Theory, Inference and Learning Algorithms
David J. C. MacKay
Cambridge University Press, 2003

although your best bet is to concentrate on the machine learning parts. The material on coding is very interesting, but less relevant to applying here for a machine learning PhD. A great book by a legend is:

Probability Theory: The Logic of Science
E. T. Jaynes and G. Larry Bretthorst
Cambridge University Press, 2003

For a good summary see Zoubin Ghahramani's talk at:

http://www.gatsby.ucl.ac.uk/~zoubin/ICML04-tutorial.html

An excellent introduction to Gaussian processes as applied to machine learning, now available online, is:

Gaussian Processes for Machine Learning
Carl E Rasmussen and Christopher K I Williams
MIT Press, 2006

More specific material on approximate integration can be found in:

An Introduction to MCMC for Machine Learning
C. Andrieu, N. de Freitas, A. Doucet and M. Jordan
Machine Learning, Volume 50, pages 5-43, 2003

On the same subject, see also the excellent review:

Probabilistic Inference Using Markov Chain Monte Carlo Methods
Radford Neal
Technical Report CRG-TR-93-1
Department of Computer Science
University of Toronto, 1993

which is available from:

http://www.cs.toronto.edu/~radford/publications.html