Computer Science Syllabus - Bioinformatics

Bioinformatics


Lecturer: Dr P. Liò

No. of lectures and examples classes: 12 + 3


The immense growth of biological information stored in databases has created a critical need for people who understand the languages and techniques of both Computer Science and Biology. The first two lectures will focus on aspects of Molecular Biology that are relevant to the course; further biological insights will be given as examples and applications in later lectures. The course will discuss algorithms for some important computational problems in Molecular Biology that are pertinent to Biotechnology and to the so-called "post-genome era". We shall focus on Hidden Markov models and Phylogenetic inference algorithms for Comparative Genomics. Students will learn how to describe biological processes using Pi calculus and will be encouraged to think in two directions: how can Computer Science be useful to Biology, and how can Biology be useful to Computer Science? To seed and stimulate this approach, part of the course will be based on biological networks, which offer many opportunities for comparing biological and electronic systems.

Lectures


  • Introduction to genomics. Basic concepts of molecular biology and genomics. The human genome.

  • Introduction to biological networks. Molecular biology of communications within and between cells. Universality of network topologies. Cancer.

  • Sequence alignment. Dynamic programming. Global versus local. Scoring matrices. The BLAST family of programs. Significance of alignments.

  • Hidden Markov Models in Bioinformatics. Definition and applications in Bioinformatics. Examples of the Viterbi, the Forward and the Backward algorithms. Parameter estimation for HMMs. [2 lectures]

  • Trees. The Phylogeny problem. Distance methods, parsimony, bootstrap. Stationary Markov processes. Rate matrices. Maximum likelihood. Felsenstein's post-order traversal. [2 lectures]

  • Multiple sequence alignment. Aligning more than two sequences. Genome alignment. Structure-based alignment.

  • Finding regulatory elements. Finding regulatory elements in aligned and unaligned sequences. Gibbs sampling.

  • Introduction to microarray data analysis. Steady state and time series microarray data. From microarray data to biological networks. Identifying regulatory elements using microarray data. [2 lectures]

  • Pi calculus. Description of biological networks; stochastic Pi calculus, Gillespie algorithm.
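
As an illustration of the Gillespie algorithm named in the Pi calculus lecture above, the following is a minimal Python sketch of the direct method for stochastic simulation. It is not taken from the course materials: the function name gillespie, the representation of reactions as (propensity function, state change) pairs, and the decay example with rate constant 0.5 are all assumptions made for illustration.

```python
import random


def gillespie(state, reactions, t_end, rng=None):
    """Simulate one trajectory with the Gillespie direct method.

    state     -- dict mapping species name to molecule count (not modified)
    reactions -- list of (propensity_fn, change) pairs, where
                 propensity_fn(state) returns a non-negative float and
                 change maps species name to an integer count change
    t_end     -- simulation stops once the next event would occur after t_end
    rng       -- optional random.Random instance, for reproducibility

    Returns a list of (time, state_snapshot) pairs, starting at time 0.
    """
    if rng is None:
        rng = random.Random()
    t = 0.0
    state = dict(state)  # work on a copy
    trajectory = [(t, dict(state))]
    while True:
        propensities = [propensity(state) for propensity, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional
        # to its propensity.
        index = rng.choices(range(len(reactions)), weights=propensities)[0]
        for species, delta in reactions[index][1].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical example: first-order decay A -> (nothing),
# with propensity 0.5 * (number of A molecules).
decay = [(lambda s: 0.5 * s["A"], {"A": -1})]
trajectory = gillespie({"A": 10}, decay, t_end=1e9, rng=random.Random(0))
print(len(trajectory) - 1, "reaction events simulated")
```

Each step draws an exponentially distributed waiting time whose rate is the sum of all propensities, then selects one reaction with probability proportional to its own propensity. Passing a seeded random.Random instance makes a run repeatable.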

Examples classes

  • Databases and genome browsers. Databases for sequences and gene expression, genome browsers. Bioinformatics tools for Java.

  • Phylogenetic inference. Phylogeny and HMM. The HIV virus phylogeny.

  • Biological networks. Analysis of cell cycle or cancer related microarray data and network models.


At the end of this course students should

  • understand the terminology of Bioinformatics and be able to use it with precision

  • understand and be able to use databases, and apply advanced data analysis techniques in appropriate situations

  • be able to describe biological processes using Pi calculus

Recommended reading

* Durbin, R., Eddy, S., Krogh, A. & Mitchison, G. (1998). Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press.
Felsenstein, J. (2003). Inferring phylogenies. Sinauer Associates.
Fall, C., Marland, E., Wagner, J. & Tyson, J. (2002). Computational cell biology. Springer-Verlag Telos (1st ed.).
Bower, J.M. & Bolouri, H. (2001). Computational modeling of genetic and biochemical networks. MIT Press.

Christine Northeast
Sun Sep 11 15:46:50 BST 2005