Computer Laboratory – Course material 2007

Bioinformatics

Lecturer: Dr P. Liò

No. of lectures and examples classes: 12 + 2

Aims

The immense growth of biological information stored in databases has led to a critical need for people who understand the languages and techniques of both Computer Science and Biology. The first two lectures will focus on aspects of Molecular Biology that are relevant to the course; other biological insights will be given as examples and applications in later lectures. This course will discuss algorithms for some important computational problems in Molecular Biology that are pertinent to Biotechnology and to the so-called "Post-genome era". We shall focus on Hidden Markov Models and phylogenetic inference algorithms for Comparative Genomics. Students will learn how to describe biological processes using Pi calculus and will be encouraged to think in two ways: how can Computer Science be useful to Biology, and how can Biology be useful to Computer Science? To seed and stimulate this approach, part of the course will be based on biological networks, which offer many opportunities for comparing biological and electronic systems.

Lectures

DNA and protein sequence alignment. Biology and information content of DNA and protein sequences. Why alignment is important. Dynamic programming. Global versus local alignment. Scoring matrices. Four Russians approach. [2 lectures]
Biological data mining. Biological context. Algorithms related to Fasta, BLAST and its family, PatternHunter. Significance of alignments. [1 lecture]
Hidden Markov Models in Bioinformatics. Biological context. Examples of the Viterbi, the Forward and the Backward algorithms. Parameter estimation for HMMs. [3 lectures]
Sequence comparison and evolution. Biological context. Algorithms related to Distance, Parsimony and Likelihood methods; Bootstrap. Models of DNA and protein evolution. [2 lectures]
Introduction to microarray data analysis. Biological context. Steady state and time series microarray data. Clustering methods. From microarray to networks. [2 lectures]
Introduction to biological networks. Biological context. Basic network topologies. Finding regulatory elements in aligned and unaligned sequences. Gillespie algorithm. Concept of Systems Biology. [2 lectures]

Objectives

At the end of this course students should

understand the terminology of Bioinformatics, be aware of the most common types of biomolecular data and their formats, and be able to interface with biologists and physicians;
understand important algorithms and data analysis techniques and be able to code them.

Recommended reading

* Durbin, R., Eddy, S., Krogh, A. & Mitchison, G. (1998). Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press.
Felsenstein, J. (2003). Inferring phylogenies. Sinauer Associates.
Fall, C., Marland, E., Wagner, J. & Tyson, J. (2002). Computational cell biology. Springer-Verlag Telos (1st ed.).
Bower, J.M. & Bolouri, H. (2001). Computational modeling of genetic and biochemical networks. MIT Press.
