Bioinformatics

Overview

The first part of this course focuses on sequence data.

First we learn how to compare two sequences, or two subsequences within the same sequence, using alignment algorithms, and how to compare more than two sequences using progressive alignment.
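As a rough illustration of what a pairwise alignment algorithm computes, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). The function name and the scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen for the example, not values taken from the lectures.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of strings a and b (illustrative scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

With these assumed scores, aligning "ACGT" with "AGT" gives three matches and one gap. Recovering the alignment itself would additionally require a traceback through the table, which is omitted here.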

Searching a database for nearly exact matches (using the BLAST algorithm) is the most important routine in bioinformatics labs.

When we have a group of sequences, we can build a tree to study their relationships, using parsimony or distance algorithms.

We can deal with different trees by understanding how to modify a topology and how to derive the consensus topology.

We use hidden Markov models to infer properties such as the exon/intron arrangement of a gene or the 2D and 3D structure of a protein.
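To make the idea of inferring hidden labels concrete, the sketch below runs the Viterbi algorithm on a toy two-state model, where "E" stands for exon and "I" for intron. All names and probabilities (STATES, START, TRANS, EMIT) are made up for illustration and are not a real gene model. The code assumes a non-empty sequence and strictly positive probabilities.

```python
import math

# Hypothetical toy model: GC-rich "exon" state, AT-rich "intron" state.
STATES = ["E", "I"]
START = {"E": 0.5, "I": 0.5}
TRANS = {"E": {"E": 0.9, "I": 0.1}, "I": {"E": 0.1, "I": 0.9}}
EMIT = {
    "E": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for obs as a string of state names."""
    # best[t][s]: log-probability of the best path that ends in state s at t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return "".join(reversed(path))


print(viterbi("GCGATA", STATES, START, TRANS, EMIT))
```

Working in log-probabilities avoids numerical underflow on long sequences, which is the usual reason for this design choice.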

The second part of the course is about clustering microarray (gene expression) data using K-means or the Markov clustering algorithm; we can then reconstruct genetic networks (Wagner algorithm).
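The clustering step can be sketched with a small K-means implementation (Lloyd's iteration). This is a minimal illustration under assumptions: each point is a tuple of expression values for one gene, the initial centroids are supplied by the caller, and the function name and data are hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Cluster points (tuples of floats); return (labels, final centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        labels = []
        for p in points:
            dists = [sum((x - c) ** 2 for x, c in zip(p, cen))
                     for cen in centroids]
            j = dists.index(min(dists))
            labels.append(j)
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else cen
               for cl, cen in zip(clusters, centroids)]
        if new == centroids:   # converged: no centroid moved
            break
        centroids = new
    return labels, centroids


# Hypothetical expression profiles: four genes measured in two conditions.
genes = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(genes, centroids=[genes[0], genes[2]])
print(labels, centers)
```

The result depends on the initial centroids, so in practice the algorithm is often restarted from several initializations and the best clustering is kept.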

Finally, a network of biochemical reactions can be simulated using the Gillespie algorithm. Key web examples are given at the end of each lecture.
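For the Gillespie algorithm, a minimal sketch of the direct method is given here. The reaction system (a reversible conversion between species A and B), the rate constants, and all function and variable names are assumptions made for the example.

```python
import random


def gillespie(state, reactions, t_end, rng):
    """Simulate until t_end; return a list of (time, state) snapshots.

    state: dict mapping species name to molecule count.
    reactions: list of (propensity_function, change_dict) pairs.
    """
    t = 0.0
    state = dict(state)
    trajectory = [(t, dict(state))]
    while True:
        props = [propensity(state) for propensity, _ in reactions]
        total = sum(props)
        if total <= 0:
            break                      # no reaction can fire any more
        t += rng.expovariate(total)    # exponential waiting time
        if t > t_end:
            break
        # Choose one reaction with probability proportional to its propensity.
        _, change = rng.choices(reactions, weights=props)[0]
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical system: A -> B at rate 1.0 per molecule, B -> A at 0.5.
REACTIONS = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 0.5 * s["B"], {"A": +1, "B": -1}),
]
run = gillespie({"A": 20, "B": 0}, REACTIONS, t_end=5.0, rng=random.Random(0))
print(len(run), run[-1])
```

Each run is one random trajectory, so averages over many runs (with different seeds) are normally needed before comparing with a deterministic rate-equation model.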

Lecture notes (slides)

PDF

Additional examples