# Thomas Brouwer

I am a PhD student under Pietro Lio' in machine learning and bioinformatics at the Computer Laboratory, University of Cambridge, where I also obtained my BA in Computer Science in 2014.

My research is focused on developing Bayesian probabilistic models for analysing and integrating biological datasets. In particular, I study and develop Bayesian matrix factorisation methods. My applications are mainly focused on drug sensitivity prediction and gene expression datasets.

## News

- Paper titled "Bayesian Hybrid Matrix Factorisation for Data Integration" accepted to AISTATS 2017!
- Presented a poster at the NIPS Workshop on Advances in Approximate Bayesian Inference, titled "Fast Bayesian Nonnegative Matrix Factorisation and Tri-Factorisation" (9 December 2016).
- Visited Professor Samuel Kaski's group at Aalto University, Helsinki, and gave a talk "Hybrid matrix factorisation" (5-12 November 2016).
- Visited Professor Samuel Kaski's group at Aalto University, Helsinki, and gave a talk "Bayesian data integration by multiple matrix tri-factorisation" (11-18 May 2016).
- Presented at the 7th Workshop on Complex Networks, CompleNet 2016, a talk titled "FactorNet: network analysis for biclusters identification" (23-25 March 2016).
- Gave a lecture at Cambridge University, as part of the Research Students Lectures series (17 November 2015).
- Passed the first-year viva of my PhD! (10 August 2015).

## Research

My research is focused on developing Bayesian probabilistic models for drug development. I am focusing on three application areas:

- **Drug sensitivity prediction** - predicting how sensitive different (cancer) cell lines are to different drugs, given other sensitivity values, and features about the drugs (chemical structure, primary targets) and cell lines (gene expression profile, copy number variations, mutation data).
- **Drug synergy** - predicting whether two drugs work better together than by themselves, again using drug and cell line features.
- **Drug repositioning** - given information about which drugs work well on a number of diseases, try to infer which other drug-disease associations might work well. This is especially useful for rare diseases, without any good treatment options or extensive research to find them.

To achieve this, I focus on models that use probability distributions to describe the problems in terms of random variables, and use **Bayesian inference** to infer the distributions of these random variables after observing a dataset. Bayesian methods tend to be more robust to noise and overfitting than non-probabilistic approaches. I am currently specialising in **matrix factorisation** methods, which can be used to predict missing values in datasets, and extending them to incorporate other datasets such as features and similarity information between drugs and cell lines. I am also interested in how the choice of inference method and Bayesian priors affects predictive performance.
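As a rough illustration of how matrix factorisation can predict missing values, here is a minimal sketch. It is not the Bayesian model from my papers: it uses plain alternating least squares with a small ridge penalty, fitted only on the observed entries. The function name `masked_als`, its parameters, and the toy data are all made up for this example.

```python
import numpy as np

def masked_als(R, M, K=1, lam=1e-3, iterations=100, seed=0):
    """Approximate R by U @ V.T, fitting only entries where M == 1.

    Hypothetical helper for illustration: alternating ridge regressions,
    not a Bayesian inference method.
    """
    rng = np.random.default_rng(seed)
    I, J = R.shape
    # Positive random initialisation of the two factor matrices.
    U = rng.uniform(0.1, 1.0, size=(I, K))
    V = rng.uniform(0.1, 1.0, size=(J, K))
    for _ in range(iterations):
        # Update each row factor using only that row's observed columns.
        for i in range(I):
            obs = M[i] == 1
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + lam * np.eye(K), Vo.T @ R[i, obs])
        # Update each column factor using only that column's observed rows.
        for j in range(J):
            obs = M[:, j] == 1
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + lam * np.eye(K), Uo.T @ R[obs, j])
    return U, V

# Toy rank-1 "drug sensitivity" matrix (rows: cell lines, columns: drugs).
R = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
M = np.ones_like(R)
M[3, 2] = 0  # pretend this entry was never measured

U, V = masked_als(R, M, K=1)
prediction = U @ V.T
print("Predicted value for the hidden entry:", prediction[3, 2])
print("True value for the hidden entry:", R[3, 2])
```

The mask `M` is what lets the factorisation ignore unobserved entries during fitting. The product `U @ V.T` then fills them in. A Bayesian treatment would place priors on `U` and `V` and infer posterior distributions instead of point estimates.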

## Presentations and notes

##### Papers

- **Bayesian Hybrid Matrix Factorisation for Data Integration** - arXiv, paper, supplementary, AISTATS 2017.
- **Fast Bayesian Nonnegative Matrix Factorisation and Tri-Factorisation** - arXiv, paper, supplementary, NIPS 2016 Workshop on Advances in Approximate Bayesian Inference.

##### Conferences

- **20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017)** - paper and poster titled *Bayesian Hybrid Matrix Factorisation for Data Integration* - 21 April 2017.
- **NIPS 2016 Workshop on Advances in Approximate Bayesian Inference** - poster titled *Fast Bayesian Nonnegative Matrix Factorisation and Tri-Factorisation* - 9 December 2016.
- **CompleNet 2016: 7th Workshop on Complex Networks** - 15-minute presentation titled *FactorNet: network analysis for biclusters identification* - 23 to 25 March 2016.
- **Big Data in Medicine: Exemplars and Opportunities in Data Science** - poster titled *Identifying effective drugs for cancer* - 19 June 2015.

##### Summer schools

- **Microsoft PhD Summer School** - poster titled *Probabilistic models for improving drug development* - 29 June to 3 July 2015
- **Kyoto Machine Learning Summer School** - poster and spotlight presentation titled *Bayesian non-negative matrix tri-factorisation* - 23 August to 4 September 2015

##### Other talks

- **Hybrid matrix factorisation** - 7 November 2016 - slides - 30-minute talk in Professor Samuel Kaski's group, Aalto University, Helsinki, Finland
- **Bayesian data integration by multiple matrix tri-factorisation** - 12 May 2016 - slides - 30-minute talk in Professor Samuel Kaski's group, Aalto University, Helsinki, Finland
- **Matrix factorisation and extensions** - 4 February 2016 - slides - 30-minute talk for Bioinformatics group
- **Introduction to Bayesian inference** - 17 November 2015 - slides - 1-hour lecture for Research Students Lectures series

##### Notes

- **Probabilistic non-negative matrix factorisation and extensions** - literature review (unfinished!)

## Code (GitHub)

All my Python code is publicly available on my **GitHub account**.

- **Bayesian hybrid matrix factorisation for data integration** - implementations for Bayesian hybrid matrix factorisation model; inference using Gibbs sampling.
- **Fast Bayesian non-negative matrix factorisation and tri-factorisation** - implementations for non-probabilistic inference, variational inference, Gibbs sampling, and iterated conditional modes.
- **K-means clustering with missing values** - for datasets with missing values.