MPhil projects

Contents

Programming style exploration
Computing with multi-omics condition-specific Turing machines
Goggling the nucleus
Multi layer and Deep Mining of comorbidities

Programming style exploration

Proposer: Max Conway
Supervisors: Pietro Lio' and Max Conway

Tools such as linters and style checkers can check code against manually curated, strict style rules, for example spacing conventions. When a human reads code to assess its quality, however, they also look for subtler, more context-specific cues, such as function size and the descriptiveness of variable names.

This project uses a machine learning approach to explore these properties of code. The starting point would be to extract a set of simple features, explore their relationships, and try to predict metadata such as author and date of writing. The project could then either generalize to more languages or develop more in-depth features and more specific hints and advice.

My suggested approach would be to use R both as the analysis tool and as the dataset, since R source code is easily accessible and R is well suited to statistics, but there are many reasonable approaches.
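
As an illustration of the feature-extraction starting point, here is a minimal sketch. It assumes Python source and Python's standard `ast` module purely for convenience (the proposal itself suggests R), and the function name `style_features`, the feature names and the "short name" threshold of two characters are all hypothetical choices, not part of the project description.

```python
import ast
from statistics import mean


def style_features(source: str) -> dict:
    """Extract a few simple, illustrative style features from Python source."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))

    # Function size: number of source lines spanned by each definition.
    functions = [n for n in nodes
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    sizes = [f.end_lineno - f.lineno + 1 for f in functions]

    # Identifier descriptiveness: names that are assigned to, plus parameters.
    names = {n.id for n in nodes
             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
    names |= {n.arg for n in nodes if isinstance(n, ast.arg)}
    lengths = [len(name) for name in names]

    return {
        "n_functions": len(functions),
        "mean_function_lines": mean(sizes) if sizes else 0.0,
        "mean_name_length": mean(lengths) if lengths else 0.0,
        # Assumption: names of one or two characters count as "short".
        "short_name_fraction": (
            sum(1 for n in lengths if n <= 2) / len(lengths) if lengths else 0.0
        ),
    }


EXAMPLE = '''
def f(x, y):
    t = x + y
    return t

def total_price(unit_price, quantity):
    subtotal = unit_price * quantity
    return subtotal
'''

features = style_features(EXAMPLE)
```

Feature dictionaries like this one, computed per file or per commit, could then be fed to any standard classifier or regression model to predict the metadata.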

Computing with multi-omics condition-specific Turing machines

Proposers: Pietro Lio', Claudio Angione
Supervisor: Pietro Lio'

If Turing were a first-year graduate student interested in computers today, he would probably migrate to the field of computational biology. In his own research, he presented a mathematical and computational model of morphogenesis, in which chemical substances react together. Moreover, a protein can be thought of as a computational element, i.e. a processing unit able to transform an input signal into an output signal. Thus, in a biochemical pathway, an enzyme reads the amount of reactants (substrates) and converts them into products.

In this work, we consider the biochemical pathway in unicellular organisms (e.g. bacteria) as a living computer that we can program in order to obtain desired outputs. The genome sequence is thought of as executable code, specified by a set of commands in a sort of ad hoc low-level programming language. Each combination of genes is coded as a string of bits, each of which represents a gene set. By turning on a gene set, we turn on the chemical reaction associated with it. Starting from a recently published work on metabolic machines, we would like to compute using additional layers of biological information, such as methylation and protein-protein interactions, and to investigate the feasibility of such a biological computer when these different layers of information are taken into account.
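
The bit-string encoding described above can be sketched as a toy program. This is only an illustration under strong assumptions: the metabolites `A` to `D`, the `REACTIONS` table and the function `run_pathway` are hypothetical, and the model is a simple Boolean reachability calculation rather than the genome-scale metabolic models used in the referenced work.

```python
# Hypothetical toy pathway: bit i of the program switches on REACTIONS[i].
REACTIONS = [
    ("A", "B"),  # gene set 0: converts substrate A into product B
    ("B", "C"),  # gene set 1: converts B into C
    ("B", "D"),  # gene set 2: converts B into D
]


def run_pathway(program: str, inputs: set) -> set:
    """Return every metabolite obtainable from `inputs` under `program`.

    `program` is a string of '0'/'1' characters, one per gene set.
    """
    if len(program) != len(REACTIONS) or set(program) - {"0", "1"}:
        raise ValueError("program must have one 0/1 bit per gene set")

    # Keep only the reactions whose gene set is turned on.
    active = [r for bit, r in zip(program, REACTIONS) if bit == "1"]

    # Repeatedly fire active reactions until no new product appears.
    available = set(inputs)
    changed = True
    while changed:
        changed = False
        for substrate, product in active:
            if substrate in available and product not in available:
                available.add(product)
                changed = True
    return available


outputs = run_pathway("110", {"A"})
```

With the program `110` the pathway turns `A` into `B` and then `C`; switching the first bit off (`011`) leaves the input untouched, because nothing produces `B`.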

References:

Nam H, Campodonico M, Bordbar A, Hyduke DR, Kim S, et al. (2014) A Systems Approach to Predict Oncometabolites via Context-Specific Genome-Scale Metabolic Networks. PLoS Comput Biol 10(9): e1003837. doi:10.1371/journal.pcbi.1003837
Patil et al. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6(1):308, 2005.

Goggling the nucleus

Proposers: Yoli Shavit and Pietro Lio'
Supervisor: Pietro Lio'

Accumulating data from local and genome-wide chromosome conformation capture experiments, as well as from imaging, are changing the way we understand nuclear organization. We have previously developed methods and tools for the visualization [1,3], integration [3], correction [2], comparison [1,2], calibration [4] and reconstruction [4] of these data. Creating a unified framework for this geo-nuclear map, integrating spatial, genetic and epigenetic properties, will allow researchers, for the first time, to 'Google' the nucleus.

References:

[1] Shavit Y and Lio' P (2013) CytoHiC: a cytoscape plugin for visual comparison of Hi-C networks. Bioinformatics, 29, 1206-7. http://www.ncbi.nlm.nih.gov/pubmed/23508968
[2] Shavit Y and Lio' P (2014) Combining a wavelet change point and the Bayes factor for analysing chromosomal interaction data. Mol Biosyst, 10, 1576-85.
[3] Merelli I, Liò P, Milanesi L (2013) NuChart: an R package to study gene spatial neighbourhoods with multi-omics annotations. PLoS One, 19, e75146.
[4] Shavit Y, Hamey FK, Lio' P (2014) FisHiCal: an R package for iterative FISH-based calibration of Hi-C data. Bioinformatics [Epub ahead of print].

Multi layer and Deep Mining of comorbidities

Proposer: Pietro Lio'
Supervisor: Pietro Lio'
Special Resources: None

The aim of this project is to use newly developed machine learning techniques to diagnose comorbidities from clinical (image), multi-omic and literature data. The project is flexible and could use, for example, support vector machines (SVMs) and decision trees. Please contact the supervisor for more details.

References:

Capobianco E, Liò P (2015). Comorbidity networks: beyond disease correlations. Journal of Complex Networks. doi: 10.1093/comnet/cnu048
Moni MA, Liò P (2015). How to build personalised multi-omics comorbidity profiles. Frontiers in Cell and Developmental Biology, section Systems Biology. doi: 10.3389/fcell.2015.00028
Bartocci E, Lio' P. Computational modeling, formal analysis and tools for systems biology. PLoS Computational Biology (in press).
Veličković P, Lio' P (2015). Molecular multiplex network inference. Journal of Complex Networks (in press).
Moni MA, Xu H, Liò P (2015). Network regularised Cox regression and multiplex network models to predict disease comorbidities and survival of cancer. Computational Biology and Chemistry (in press).

A graph based cellular automaton to study stem cells

The application of graph models has proven very useful for understanding the behavior of complex systems. Examples include self-organization in biology, biological competition (game theory), traffic flow and social interactions, as well as the cellular automata made famous by Conway's Game of Life [1]. Traditionally, cellular automata have been applied on regular grids, which can be a crude approximation of the actual cellular organization in biology or of irregular connections in the real world. In fact, real-world connections are often better suited to graph-based representations [2, 3]. This project aims to develop a graph-based model of stem cell dynamics in the intestine. The method developed will be compared with existing intestine cancer models and tested with data available in the literature [4]. A possible extension would be to implement the graph-based cellular automaton on a GPU. Ultimately, this type of automaton would contribute to understanding stem cell dynamics in the intestine and in cancer, as well as other real-world problems.
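
To make the idea of a cellular automaton on an irregular graph concrete, here is a minimal sketch. The function `step`, the adjacency-dictionary representation, the five-node ring and the birth/survival rule are all hypothetical illustrations; they are not the stem cell model the project would develop.

```python
def step(graph, alive, birth=frozenset({1}), survive=frozenset()):
    """Apply one synchronous update of a two-state automaton on a graph.

    graph   -- dict mapping each node to an iterable of its neighbours
    alive   -- set of nodes that are currently alive
    birth   -- alive-neighbour counts at which a dead node becomes alive
    survive -- alive-neighbour counts at which a live node stays alive
    """
    next_alive = set()
    for node, neighbours in graph.items():
        # Count live neighbours using only the old state (synchronous update).
        count = sum(1 for v in neighbours if v in alive)
        rule = survive if node in alive else birth
        if count in rule:
            next_alive.add(node)
    return next_alive


# Hypothetical example: a ring of five cells, each linked to two neighbours.
RING = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}

first = step(RING, {0})
second = step(RING, first)
```

Because the rule only looks at each node's neighbour list, the same function works on any graph, regular or not. Conway's Game of Life corresponds to `birth={3}` and `survive={2, 3}` on a grid graph in which every cell has eight neighbours.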

Computer Laboratory
