Hi 👋 I’m Dimitris Spathis

Doctoral Researcher
University of Cambridge
         


I am a doctoral researcher in Computer Science at the University of Cambridge, supervised by Prof. Cecilia Mascolo. My work as a whole enables machine learning models to learn richer semantics of high-dimensional and complex data (mobile sensors, time-series, audio, images, text or other modalities). I am thankful to be supported by Jesus College Cambridge, the EPSRC, and the ERC.

My research is driven by doing more with less information. The most prominent bottleneck of deep learning today is access to labeled datasets, carefully curated for niche tasks. To this end, I work on data-efficient models that learn generalizable and personalized representations by leveraging the following fundamental paradigms:

  • Self-supervision. I haven’t seen any parents teaching physics to an infant. Instead, infants learn gravity purely through observation and interaction. Similarly, we arrange the input data so that part of it appears to be missing, and then train a model to fill in the gaps. But which is the best arrangement?
  • Multi-tasking. Humans perform thousands of tasks naturally, but our models are still painfully single-purpose. Bonus: with competing objectives you may discover some surprising properties of the tasks themselves.
  • Transfer learning. We never start learning something new from scratch; we build upon basic abstractions instead. Our models can learn to re-use and fine-tune, but what are their limits?

Data-driven models of human behavior encode many complexities of the real world, and hence I spend an inordinate amount of time thinking about sparsity, irregular sampling, long-term dependencies, noise, multi-modality, and long tails. Beyond theory, I collaborate closely with world-class experts from health and the social sciences to apply robust concepts from data science and accelerate scientific discovery.

During my studies, I was fortunate to work in R&D teams across diverse industries, including multinational telcos (Telefonica Research), internet startups (Qustodio), retail tech companies (Ocado), and research labs. Further, our ongoing research in audio AI (covid-19-sounds.org) has drawn international attention (covered by BBC, The Guardian, Forbes, The Times, El País).

CV

Publications


2021

Self-supervised transfer learning of physiological representations from free-living wearable data
Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
ACM Conference on Health, Inference, and Learning (CHIL), Virtual Event, USA

Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data
Jing Han, Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Acoustics, Speech, & Signal Processing (ICASSP), Toronto, Canada

SelfHAR: Improving Human Activity Recognition through Self-training with Unlabeled Data
Chi Ian Tang, Ignacio Perez-Pozuelo*, Dimitris Spathis*, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Proc. on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT/Ubicomp), 5(1)

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates
Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp
Conference of the International Speech Communication Association (Interspeech), Brno, Czechia

Digital Phenotyping and Sensitive Health Data: Implications for Data Governance
Ignacio Perez-Pozuelo, Dimitris Spathis, Jordan Gifford-Moore, Jessica Morley, Josh Cowls
Journal of the American Medical Informatics Association

Anticipatory Detection of Compulsive Body-focused Repetitive Behaviors with Wearables
Benjamin Searle, Dimitris Spathis, Marios Constantinides, Daniele Quercia, Cecilia Mascolo
ACM International Conference on Mobile Human-Computer Interaction (MobileHCI), Toulouse, France

Wearables, smartphones and artificial intelligence for digital phenotyping and health
Ignacio Perez-Pozuelo, Dimitris Spathis, Emma Clifton, Cecilia Mascolo
Digital Health, Chapter 3

Universals and variations in musical preferences: A study of preferential reactions to Western music in 350,000 people across 53 countries
David M. Greenberg, Sebastian Wride, Daniel Snowden, Dimitris Spathis, Jeff Potter, Jason Rentfrow
Journal of Personality and Social Psychology, to appear

2020

Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data
Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Jing Han*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Knowledge Discovery and Data Mining (KDD), San Diego, USA

Learning Generalizable Physiological Representations from Large-scale Wearable Data
Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS 2020), Vancouver, Canada

Exploring Contrastive Learning in Human Activity Recognition for Healthcare
Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS 2020), Vancouver, Canada

2019

Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data
Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Knowledge Discovery and Data Mining (KDD), Anchorage, USA

Passive mobile sensing and psychological traits for large scale mood prediction
Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth), Trento, Italy

Interactive dimensionality reduction using similarity projections
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
Knowledge-Based Systems, 165: 77-91

2018

Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
ECCV Efficient Feature Representation Learning workshop (CEFRL @ ECCV 2018), Munich, Germany

2017

Diagnosing Asthma and Chronic Obstructive Pulmonary Disease with Machine Learning
Dimitris Spathis, Panayiotis Vlamos
Health Informatics Journal, 25(3): 811–827 (issue published in 2019)

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Joan Serra, Ilias Leontiadis, Dimitris Spathis, Gianluca Stringhini, Jeremy Blackburn, Athena Vakali
ACL Abusive Language Online workshop (ALW @ ACL 2017), Vancouver, Canada

2016

A comparison between semi-supervised and supervised text mining techniques on detecting irony in greek political tweets
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
Engineering Applications of Artificial Intelligence, 51: 50-57

2015

Detecting Irony on Greek Political Tweets: A Text Mining Approach
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
International Conference on Engineering Applications of Neural Networks, Rhodes, Greece

2014

Glocal News: An Attempt to Visualize the Discovery of Localized Top Local News, Globally
Dimitris Spathis, Theofilos Mouratidis, Spyros Sioutas, Athanasios Tsakalidis
International Conference on Conceptual Modeling, Hong Kong, China

Preprints

Improving the Definition of Depressed Mood with Digital Phenotyping
Maxime Taquet, Dimitris Spathis, Jason Rentfrow, Cecilia Mascolo, Guy M Goodwin
SSRN preprint, 3725630, 2020

Detecting sleep in free-living conditions without sleep-diaries: a device-agnostic, wearable heart rate sensing approach
Ignacio Perez-Pozuelo, Marius Posa, Dimitris Spathis, Kate Westgate, Nicholas Wareham, Cecilia Mascolo, Soren Brage, Joao Palotti
medRxiv preprint, 2020

Photo-Quality Evaluation based on Computational Aesthetics: Review of Feature Extraction Techniques
Dimitris Spathis
arXiv preprint, 1612.06259, 2016

Service


I take great joy in participating in the academic community, contributing to and learning from peer review. I serve the following international conferences and journals:

Program Committee (PC): AAAI 2021, IJCAI 2020, KDD 2020 & 2021, Sensiblend @ Ubicomp 2021

Organizer: Tutorial on federated learning for mobile sensing at MobiCom 2021 (co-organized with Stefanos Laskaridis & Mario Almeida at Samsung AI)

Reviewer: Nature Scientific Reports, Nature Digital Medicine, NeurIPS, ICLR, ICML, AAAI, IJCAI, KDD, CHI, Ubicomp/IMWUT, ICASSP, Expert Systems with Applications, Neurocomputing, WWW/The Web Conference, Engineering Applications of Artificial Intelligence, ICWSM, ICPR, and more.

Talks


Learning Generalizable Physiological Representations from Large-scale Wearable Data
AI to model Human Behaviour and Health
Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data
  • 📍 KDD'19, Anchorage, USA — August 4, 2019
Passive mobile sensing and psychological traits for large scale mood prediction
Deep sequence learning for large-scale inference of human behaviour from mobile sensor data
  • 📍 MRC Epidemiology Unit, University of Cambridge, UK — March 5, 2019
  • 📍 Ocado, Barcelona, Spain / Hatfield, UK (remote) — July 10, 2019
Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
  • 📍 Facebook, PhD Open House (poster), London, UK — October 25, 2018
Deep Sequence Learning on Mobile Data for Mood Prediction
  • 📍 MobiUK Symposium, Cambridge, UK — September 13, 2018
Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Detecting Irony on Greek Political Tweets: A Text Mining Approach
  • 📍 EANN'15, Rhodes, Greece — September 25, 2015

Mentoring


I find working with ambitious students on research projects incredibly stimulating. Some recent graduate and bachelor theses:

  • Chi Ian Tang — Self-supervised learning in timeseries (papers at IMWUT & NeurIPS-W)
  • Benjamin Searle — Predicting compulsive behaviours with watches (paper at MobileHCI)
  • Kevalee Shah — Contrastive learning in activity/clinical timeseries (ongoing)
  • Chuen Low — Pure attention models for medical timeseries (ongoing)

I have also been a teaching assistant for the following undergraduate courses:

Playground


“The next big thing in technology often starts off looking like a toy”

Quantifying name-dropping

Communitypoprefs.com is a data visualization website that presents every pop-culture reference across 5 seasons of the TV series Community.


Map out your music taste on Spotify

Visualizing my favourite songs on Spotify with dimensionality reduction and anomaly detection. Data essay published in Cuepoint Magazine, Medium's premier music publication.


Children's books and childish language?

Text mining Game of Thrones, Harry Potter, Hunger Games and Lord of the Rings books. Data essay featured in Medium's Editor Picks.


Anonymize kids' faces before posting online

Mobile app with face recognition, age estimation, & emotion recognition to blur kids or replace their face with emotion-based emoji. Developed during HackZurich 2018.


Discover top local news globally

Glocalne.ws was a mashup of Google News and Google Maps. Unfortunately, it is now defunct following API discontinuation.


Composing music and text with Recurrent Neural Networks

Training neural networks on massive amounts of musical notation and literature and letting them create their own art. The essay is in Greek, but you can still see and listen to the results.


Personal


Non-academic things about me: I love music, both playing and listening. I am mostly into art rock and indie folk, with the occasional exception of some well-crafted pop. Although I am an accordionist by training, over the last few years I've been playing mostly piano and ukulele. In a previous life, I performed in the critically acclaimed band The Children of the Oldness (aka Kore Ydro). You can listen to the album "Consortium in Amato" here.

I also enjoy street photography, and in particular playing with light: photography comes from the Greek φως (light) and γραφή (writing), or drawing with light. A subset of my pictures is on Flickr, and I had my 15" of fame when one of my landscapes was featured in the Huffington Post.

Lastly, and perhaps most importantly, I'm always on the lookout for ways to move items from the "non-academic list" to the "academic list"—let me know if you'd like to help!