Hi 👋 I’m Dimitris Spathis

AI Researcher
University of Cambridge
         


I recently passed my PhD defense at the University of Cambridge, where I was supervised by Prof. Cecilia Mascolo. I am now a research intern at Microsoft Research working on machine learning for health.

My research as a whole enables AI models to learn richer semantics of high-dimensional real-world data (time-series, mobile sensors, audio, or other modalities). To this end, I work on data-efficient models that learn meaningful representations by leveraging the following paradigms:

  • Self-supervision. Contrastive or generative? What comes after data augmentations?
  • Transfer learning. How can models generalize out of distribution? What are their limits?
  • Multi-tasking. Which tasks can be used as auxiliary targets and why?

Data-driven models of human behavior encode many complexities of the real world, so I spend a significant amount of time thinking about missingness, irregular sampling, noise, multi-modality, and long tails. Beyond theory, I collaborate closely with world-class health experts to apply robust concepts from data science and accelerate scientific discovery.

During my studies, I was fortunate to work in diverse industries including multinational telcos (Telefonica Research), internet startups (Qustodio), retail tech (Ocado), and research labs. Further, our ongoing research in audio AI (covid-19-sounds.org) has drawn international attention (covered by BBC, The Guardian, Forbes, The Times, El País).

CV

📖 Publications


2021

COVID-19 Sounds: A Large-Scale Audio Dataset for Digital Respiratory Screening
Tong Xia*, Dimitris Spathis*, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Erika Bondareva, Ting Dang, Andres Floto, Pietro Cicuta, Cecilia Mascolo (*equal contribution)
Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, to appear

Self-supervised transfer learning of physiological representations from free-living wearable data
Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
ACM Conference on Health, Inference, and Learning (CHIL), Virtual Event, USA

Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data
Jing Han, Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Acoustics, Speech, & Signal Processing (ICASSP), Toronto, Canada

SelfHAR: Improving Human Activity Recognition through Self-training with Unlabeled Data
Chi Ian Tang, Ignacio Perez-Pozuelo*, Dimitris Spathis*, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Proc. on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT/Ubicomp), 5(1)

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates
Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp
Conference of the International Speech Communication Association (Interspeech), Brno, Czechia

Digital Phenotyping and Sensitive Health Data: Implications for Data Governance
Ignacio Perez-Pozuelo, Dimitris Spathis, Jordan Gifford-Moore, Jessica Morley, Josh Cowls
Journal of the American Medical Informatics Association, 28(9): 2002–2008

Anticipatory Detection of Compulsive Body-focused Repetitive Behaviors with Wearables
Benjamin Searle, Dimitris Spathis, Marios Constantinides, Daniele Quercia, Cecilia Mascolo
ACM International Conference on Mobile Human-Computer Interaction (MobileHCI), Toulouse, France

Wearables, smartphones and artificial intelligence for digital phenotyping and health
Ignacio Perez-Pozuelo, Dimitris Spathis, Emma Clifton, Cecilia Mascolo
Digital Health, Chapter 3

Universals and variations in musical preferences: A study of preferential reactions to Western music in 350,000 people across 53 countries
David M. Greenberg, Sebastian Wride, Daniel Snowden, Dimitris Spathis, Jeff Potter, Jason Rentfrow
Journal of Personality and Social Psychology, to appear

2020

Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data
Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Jing Han*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Knowledge Discovery and Data Mining (KDD), San Diego, USA

Learning Generalizable Physiological Representations from Large-scale Wearable Data
Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS), Vancouver, Canada

Exploring Contrastive Learning in Human Activity Recognition for Healthcare
Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS), Vancouver, Canada

2019

Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data
Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Knowledge Discovery and Data Mining (KDD), Anchorage, USA

Passive mobile sensing and psychological traits for large scale mood prediction
Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth), Trento, Italy

Interactive dimensionality reduction using similarity projections
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
Knowledge-Based Systems, 165: 77-91

2018

Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
ECCV Efficient Feature Representation Learning workshop (CEFRL @ ECCV 2018), Munich, Germany

2017

Diagnosing Asthma and Chronic Obstructive Pulmonary Disease with Machine Learning
Dimitris Spathis, Panayiotis Vlamos
Health Informatics Journal, 25(3): 811–827 (issue published in 2019)

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Joan Serra, Ilias Leontiadis, Dimitris Spathis, Gianluca Stringhini, Jeremy Blackburn, Athena Vakali
ACL Abusive Language Online workshop (ALW @ ACL 2017), Vancouver, Canada

2016

A comparison between semi-supervised and supervised text mining techniques on detecting irony in Greek political tweets
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
Engineering Applications of Artificial Intelligence, 51: 50-57

2015

Detecting Irony on Greek Political Tweets: A Text Mining Approach
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
International Conference on Engineering Applications of Neural Networks, Rhodes, Greece

2014

Glocal News: An Attempt to Visualize the Discovery of Localized Top Local News, Globally
Dimitris Spathis, Theofilos Mouratidis, Spyros Sioutas, Athanasios Tsakalidis
International Conference on Conceptual Modeling, Hong Kong, China

Preprints

Sounds of COVID-19: exploring realistic performance of audio-based digital testing
Jing Han*, Tong Xia*, Dimitris Spathis, Erika Bondareva, Chloë Brown, Jagmohan Chauhan, Ting Dang, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo
arXiv preprint, 2106.15523, 2021

Detecting sleep in free-living conditions without sleep-diaries: a device-agnostic, wearable heart rate sensing approach
Ignacio Perez-Pozuelo, Marius Posa, Dimitris Spathis, Kate Westgate, Nicholas Wareham, Cecilia Mascolo, Soren Brage, Joao Palotti
medRxiv preprint, 2021

Photo-Quality Evaluation based on Computational Aesthetics: Review of Feature Extraction Techniques
Dimitris Spathis
arXiv preprint, 1612.06259, 2016

🧐 Academic service


I take great joy in participating in the academic community, contributing to and learning from peer review. I have served the following international conferences and journals:

Program Committee (PC): AAAI 2021 & 2022, IJCAI 2020, KDD 2020 & 2021 (PC & Session Chair), SDM 2022, Sensiblend @ Ubicomp 2021

Organizer: Tutorial on federated learning for mobile sensing at MobiCom 2021 (co-organized with Stefanos Laskaridis & Mario Almeida at Samsung AI)

Reviewer: Nature Scientific Reports, Nature Digital Medicine, NeurIPS, ICLR, ICML, AAAI, IJCAI, KDD, CHI, Ubicomp/IMWUT, ICASSP, Expert Systems with Applications, Neurocomputing, WWW/The Web Conference, Engineering Applications of Artificial Intelligence, Sensors, ICWSM, ICPR, and more.

📢 Talks


Towards unsupervised wearable representations for longitudinal cardio-fitness prediction
  • 📍 MobiUK'21, virtual event, UK — July 6, 2021
Learning Generalizable Physiological Representations from Large-scale Wearable Data
AI to model Human Behaviour and Health
Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data
  • 📍 KDD'19, Anchorage, USA — August 4, 2019
Passive mobile sensing and psychological traits for large scale mood prediction
Deep sequence learning for large-scale inference of human behaviour from mobile sensor data
  • 📍 MRC Epidemiology Unit, University of Cambridge, UK — March 5, 2019
  • 📍 Ocado, Barcelona, Spain & Hatfield, UK (remote) — July 10, 2019
Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
  • 📍 Facebook PhD Open House (poster), London, UK — October 25, 2018
Deep Sequence Learning on Mobile Data for Mood Prediction
  • 📍 MobiUK'18, Cambridge, UK — September 13, 2018
Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Detecting Irony on Greek Political Tweets: A Text Mining Approach
  • 📍 EANN'15, Rhodes, Greece — September 25, 2015

🎒 Mentoring


I find it incredibly stimulating to work with driven students on research projects. Some recent graduate and bachelor's theses:

  • Chi Ian Tang — Self-supervised learning in timeseries (papers at IMWUT & NeurIPS-W)
  • Benjamin Searle — Predicting compulsive behaviours with watches (paper at MobileHCI)
  • Kevalee Shah — Contrastive learning in activity/clinical timeseries (ongoing)
  • Chuen Low — Pure attention models for medical timeseries (ongoing)

I have also been a teaching assistant for the following undergraduate courses:

🎠 Playground


“The next big thing in technology often starts off looking like a toy”

Quantifying name-dropping

Communitypoprefs.com is a data visualization website, where we present every pop-culture reference over the course of 5 seasons of the TV series Community.


Map out your music taste on Spotify

Visualizing my favourite songs on Spotify with dimensionality reduction and anomaly detection. Data essay published in Cuepoint Magazine, Medium's premier music publication.


Children's books and childish language?

Text mining Game of Thrones, Harry Potter, Hunger Games and Lord of the Rings books. Data essay featured in Medium's Editor Picks.


Anonymize kids' faces before posting online

Mobile app with face recognition, age estimation, & emotion recognition to blur kids' faces or replace them with emotion-based emoji. Developed during HackZurich 2018.


Discover top local news globally

Glocalne.ws was a mashup of Google News and Google Maps. Unfortunately it is now defunct due to API discontinuance.


Composing music and text with Recurrent Neural Networks

Training neural networks on massive amounts of musical notation and literature and letting them create their own art. Essay in Greek but you can still see/listen to the results.

🕳️🐇 Personal


Non-academic things about me: I love music, both playing and listening. I am mostly into art rock and indie folk, with the occasional exception of some well-crafted pop. Although I am an accordionist by training, over the last few years I've been playing mostly piano and ukulele. In a previous life, I performed in the critically acclaimed band The Children of the Oldness (aka Kore Ydro). You can listen to the album "Consortium in Amato" here.

I also enjoy street photography and in particular playing with light: photography comes from the Greek φως (light) and γραφή (writing), or drawing with light. A subset of my pictures is on Flickr, and I had my 15" of fame when one of my landscapes was featured in the Huffington Post.

Lastly, and perhaps most importantly, I'm always on the lookout for ways to move items from the "non-academic list" to the "academic list"—let me know if you'd like to help!