Yiannos A. Stathopoulos
I am a final-year PhD student working on Mathematical Information Retrieval (MIR) under the supervision of Dr. Simone Teufel.
Publications
2018
 Variable Typing: Assigning Meaning to Variables in Mathematical Text
Yiannos A. Stathopoulos, Simon Baker, Marek Rei and Simone Teufel
In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018). New Orleans, United States, 2018.
2016
 Mathematical Information Retrieval Based on Type Embeddings and Query Expansion
Yiannos A. Stathopoulos and Simone Teufel
In Proceedings of the 26th International Conference on Computational Linguistics (Coling 2016). Osaka, Japan, 2016.
2015
 Retrieval of research-level mathematical information needs: A Test Collection and Technical Terminology Experiment
Yiannos A. Stathopoulos and Simone Teufel
In Proceedings of the Short Papers of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015). Beijing, China, 2015.
2011
 OMEX: Software for Mining Mathematical Expression Semantics from Scientific Documents.
Yiannos A. Stathopoulos and Brian Harrington
In Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing (ICSC '11). IEEE Computer Society, Washington, DC, USA, 209-210. DOI=10.1109/ICSC.2011.65 http://dx.doi.org/10.1109/ICSC.2011.65
Code and Data Downloads
 Download the Cambridge University MathIR Test Collection (for retrieval of research-level mathematics) described in "Retrieval of research-level mathematical information needs: A Test Collection and Technical Terminology Experiment"
 Download the Cambridge Dictionary of Mathematical Types (CDMT): the seed type dictionary (10601 phrases) and gold-standard data set for type detection from "Mathematical Information Retrieval Based on Type Embeddings and Query Expansion", and the extended type dictionary (1.23m phrases) from "Variable Typing: Assigning Meaning to Variables in Mathematical Text"
 Download the Variable Typing Data Set for assigning meaning to mathematical variables using Machine Learning
Cool things I've built
This is a partial list of cool stuff I've built.

 Mathalyzer - An interactive tool for analysing mathematical formulae in PDF documents. Written in C++ and GTK+, this tool employs the Presentation-Abstraction-Control (PAC) pattern to synchronise multiple data elements in a unified presentation. The idea behind Mathalyzer is to produce a tool that combines elements of Acrobat, Photoshop and SPSS.
 Spine - A small C++ library, forked from the subsystems of Mathalyzer, that implements Presentation-Abstraction-Control (PAC) message passing with GTK+ controls. The library is used to keep the data model of a GUI app synchronised with its various independent GUI elements implemented in GTK+.
 Interval and range trees - A small C++ library of interval and range trees for optimising the Mathalyzer canvas. My implementation of interval and range trees is built on top of red-black trees. Upon rotation, the red-black tree implementation raises a rotation event. Event handlers at higher levels are responsible for applying transformations that re-establish the invariants of the interval and range trees.
 OMEX - Software that detects and extracts mathematical expressions from PDF documents. The pipeline is the subject of my paper with Dr. Brian Harrington. Mathalyzer was built to extend aspects of this pipeline with machine learning.
 MapReduce in C++ - A small C++ implementation of Google's MapReduce. The implementation is designed to abstract the parallelisation of tasks using Mappers, Groupers and Reducers on multicore systems.
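
To give a flavour of the Mapper/Grouper/Reducer idea in the last item, here is a minimal sketch. It is not the code of the library described above: the names `map_reduce` and `word_count` are hypothetical, the map phase is parallelised with `std::async` as one possible choice, and the group and reduce phases are kept sequential for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <map>
#include <numeric>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, not the original library's API.
// mapper:  one input -> list of (key, value) pairs
// reducer: (key, all values grouped under that key) -> result
template <typename In, typename K, typename V, typename R>
std::map<K, R> map_reduce(
    const std::vector<In>& inputs,
    std::function<std::vector<std::pair<K, V>>(const In&)> mapper,
    std::function<R(const K&, const std::vector<V>&)> reducer,
    std::size_t workers) {
    using Pairs = std::vector<std::pair<K, V>>;
    assert(workers > 0);

    // Map phase: split the inputs into contiguous chunks, one task per chunk.
    const std::size_t chunk = (inputs.size() + workers - 1) / workers;
    std::vector<std::future<Pairs>> tasks;
    for (std::size_t begin = 0; begin < inputs.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, inputs.size());
        tasks.push_back(std::async(std::launch::async, [&inputs, &mapper, begin, end] {
            Pairs out;
            for (std::size_t i = begin; i < end; ++i) {
                Pairs emitted = mapper(inputs[i]);
                out.insert(out.end(), emitted.begin(), emitted.end());
            }
            return out;
        }));
    }

    // Group phase: wait for each task and collect its values by key.
    std::map<K, std::vector<V>> groups;
    for (auto& task : tasks) {
        Pairs emitted = task.get();
        for (auto& kv : emitted) {
            groups[kv.first].push_back(kv.second);
        }
    }

    // Reduce phase: one reducer call per key (sequential in this sketch).
    std::map<K, R> results;
    for (const auto& group : groups) {
        results.emplace(group.first, reducer(group.first, group.second));
    }
    return results;
}

// Example job: count how often each whitespace-separated word occurs.
std::map<std::string, int> word_count(const std::vector<std::string>& lines,
                                      std::size_t workers) {
    return map_reduce<std::string, std::string, int, int>(
        lines,
        [](const std::string& line) {
            std::vector<std::pair<std::string, int>> out;
            std::istringstream words(line);
            std::string word;
            while (words >> word) out.emplace_back(word, 1);
            return out;
        },
        [](const std::string&, const std::vector<int>& ones) {
            return std::accumulate(ones.begin(), ones.end(), 0);
        },
        workers);
}
```

Calling `word_count` with a vector of text lines and a worker count returns a `std::map` from each word to its number of occurrences. Grouping through an ordered map keeps the sketch short; a real implementation would probably also parallelise the reduce phase.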