Yufan Guo


I am a Research Associate in the Natural Language and Information Processing Group at the Computer Laboratory at the University of Cambridge, working with Anna Korhonen on the CRAB project. My main research interests are statistical NLP and its application to real-world tasks such as scientific text processing and text mining.

I hold a PhD in Computation, Cognition and Language and an MPhil in Computer Speech, Text and Internet Technology from the University of Cambridge, and a bachelor's degree in Computer Science from Peking University.



  • Yufan Guo, Ilona Silins, Ulla Stenius and Anna Korhonen. 2013. Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review. In Bioinformatics 2013, doi: 10.1093/bioinformatics/btt163. Link
  • Yufan Guo, Roi Reichart and Anna Korhonen. 2013. Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints. In Proceedings of NAACL 2013. Atlanta, US. Link


  • Yufan Guo, Ilona Silins, Roi Reichart and Anna Korhonen. 2012. CRAB Reader: A Tool for Analysis and Visualization of Argumentative Zones in Scientific Literature. In Proceedings of COLING 2012. Mumbai, India. Link
  • Danish Contractor, Yufan Guo and Anna Korhonen. 2012. Using Argumentative Zones for Extractive Summarization of Scientific Articles. In Proceedings of COLING 2012. Mumbai, India. Link
  • Yufan Guo. 2012. E-mail Spam Filtering and Natural Language Processing. In Hakin9 Exploiting Software, 2:5. Link


  • Yufan Guo, Anna Korhonen, Ilona Silins and Ulla Stenius. 2011. Weakly-supervised learning of information structure of scientific abstracts - is it accurate enough to benefit real-world tasks in biomedicine? In Bioinformatics 2011, doi: 10.1093/bioinformatics/btr536. Link
  • Yufan Guo, Anna Korhonen and Thierry Poibeau. 2011. A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents. In Proceedings of EMNLP 2011. Edinburgh, UK. Link
  • Yufan Guo, Anna Korhonen, Maria Liakata, Ilona Silins, Johan Hogberg and Ulla Stenius. 2011. A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment. In BMC Bioinformatics 2011, 12:69. Link


  • Yufan Guo, Anna Korhonen, Maria Liakata, Ilona Silins, Lin Sun and Ulla Stenius. 2010. Identifying the Information Structure of Scientific Abstracts: An Investigation of Three Different Schemes. In Proceedings of BioNLP 2010. Uppsala, Sweden. Link


  • Using Models of Textual Information Structure to Aid the Review of Biomedical Abstracts in Cancer Risk Assessment. Invited talk at Laboratoire d'Informatique de Paris-Nord, France. 2010. Slides

© Yufan Guo. Last updated: Oct. 2013. Email: yg244 AT cam.ac.uk