I am a PhD student in the NLIP group. My supervisor is Dr. Anna Korhonen.

I come from Qingdao, People's Republic of China. Here is my short CV.

Research Interests

Verb classification, text mining, unsupervised learning


I studied for an MPhil in Computer Speech, Text and Internet Technology at the Computer Laboratory, University of Cambridge, in 2006-07. My MPhil project, on automatic lexical classification, was supervised by Dr. Anna Korhonen.

In 2006, I obtained a BSc in Computer Science from the School of Informatics (now the School of Computer Science), University of Manchester. I proposed my own final-year project, which was supervised by Dr. Xiaojun Zeng. The project built one of the world's first web-based document management and processing systems, similar to Google Docs and Microsoft Office Live Workspace. Some screenshots are available in the thesis.


Lin Sun, Diana McCarthy and Anna Korhonen. Diathesis alternation approximation for verb clustering. ACL 2013.

Ekaterina Shutova and Lin Sun. Unsupervised Metaphor Identification Using Hierarchical Graph Factorization Clustering. NAACL 2013.

Anna Korhonen, Diarmuid Ó Séaghdha, Ilona Silins, Lin Sun, Johan Högberg and Ulla Stenius. Text mining for literature review and knowledge discovery in cancer risk assessment and research. PLoS ONE 7(4):e33427.

Lin Sun and Anna Korhonen. Hierarchical Verb Clustering Using Graph Factorization. EMNLP 2011.

Lin Sun, Thierry Poibeau, Anna Korhonen and Cedric Messiant. Investigating the cross-linguistic potential of VerbNet-style classification. COLING 2010.

Ekaterina Shutova, Lin Sun and Anna Korhonen. Metaphor Identification Using Verb and Noun Clustering. COLING 2010.

Tom Lippincott, Diarmuid Ó Séaghdha, Lin Sun and Anna Korhonen. Exploring variation across biomedical subdomains. COLING 2010.

Yufan Guo, Anna Korhonen, Maria Liakata, Ilona Silins, Lin Sun and Ulla Stenius. Identifying the Information Structure of Scientific Abstracts: An Investigation of Three Different Schemes. BioNLP 2010.

Anna Korhonen, Ilona Silins, Lin Sun and Ulla Stenius. The first step in the development of text mining technology for cancer risk assessment: identifying and organizing scientific evidence in risk assessment literature. BMC Bioinformatics 2009, 10:303.

Lin Sun and Anna Korhonen. 2009. Improving Verb Clustering with Automatically Acquired Selectional Preferences. Proceedings of EMNLP 2009.

Ilona Silins, Anna Korhonen, Johan Högberg, Lin Sun and Ulla Stenius. 2009. Improved Cancer Risk Assessment Using Text Mining. Proceedings of the 100th Annual Meeting of the American Association for Cancer Research. Denver, Colorado.

Lin Sun, Anna Korhonen, Ilona Silins and Ulla Stenius. 2009. User-Driven Development of Text Mining Resources for Cancer Risk Assessment. Proceedings of the HLT-NAACL BioNLP Workshop 2009.

Lin Sun, Anna Korhonen and Yuval Krymolowski. 2008. Verb Class Discovery from Rich Syntactic Data. Ninth International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2008).

Lin Sun, Anna Korhonen and Yuval Krymolowski. 2008. Automatic Classification of English Verbs Using Rich Syntactic Features. Third International Joint Conference on Natural Language Processing (IJCNLP 2008).

Lin Sun. 2007. Automatic lexical classification. MPhil dissertation. Computer Laboratory, University of Cambridge.


I am supported by a BT-badged Dorothy Hodgkin Postgraduate Award. I would like to thank EPSRC and British Telecom.

Technical Interests

Programming languages: Java, Perl, Python; data warehousing; recommendation systems

Blog (most of the entries are written in Simplified Chinese)


Lexical Acquisition

Machine learning and Math


