If you are interested in one of these projects, please email me before filling out the form!

Modelling polysemy in compositional distributional semantic models

Proposer: Ekaterina Shutova and Jean Maillard

Supervisor: Ekaterina Shutova and Jean Maillard 

Special Resources: WordNet, Wikipedia corpus 

Requirements: NLP, Machine Learning


The vast majority of compositional distributional models build a single representation for all senses of a word, collapsing distinct senses together. Several researchers argue that such models can handle ambiguous terms without any additional disambiguation step, as long as contextual information is available. For instance, Baroni et al. (2014) suggest that their models largely avoid problems with polysemous adjectives because the adjective matrices implicitly incorporate contextual information. However, they do draw a distinction between two ways in which the meaning of a term can vary. Continuous polysemy, i.e. the subtle and continuous variation in meaning across the different contexts in which a word appears (e.g. run to the store vs. run a marathon), is relatively tractable in their opinion. This contrasts with discrete homonymy, i.e. the association of a single term with completely independent meanings (e.g. river bank vs. investment bank), and with regular polysemy, i.e. systematic shifts in meaning due to metaphorical or metonymic use (e.g. bright light vs. bright student). Homonymy and regular polysemy are more challenging to handle in compositional distributional models. The recent approaches of Kartsaklis and Sadrzadeh (2013) and Gutierrez et al. (2016) have shown that training sense-disambiguated functions for homonymy and regular polysemy improves performance over single-sense models. However, neither approach studied polysemy systematically and at a large scale, nor contrasted discrete and continuous polysemy.

This project will test to what extent the widely used single-sense compositional distributional models can handle a variety of senses, using WordNet as a gold standard for sense distinctions. WordNet is the largest collection of word senses available to date, and it captures both the fine-grained distinctions of continuous polysemy and the discrete distinctions of homonymy and regular polysemy. After testing the single-sense models, the project will move on to building a novel compositional distributional model that incorporates sense distinctions learned from WordNet and corpus data.
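
To make the idea of adjective matrices concrete, the following is a minimal sketch of adjective-noun composition in the style of Baroni and Zamparelli (2010), where a noun is a vector and an adjective is a matrix applied to it. The three-dimensional space, the noun vectors and the matrix for "bright" are hypothetical toy values chosen for illustration only; a real model would learn them from corpus data. The sketch shows how one single-sense matrix can still produce different phrase vectors depending on the noun it modifies.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical noun vectors (toy values, not learned from any corpus).
NOUNS = {
    "light": [0.9, 0.1, 0.0],
    "student": [0.1, 0.6, 0.9],
}

# Hypothetical single-sense matrix for the adjective "bright":
# one matrix is shared by the literal and the metaphorical use.
BRIGHT = [
    [1.5, 0.0, 0.0],
    [0.0, 1.5, 0.2],
    [0.0, 0.0, 1.0],
]


def compose(adjective_matrix, noun_vector):
    """Adjective-noun composition as matrix-vector multiplication."""
    return matvec(adjective_matrix, noun_vector)


bright_light = compose(BRIGHT, NOUNS["light"])
bright_student = compose(BRIGHT, NOUNS["student"])

# How similar are the two phrase vectors produced by the same matrix?
similarity = cosine(bright_light, bright_student)
print(bright_light, bright_student, similarity)
```

A sense-disambiguated variant would replace the single BRIGHT matrix with one matrix per sense, for example one per WordNet sense, and select among them before composing.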

References and further reading:

Marco Baroni, Raffaella Bernardi and Roberto Zamparelli. 2014. Frege in space: A program for compositional distributional semantics. In Linguistic Issues in Language Technology, special issue on Perspectives on Semantic Representations for Textual Inference. Volume 9, pages 241–346.

Marco Baroni and Roberto Zamparelli. 2010. Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1183–1193. Association for Computational Linguistics. 

Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, and Stephen Pulman. 2013b. Separating disambiguation from composition in distributional semantics. In Proceedings of the 2013 Conference on Computational Natural Language Learning, pages 114–123.

Stephen Clark, Laura Rimell, Tamara Polajnar and Jean Maillard. The Categorial Framework for Compositional Distributional Semantics. Technical Report, University of Cambridge Computer Laboratory.

Dario Gutierrez, Ekaterina Shutova, Tyler Marghetis and Benjamin Bergen. 2016. Literal and Metaphorical Senses in Compositional Distributional Semantic Models. In Proceedings of ACL 2016, Berlin, Germany.

Modelling visual context in multimodal semantics

Proposer: Ekaterina Shutova and Anita Verő

Supervisor: Ekaterina Shutova and Anita Verő 

Special Resources: Visual Genome and MS COCO datasets; MMfeat toolkit 

Requirements: NLP, Machine Learning, interest in Computer Vision


Much research in cognitive science and neuroscience suggests that human meaning representations are not merely a product of our linguistic exposure, but are also grounded in our perceptual system and sensori-motor experience (see e.g. Barsalou (2008)). This suggests that we acquire word meanings and relational knowledge not merely from linguistic input, such as text or speech, but also from other modalities, such as vision, taste, smell, touch and motor activity. Multimodal models of word meaning have thus attracted growing interest in semantics, so far mostly focusing on combining linguistic and visual information. Such models outperform purely text-based models in a variety of tasks, including semantic similarity estimation, predicting compositionality, bilingual lexicon induction and figurative language processing. However, the multimodal models used to date extract visual features from complete images, and none of the approaches has investigated the role of different kinds of visual information in natural language semantics. For instance, some concepts may be better characterised by the internal visual features of objects in the image, and others by the wider visual context and scene information.

This project will investigate whether explicitly differentiating between internal visual properties and visual context improves the performance of multimodal semantic models. The student will first experiment with learning a multimodal semantic model from image regions manually annotated for specific objects, actions or scenes, using Convolutional Neural Networks (CNNs) and the MMfeat toolkit for multimodal representation learning (Kiela, 2016), and then integrate the multimodal model with automatic object recognition techniques. The project will evaluate the new models on semantic similarity and metaphor identification tasks.
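
As an illustration of what separating internal visual properties from visual context could look like, the sketch below fuses a linguistic vector with two visual vectors, one for the object region and one for the surrounding scene, by concatenating weighted L2-normalised vectors. Concatenation is only one possible fusion strategy and is assumed here for simplicity; the function names, the weights alpha and beta, and the toy vectors are all hypothetical, and the code does not use or reproduce the MMfeat API.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]


def fuse(linguistic, visual_object, visual_context, alpha=0.5, beta=0.5):
    """Concatenate weighted, normalised vectors from three sources.

    alpha (hypothetical) weights language against vision;
    beta (hypothetical) weights object-internal features against context.
    """
    ling = [alpha * x for x in l2_normalize(linguistic)]
    obj = [(1 - alpha) * beta * x for x in l2_normalize(visual_object)]
    ctx = [(1 - alpha) * (1 - beta) * x for x in l2_normalize(visual_context)]
    return ling + obj + ctx


# Toy vectors standing in for a text embedding, CNN features of an
# annotated object region, and CNN features of the rest of the image.
text_vec = [3.0, 4.0]
object_vec = [1.0, 0.0, 0.0]
context_vec = [0.0, 2.0]

fused = fuse(text_vec, object_vec, context_vec)

# Setting beta=1.0 ignores visual context entirely, which gives a simple
# way to compare object-only and object-plus-context representations.
object_only = fuse(text_vec, object_vec, context_vec, beta=1.0)
print(fused)
print(object_only)
```

Varying beta per concept, or learning it, would be one way to test whether some concepts are better characterised by object-internal features and others by scene context.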

References and further reading:

Lawrence W. Barsalou. 2008. Grounded cognition. Annual Review of Psychology, 59(1):617–645. 

Elia Bruni, Nam Khanh Tran, and Marco Baroni. 2014. Multimodal distributional semantics. Journal of Artificial Intelligence Research, 49:1–47. 

Douwe Kiela and Leon Bottou. 2014. Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics. In Proceedings of EMNLP 2014, Doha, Qatar.

Douwe Kiela. 2016. MMFEAT: A Toolkit for Extracting Multi-Modal Features. In Proceedings of ACL 2016: System Demonstrations, Berlin, Germany.

Douwe Kiela, Anita Verő and Stephen Clark. 2016. Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics. In Proceedings of EMNLP 2016, Austin, TX.

Ekaterina Shutova, Douwe Kiela and Jean Maillard. 2016. Black Holes and White Rabbits: Metaphor Identification with Visual Features. In Proceedings of NAACL-HLT 2016, San Diego, CA.