Computer Laboratory

Visual Bilingual Lexicon Induction with Transferred ConvNet Features

This paper is concerned with the task of bilingual lexicon induction using image-based features. By applying features from a convolutional neural network (CNN), we achieve state-of-the-art performance on a standard dataset, with a 79% relative improvement over previous work that uses bags of visual words based on SIFT features. The CNN image-based approach is also compared with state-of-the-art linguistic approaches to bilingual lexicon induction, even outperforming these for one of three language pairs on another standard dataset. Furthermore, we shed new light on the type of visual similarity metric to use for genuine similarity versus relatedness tasks, and experiment with using multiple layers from the same network in an attempt to improve performance.
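The core idea described above can be sketched as follows: represent each word by aggregating CNN features over its associated images, then induce a translation by nearest-neighbour search under a visual similarity metric. This is a minimal illustrative sketch, not the paper's implementation; the function names, the mean-pooling aggregation, the cosine metric, and the toy 3-d vectors (real CNN features would be high-dimensional, e.g. fc7 activations) are all assumptions for illustration.

```python
import numpy as np

def cosine(u, v):
    # cosine similarity between two feature vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def embed(word_images):
    # aggregate per-word image features by mean pooling (one common choice)
    return np.mean(word_images, axis=0)

def induce(src_vec, tgt_embeddings):
    # translation candidate = nearest target word under cosine similarity
    return max(tgt_embeddings, key=lambda w: cosine(src_vec, tgt_embeddings[w]))

# toy 3-d "CNN" features; hypothetical data for illustration only
src = embed(np.array([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]))  # images for "cat"
tgt = {
    "chat":  embed(np.array([[1.0, 0.0, 0.1]])),
    "chien": embed(np.array([[0.0, 1.0, 0.0]])),
}
print(induce(src, tgt))  # → chat
```

In practice the choice of similarity metric and of which network layer to draw features from both matter, which is what the experiments in the paper investigate.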

  • [pdf]
  • [bib]
  • [data & code] (593M) - image embedding Python pickle files for all languages, evaluation code and evaluation data.

Datasets: