Department of Computer Science and Technology

Technical reports

Gaussian Pixie Autoencoder: Introducing Functional Distributional Semantics to continuous latent spaces

Primož Fabiani

January 2022, 50 pages

This technical report is based on a dissertation submitted June 2021 by the author for the degree of Master of Philosophy (Advanced Computer Science) to the University of Cambridge, Hughes Hall.

DOI: 10.48456/tr-967

Abstract

Functional Distributional Semantics (FDS) is a recent lexical semantics framework that represents the meaning of each word as a function from the latent space of entities to a probability. This thesis examines previous FDS models, highlighting their advantages and drawbacks. A new Gaussian Pixie Autoencoder model is proposed to introduce FDS to continuous latent modelling. The proposed model improves on its predecessors in simplicity and efficiency, setting a new baseline for continuous FDS models. The thesis shows that dropout is necessary for context learning with this model type. Evaluated on contextual similarity, the proposed model outperforms the discrete autoencoder and BERT baseline on one task and achieves satisfactory performance on the other.

Full text

PDF (0.7 MB)

BibTeX record

@TechReport{UCAM-CL-TR-967,
  author =      {Fabiani, Primo{\v z}},
  title =       {{Gaussian Pixie Autoencoder: Introducing Functional
                  Distributional Semantics to continuous latent spaces}},
  year =        2022,
  month =       jan,
  url =         {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-967.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-967},
  number =      {UCAM-CL-TR-967}
}