CONTRASIM: A SIMILARITY MEASURE BASED ON CONTRASTIVE LEARNING

Abstract

Recent work has compared neural network representations via similarity-based analyses, shedding light on how different aspects (architecture, training data, etc.) affect models' internal representations. The quality of a similarity measure is typically evaluated by its success in assigning a high score to representations that are expected to match. However, existing similarity measures achieve only mediocre results on standard benchmarks. In this work, we develop a new similarity measure, dubbed ContraSim, based on contrastive learning. In contrast to common closed-form similarity measures, ContraSim learns a parameterized measure by using both similar and dissimilar examples. We perform an extensive experimental evaluation of our method, with both language and vision models, on the standard layer prediction benchmark and two new benchmarks that we develop: the multilingual benchmark and the image-caption benchmark. In all cases, ContraSim achieves much higher accuracy than previous similarity measures, even when presented with challenging examples.

1. INTRODUCTION

Representation learning is a key property of deep neural networks. But how can we assess the similarity of representations learned by two models? A recent line of work is concerned with developing similarity measures and using them to analyze models' internal representations. Similarity-based analyses may shed light on how different datasets, architectures, etc., change a model's learned representations. For example, a similarity analysis showed that lower layers in different models are more similar to each other, while fine-tuning affects mostly the top layers (Wu et al., 2020). Various similarity measures have been proposed for comparing representations; the most popular are based on centered kernel alignment (CKA) (Kornblith et al., 2019) and canonical correlation analysis (CCA) (Hotelling, 1936; Morcos et al., 2018). They all share a similar methodology: given a pair of feature representations of the same input, they estimate the similarity between them, without considering other examples. However, they all perform mediocrely on standard benchmarks. Motivated by this, we propose a new learnable similarity measure.

In this paper, we introduce ContraSim, a new similarity measure based on contrastive learning (CL) (Chen et al., 2020; He et al., 2020). In contrast to prior work, which defines closed-form, general-purpose similarity measures, ContraSim is a task-specific learnable similarity measure that uses examples with high similarity (the positive set) and examples with low similarity (the negative set) to train an encoder that maps representations to the space where similarity is measured. In the projected space, we maximize the representation similarity with examples from the positive set, and minimize it with examples from the negative set.
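This excerpt does not spell out the exact training objective, but the description (maximize similarity with a positive set, minimize it with a negative set in a projected space) matches the standard InfoNCE-style contrastive loss used by Chen et al. (2020) and He et al. (2020). The following is an illustrative sketch of that loss on already-projected representations; the function name `info_nce`, the temperature `tau`, and the use of cosine similarity are our assumptions, not details taken from the paper.

```python
import numpy as np

def info_nce(anchor, positives, negatives, tau=0.1):
    """Illustrative InfoNCE-style loss for one anchor representation.

    The loss is low when the anchor is close (in cosine similarity)
    to the positive examples and far from the negative examples.
    All inputs are assumed to already live in the projected space.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.array([cos(anchor, p) for p in positives]) / tau
    neg = np.array([cos(anchor, n) for n in negatives]) / tau
    logits = np.concatenate([pos, neg])
    # Negative log-softmax of each positive against all candidates,
    # averaged over the positive set.
    log_denom = np.log(np.exp(logits).sum())
    return float(np.mean(log_denom - pos))
```

Training the encoder then amounts to minimizing this loss over batches, which pulls positive pairs together and pushes negative pairs apart in the learned space.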
We experimentally evaluate ContraSim on one standard similarity benchmark and two new benchmarks we introduce in this paper, and demonstrate its superiority over common similarity measures. First, we use the established layer prediction benchmark (Kornblith et al., 2019), which assesses whether high similarity is assigned to two architecturally corresponding layers in two models differing only in their weight initialization. Second, in our proposed multilingual benchmark, we assume a multilingual model and a parallel dataset of translations in two languages. A good similarity measure should assign a higher similarity to the (multilingual) representations of a sentence in language A and its translation in language B than to those of the same sentence in language A and a random sentence in language B. Third, we design the image-caption benchmark, based on a similar idea. Given an image and its text caption, and correspondingly a vision model
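The layer prediction benchmark above can be sketched concretely: score every layer of model B against each layer of model A under some similarity measure, and count how often the best-scoring layer is the architecturally corresponding one. The sketch below uses linear CKA (Kornblith et al., 2019) as the measure; the helper names and the toy data shapes are our own illustrative choices.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representations X, Y of shape (examples, features)."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def layer_prediction_accuracy(layers_a, layers_b, sim=linear_cka):
    """Fraction of layers in model A whose most similar layer in model B
    (under `sim`) is the architecturally corresponding one."""
    hits = 0
    for i, xa in enumerate(layers_a):
        scores = [sim(xa, yb) for yb in layers_b]
        hits += int(np.argmax(scores) == i)
    return hits / len(layers_a)
```

On this benchmark, a measure succeeds if corresponding layers of two runs that differ only in initialization score higher with each other than with any other layer.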

