SEQUENCE METRIC LEARNING AS SYNCHRONIZATION OF RECURRENT NEURAL NETWORKS

Abstract

Sequence metric learning is becoming a widely adopted approach for applications dealing with sequential multi-variate data, such as activity recognition or natural language processing, and is typically tackled with sequence alignment or representation learning approaches. In this paper, we study this subject from the point of view of dynamical systems theory, drawing an analogy between the synchronized trajectories produced by coupled dynamical systems and the distance between similar sequences processed by a siamese recurrent neural network. Indeed, a siamese recurrent network comprises two identical sub-networks, i.e., two identical dynamical systems, which can theoretically achieve complete synchronization if a coupling is introduced between them. We therefore propose a new neural network model that implements this coupling through a new gate integrated into the classical Gated Recurrent Unit architecture. This model can thus simultaneously learn a similarity metric and the synchronization of unaligned multi-variate sequences in a weakly supervised way. Our experiments show that introducing such a coupling improves the performance of the siamese Gated Recurrent Unit architecture on an activity recognition dataset.

1. INTRODUCTION

Metric learning aims at learning a component essential to numerous machine learning algorithms used for classification or clustering: a similarity measure. It has the benefit of being usable in weakly supervised settings where only equivalence constraints between samples are known (Xing et al., 2003), which allows for a large number of applications on various data types: from person re-identification (Yang et al., 2018), object tracking (Bertinetto et al., 2016) and gesture recognition (Berlemont et al., 2018) to sentence similarity computation (Mueller & Thyagarajan, 2016). Among these applications, less attention has been given to designing metric learning algorithms specific to sequences, particularly with neural networks, despite the simplicity of the siamese architecture (Bromley et al., 1994). One straightforward way to adapt existing approaches to sequential data is to learn representations through sequence-to-sequence models (Sutskever et al., 2014) or Transformers (Vaswani et al., 2017). However, these models are difficult to train in a weakly supervised way to provide a similarity metric, and they lose temporal dependency information within each sequence as well as alignment information between sequences. In contrast, Dynamic Time Warping (DTW) (Sakoe & Chiba, 1978) is a classical approach to measuring the distance between sequences that relies on aligning them. Its integration into learning algorithms has been hindered by its non-differentiability and its quadratic time complexity, which fits poorly with the equivalence-constraint framework and some of the more complex associated losses (Oh Song et al., 2016; Sohn, 2016; Yang et al., 2018). Recent works mitigate these drawbacks, notably with virtual metric learning (Perrot & Habrard, 2015; Su & Wu, 2019) and soft versions of DTW (Cai et al., 2019; Abid & Zou, 2018). We therefore aim at designing a neural network architecture specifically adapted to sequence metric learning.
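For reference, the classical DTW recurrence discussed above can be sketched as follows. This is a minimal NumPy illustration of the quadratic-time alignment, not the formulation of any of the cited works; the optional `gamma` smoothing replaces the hard minimum with a log-sum-exp in the spirit of the soft-DTW relaxations mentioned in the text, which makes the quantity differentiable.

```python
import numpy as np

def dtw_distance(x, y, gamma=None):
    """DTW between multi-variate sequences x (n, d) and y (m, d).

    gamma=None: classical recurrence with a hard minimum.
    gamma > 0: smooth (log-sum-exp) minimum, a differentiable
    relaxation in the spirit of soft-DTW.
    Quadratic time and memory, as noted in the text.
    """
    def smooth_min(vals):
        if gamma is None:
            return min(vals)
        z = -np.asarray(vals) / gamma
        zmax = z.max()  # shift for numerical stability
        return -gamma * (zmax + np.log(np.exp(z - zmax).sum()))

    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)  # accumulated-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])  # local distance
            # extend the best of the three admissible alignment moves
            D[i, j] = cost + smooth_min((D[i - 1, j],
                                         D[i, j - 1],
                                         D[i - 1, j - 1]))
    return D[n, m]
```

Note that the smooth minimum lower-bounds the hard one, so the relaxed distance is always at most the classical DTW distance; it recovers the hard recurrence as `gamma` tends to zero.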
Recurrent neural networks (RNNs) have a temporal dynamic behavior which allows them to be studied as dynamical systems. We propose in this paper a new framework for sequence metric

