SIGNATORY: DIFFERENTIABLE COMPUTATIONS OF THE SIGNATURE AND LOGSIGNATURE TRANSFORMS, ON BOTH CPU AND GPU

Abstract

Signatory is a library for calculating the signature and logsignature transforms and related functionality. The focus is on machine learning, and as such the library includes features such as CPU parallelism, GPU support, and backpropagation. To our knowledge it is the first GPU-capable library for these operations. Signatory implements new features not available in previous libraries, such as efficient precomputation strategies. Furthermore, several novel algorithmic improvements are introduced, producing substantial real-world speedups even on the CPU without parallelism. The library operates as a Python wrapper around C++, and is compatible with the PyTorch ecosystem. It may be installed directly via pip.

1. INTRODUCTION

The signature transform, sometimes referred to as the path signature or simply the signature, is a central object in rough path theory (Lyons, 1998; 2014). It is a transformation on differentiable paths[1], and may be thought of as loosely analogous to the Fourier transform. However, whilst the Fourier transform extracts information about frequency, treats each channel separately, and is linear, the signature transform extracts information about order and area, explicitly considers combinations of channels, and is in a precise sense 'universally nonlinear' (Bonnier et al., 2019, Proposition A.6). The logsignature transform (Liao et al., 2019) is a related transform that we will also consider. In both cases, by treating sequences of data as continuous paths, the (log)signature transform may be applied to problems with sequential structure, such as time series. Indeed there is a significant body of work using the (log)signature transform in machine learning, with examples ranging from handwriting identification to sepsis prediction; see the references discussed below.

We introduce Signatory, a CPU- and GPU-capable library for the signature and logsignature transforms and related functionality. To our knowledge it is the first GPU-capable library for these operations. The focus is on machine learning applications. Signatory is significantly faster than previous libraries (whether run on the CPU or the GPU), due to a combination of parallelism and novel algorithmic improvements. In particular, the latter include both uniform and asymptotic rate improvements over previous algorithms. Additionally, Signatory provides functionality not available in previous libraries, such as precomputation strategies for efficient querying of the (log)signature transform over arbitrary overlapping intervals. The library integrates with the open source PyTorch ecosystem and runs on Linux or Windows.
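To give a flavour of the precompute-then-query idea (this is a toy sketch of our own, not Signatory's implementation, and all function names below are ours): at depth 1 the signature of a path segment is simply its total increment, so queries over arbitrary intervals reduce to differences of precomputed prefix signatures, much like prefix sums. Signatory's actual precomputation generalises this to higher depths.

```python
# Toy sketch of precompute-then-query for depth-1 signatures.
# At depth 1, the signature over [i, j] is just path[j] - path[i],
# so prefix "signatures" answer any interval query in O(d) time.

def prefix_signatures(path):
    # Running depth-1 signature over [0, i]: the increment path[i] - path[0].
    base = path[0]
    return [[x - b for x, b in zip(p, base)] for p in path]

def query(prefixes, i, j):
    # Depth-1 signature over the interval [i, j]: difference of prefixes.
    return [a - b for a, b in zip(prefixes[j], prefixes[i])]

path = [(0.0, 0.0), (1.0, 2.0), (3.0, 5.0), (4.0, 4.0)]
pre = prefix_signatures(path)
print(query(pre, 1, 3))  # increment over [1, 3]: [3.0, 2.0]
```

At higher depths the same pattern works, but combining and "subtracting" prefix signatures requires the tensor algebra structure (Chen's identity and group inverses) rather than plain vector arithmetic.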
Documentation, examples, benchmarks and tests form a part of the project. Much of the core code is written in C++, and the CPU implementation utilises OpenMP. The backward operations are handwritten, for both speed and memory efficiency, and do not rely on the autodifferentiation provided by PyTorch. The source code is located at https://github.com/patrick-kidger/signatory, documentation and examples are available at https://signatory.readthedocs.io, and the project may be installed directly via pip. This paper is not a guide to using Signatory; for that we refer to the documentation. Rather, it is a technical exposition of the library's innovations.

2. BACKGROUND

We begin with some exposition on the theory of the signature and logsignature transforms, giving definitions first and offering intuition afterwards. See also Reizenstein & Graham (2018) for an introduction focusing on computational concerns, and Lyons et al. (2004) and Hodgkinson et al. (2020) for pedagogical introductions to the motivating theory of rough paths.

2.1. THE SIGNATURE TRANSFORM

Definition 1. Let $\mathbb{R}^{d_1} \otimes \mathbb{R}^{d_2} \otimes \cdots \otimes \mathbb{R}^{d_n}$ denote the space of all real tensors with shape $d_1 \times d_2 \times \cdots \times d_n$. There is a corresponding binary operation $\otimes$, called the tensor product, which maps a tensor of shape $(d_1, \ldots, d_n)$ and a tensor of shape $(e_1, \ldots, e_m)$ to a tensor of shape $(d_1, \ldots, d_n, e_1, \ldots, e_m)$ via $(A_{i_1, \ldots, i_n}, B_{j_1, \ldots, j_m}) \mapsto A_{i_1, \ldots, i_n} B_{j_1, \ldots, j_m}$. For example, when applied to two vectors it reduces to the outer product.

Definition 2. Let $N \in \mathbb{N}$. The signature transform to depth $N$ is defined as

$$\mathrm{Sig}^N \colon \left\{ f \in C([0, 1]; \mathbb{R}^d) \mid f \text{ differentiable} \right\} \to \prod_{k=1}^{N} (\mathbb{R}^d)^{\otimes k},$$

$$\mathrm{Sig}^N(f) = \left( \underset{0 < t_1 < \cdots < t_k < 1}{\int \cdots \int} \frac{\mathrm{d}f}{\mathrm{d}t}(t_1) \otimes \cdots \otimes \frac{\mathrm{d}f}{\mathrm{d}t}(t_k) \, \mathrm{d}t_1 \cdots \mathrm{d}t_k \right)_{1 \leq k \leq N}. \tag{1}$$



[1] The signature transform may also be extended to paths of bounded variation, or merely of finite p-variation (Lyons et al., 2004).
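To make Definition 2 concrete, the following minimal pure-Python sketch (our own code, not Signatory's) computes the depth-2 signature of a piecewise linear path. Each linear segment with increment $\Delta$ has signature $(\Delta, \Delta \otimes \Delta / 2)$, and signatures of concatenated pieces are combined with Chen's identity.

```python
# Depth-2 signature of a piecewise linear path, built segment by segment.
# A linear segment with increment d has signature (d, (d outer d) / 2);
# Chen's identity combines the signatures of concatenated path pieces.

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def segment_sig(p, q):
    # Depth-2 signature of the single linear segment from p to q.
    d = [qi - pi for qi, pi in zip(q, p)]
    half = [[x / 2.0 for x in row] for row in outer(d, d)]
    return d, half

def chen(sig_a, sig_b):
    # Chen's identity at depth 2: signature of the concatenated path.
    a1, a2 = sig_a
    b1, b2 = sig_b
    level1 = [x + y for x, y in zip(a1, b1)]
    cross = outer(a1, b1)
    level2 = [[a2[i][j] + b2[i][j] + cross[i][j]
               for j in range(len(b1))] for i in range(len(a1))]
    return level1, level2

def signature2(path):
    sig = segment_sig(path[0], path[1])
    for p, q in zip(path[1:], path[2:]):
        sig = chen(sig, segment_sig(p, q))
    return sig

# Collinear points: the result must match that of a single segment,
# since the signature is invariant to such reparametrisation.
lvl1, lvl2 = signature2([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
print(lvl1)  # [2.0, 2.0]
print(lvl2)  # [[2.0, 2.0], [2.0, 2.0]]
```

The first level is always the total increment of the path; the second level records (iterated) area-like information between pairs of channels.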



Examples include Morrill et al. (2019); Fermanian (2019); Király & Oberhauser (2019); Toth & Oberhauser (2020); Morrill et al. (2020b). Earlier work often used the signature and logsignature transforms as a fixed feature transformation; see Levin et al. (2013); Chevyrev & Kormilitzin (2016); Yang et al. (2016a;b); Kormilitzin et al. (2016); Li et al. (2017); Perez Arribas et al. (2018) for a range of examples. In this context, when training a model on top, it is sufficient to simply preprocess the entire dataset with the signature or logsignature transform and save the result. More recent work, however, has focused on embedding the signature and logsignature transforms within neural networks; see Bonnier et al. (2019); Liao et al. (2019); Moor et al. (2020); Morrill et al. (2020a); Kidger et al. (2020) among others. In this context the signature and logsignature transforms are evaluated many times throughout a training procedure, so efficient and differentiable implementations are crucial. Previous libraries (Lyons, 2017; Reizenstein & Graham, 2018) are CPU-only and single-threaded, and quickly become the major source of slowdown when training and evaluating these networks.
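As a toy illustration of signatures as a feature transformation (again our own sketch, not Signatory's implementation): the antisymmetric part of the second signature level, the Lévy area, records the signed area swept out by a two-dimensional path. It therefore separates clockwise from counter-clockwise loops even though their level-1 increments are identical (zero, for closed loops), demonstrating the "order and area" information the signature captures.

```python
# The Levy area, extracted from the depth-2 signature, distinguishes
# the orientation of a closed 2D path where plain increments cannot.

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def signature2(path):
    # Depth-2 signature of a piecewise linear path via Chen's identity.
    def seg(p, q):
        d = [qi - pi for qi, pi in zip(q, p)]
        return d, [[x / 2.0 for x in row] for row in outer(d, d)]
    a1, a2 = seg(path[0], path[1])
    for p, q in zip(path[1:], path[2:]):
        b1, b2 = seg(p, q)
        cross = outer(a1, b1)  # uses the pre-update a1
        a2 = [[a2[i][j] + b2[i][j] + cross[i][j]
               for j in range(len(b1))] for i in range(len(a1))]
        a1 = [x + y for x, y in zip(a1, b1)]
    return a1, a2

def levy_area(path):
    # Antisymmetric part of the second level: the signed swept area.
    _, lvl2 = signature2(path)
    return (lvl2[0][1] - lvl2[1][0]) / 2.0

# A unit square traversed in opposite orientations: both loops are
# closed (zero total increment), yet their Levy areas have opposite sign.
ccw = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
cw = list(reversed(ccw))
print(levy_area(ccw))  # 1.0
print(levy_area(cw))   # -1.0
```

In a feature-transformation pipeline, such signature terms would be computed once per sequence and fed to any downstream model; the point of libraries such as Signatory is to make this computation (and its gradients) fast when it instead sits inside a training loop.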

APPLICATIONS

Signatory has already seen rapid uptake within the signature community. Recent work using Signatory includes Morrill et al. (2020b) and Perez Arribas et al. (2020), who involve signatures in neural differential equations, and Moor et al. (2020) and Min & Ichiba (2020), who study deep signature models (Bonnier et al., 2019). Meanwhile, Ni et al. (2020) apply Signatory to hybridise signatures with GANs, and Morrill et al. (2020a) create a generalised framework for the "signature method". As a final example, Signatory is now itself a dependency for other libraries (Kidger, 2020).

Here $(\mathbb{R}^d)^{\otimes k} = \mathbb{R}^d \otimes \cdots \otimes \mathbb{R}^d$, and $v^{\otimes k} = v \otimes \cdots \otimes v$ for $v \in \mathbb{R}^d$, in each case with $k - 1$ many $\otimes$.



