TIME-TRANSFORMER AAE: CONNECTING TEMPORAL CONVOLUTIONAL NETWORKS AND TRANSFORMER FOR TIME SERIES GENERATION

Abstract

Generating time series data is a challenging task due to the complex temporal properties of this type of data. Such temporal properties typically include local correlations as well as global dependencies. Most existing generative models have failed to effectively learn both the local and global properties of time series data. To address this open problem, we propose a novel time series generative model consisting of an adversarial autoencoder (AAE) and a newly designed architecture named 'Time-Transformer' within the decoder. We call this generative model 'Time-Transformer AAE'. The Time-Transformer first learns local and global features simultaneously in a layer-wise parallel design, combining the abilities of Temporal Convolutional Networks (TCNs) and Transformers in extracting local features and global dependencies, respectively. Second, a bidirectional cross attention is proposed to provide complementary guidance across the two branches and achieve a proper fusion of local and global features. Experimental results demonstrate that our model outperforms existing state-of-the-art models in most cases, especially when the data contains both global and local properties. We also show our model's ability to perform a downstream task: data augmentation to support the solution of imbalanced classification problems.

1. INTRODUCTION

Automatically generating realistic synthetic data assists in solving real-world problems when there is limited access to real data and manual generation is cumbersome and/or impractical. Deep generative models have shown considerable success in domains such as computer vision and natural language processing in the last decade. Numerous models have been introduced to produce synthetic images or text to address downstream tasks such as image in-painting (Pathak et al., 2016), text-to-image translation (Zhang et al., 2016), and automated captioning (Guo et al., 2017). Although data generation is similarly important in the time series domain, relatively few works address this problem. This is because the generated data must share a similar global distribution with the original time series data while also preserving its unique temporal properties. As such, generative models for time series data, especially those universally applicable to different types of time series data, are relatively rare. Many existing works utilize Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) for time series generation, and most of these address the temporal challenges using Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) (Esteban et al., 2017; Yoon et al., 2019; Pei et al., 2021). Other approaches use the Variational Autoencoder (VAE) (Kingma & Welling, 2013) as the basic framework to generate time series data (Fabius & van Amersfoort, 2014; Desai et al., 2021). However, none of these works have succeeded in efficiently learning both local correlations and global interactions, which is crucial for time series processing. Recently, Transformer-based models have been successful in learning global features for different types of data, including time series (Raffel et al., 2019; Dosovitskiy et al., 2020; Zerveas et al., 2021; Chen et al., 2021; 2022).
On the other hand, models based on Convolutional Neural Networks (CNNs) have been shown to be better at extracting local patterns with their filters (Howard et al., 2017; Yamashita et al., 2018; Liu et al., 2019). Temporal Convolutional Networks (TCNs), consisting of dilated convolutional layers (Oord et al., 2016a), preserve the original local processing capability of CNNs while also having an enhanced ability to learn temporal dependencies in sequential data (Lea et al., 2016; Bai et al., 2018). This makes them appropriate for time series modeling. It is therefore natural to combine the Transformer and the TCN to learn better time series features. For example, some previous works use sequential combinations of the two for time series tasks (Lin et al., 2019; Cao et al., 2021). However, such sequential designs do not consider the interaction between the local and global features inherent in these datasets. Motivated by the above observations, we propose a novel time series generative model named 'Time-Transformer AAE'. Specifically, we select the Adversarial Autoencoder (AAE) (Makhzani et al., 2015) as the generative framework due to its success on different types of learning tasks. The Time-Transformer is designed as part of the decoder to effectively learn and integrate both local and global features. In each Time-Transformer block, the temporal properties are learnt by both a TCN layer and a Transformer block; the two are then connected through a bidirectional cross-attention block to fuse local and global features. This layer-wise parallel structure, along with the bidirectional interaction, combines the advantages of the TCN and Transformer models: the TCN's ability to efficiently extract local features and the Transformer's ability to build global dependencies. We evaluate our proposed Time-Transformer AAE on different types of time series data, including artificial and real-world datasets.
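To make the two-branch design concrete, the following is a minimal NumPy sketch of one Time-Transformer-style block: a causal dilated convolution stands in for the TCN branch, single-head self-attention for the Transformer branch, and two cross-attention calls (each branch querying the other) for the bidirectional fusion. The function names, the depthwise filter, the single-head simplification, and the additive fusion are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: (T_q, d) queries over (T_k, d) keys/values.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def causal_dilated_conv(x, w, dilation):
    # x: (T, d) sequence; w: (kernel, d) depthwise filter.
    # Left-padding keeps the convolution causal, as in a TCN layer.
    k = w.shape[0]
    pad = dilation * (k - 1)
    xp = np.pad(x, ((pad, 0), (0, 0)))
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        taps = xp[t : t + pad + 1 : dilation]  # k taps spaced by `dilation`
        out[t] = (taps * w).sum(axis=0)
    return out

def time_transformer_block(x, w_local, dilation=2):
    local_feat = causal_dilated_conv(x, w_local, dilation)  # TCN branch
    global_feat = attention(x, x, x)                        # Transformer branch
    # Bidirectional cross attention: each branch queries the other branch.
    l2g = attention(local_feat, global_feat, global_feat)
    g2l = attention(global_feat, local_feat, local_feat)
    return l2g + g2l                                        # fused features
```

In the full model, each branch would also include normalization, feed-forward layers, and multiple attention heads stacked over several blocks; the sketch only shows how local and global features can attend to each other before fusion.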
Experiments show that the proposed model surpasses existing state-of-the-art (SOTA) models on the time series generation task. Furthermore, we demonstrate our model's effectiveness on a downstream task, imbalanced classification, using several real-world datasets. To summarize, our contributions are as follows:

• We propose a new time series generative model called Time-Transformer AAE, which effectively combines the advantages of the TCN and the Transformer in extracting local and global patterns respectively.

• We introduce the Time-Transformer module, which simultaneously learns local and global features in a layer-wise parallel design and facilitates interaction between these two types of features by performing feature fusion in a bidirectional manner.

• We show empirically that the proposed Time-Transformer AAE generates better synthetic time series data, with respect to different benchmarks, than SOTA methods.

2. RELATED WORKS

2.1 TIME SERIES GENERATION

Deep generative models (DGMs) have gained increasing attention since their introduction. Kingma & Welling (2013) propose the Variational Autoencoder (VAE), which uses Bayesian methods to learn latent representations and turns the classic autoencoder into a generative model. Goodfellow et al. (2014) introduce an adversarial approach to shaping the output distribution and propose Generative Adversarial Networks (GANs). Makhzani et al. (2015) combine the previous two models in the Adversarial Autoencoder (AAE), using the adversarial training procedure to perform variational inference in the VAE. Numerous models have been designed on top of these basic generative frameworks and have shown superior performance in image and text processing (Oord et al., 2016c;b; Pathak et al., 2016; Zhang et al., 2016; Karras et al., 2017; Arjovsky et al., 2017; Guo et al., 2017; Kadurin et al., 2017; He et al., 2019; Ahamad, 2019).

Successes in the fields of image and text have led to the application of DGMs in the time series domain. Most of these are derived from the GAN framework with additional modifications to incorporate temporal properties. The first, C-RNN-GAN (Mogren, 2016), directly uses the GAN structure with LSTMs to generate music data. Esteban et al. (2017) propose a Recurrent Conditional GAN (RCGAN), which uses a basic RNN as generator and discriminator, with auxiliary label information as a condition, to generate medical time series. Since then, a number of works have utilized similar designs to generate time series data in various fields including finance, medicine and the internet (Zhou et al., 2018; Hartmann et al., 2018; Chen & Jiang, 2018; Koochali et al., 2019; Wiese et al., 2020; Smith & Smith, 2020; Lin et al., 2020). TimeGAN (Yoon et al., 2019) introduces an embedding function and a supervised loss into the original GAN framework to generate universal time series. Pei et al. (2021) propose RTSGAN, based on WGAN (Arjovsky et al., 2017) and an autoencoder; it focuses on real-world data generation and achieves good performance. Jeha
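The AAE framework discussed above replaces the VAE's KL regularizer with a discriminator that pushes the encoder's latent codes toward a chosen prior. The toy computation below illustrates the three losses involved in one AAE training step; the linear encoder, decoder, and logistic discriminator, and all shapes and names, are illustrative stand-ins, not the model's actual networks.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linear encoder/decoder and a logistic-regression discriminator.
enc_W = rng.normal(size=(6, 2)) * 0.1   # data dim 6 -> latent dim 2
dec_W = rng.normal(size=(2, 6)) * 0.1   # latent dim 2 -> data dim 6
disc_w = rng.normal(size=2) * 0.1       # discriminator over latent codes

x = rng.normal(size=(32, 6))            # a batch of flattened series features
z_fake = x @ enc_W                      # encoder outputs (aggregated posterior)
z_real = rng.normal(size=(32, 2))       # samples from the prior p(z) = N(0, I)

# (1) Reconstruction loss: the ordinary autoencoder objective.
recon_loss = ((x - z_fake @ dec_W) ** 2).mean()

# (2) Discriminator loss: distinguish prior samples ("real") from encodings ("fake").
d_real, d_fake = sigmoid(z_real @ disc_w), sigmoid(z_fake @ disc_w)
disc_loss = -(np.log(d_real + 1e-8).mean() + np.log(1.0 - d_fake + 1e-8).mean())

# (3) Adversarial loss on the encoder: make encodings indistinguishable from the prior.
gen_loss = -np.log(d_fake + 1e-8).mean()
```

Alternating gradient steps on (2) and on (1)+(3) is what lets the AAE shape the latent distribution adversarially instead of via an explicit KL term, so sampling z from the prior and decoding it yields new synthetic series.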

